[HN Gopher] Language-Agnostic BERT Sentence Embedding
___________________________________________________________________

Language-Agnostic BERT Sentence Embedding

Author : theafh
Score  : 47 points
Date   : 2020-08-18 19:50 UTC (3 hours ago)

(HTM) web link (ai.googleblog.com)
(TXT) w3m dump (ai.googleblog.com)

  | lacker wrote:
  | After all the attention OpenAI got for making the GPT-3 API
  | semi-publicly available, I wonder if Google has considered
  | making some of their research models available via API. It
  | would be pretty neat to be able to try these things out,
  | rather than just reading papers about them.

  | MiroF wrote:
  | You've actually got it backwards.
  |
  | Releasing model weights has been common for a long time; it
  | was OpenAI that regressed and refused to release its GPT
  | models, until it put GPT-3 behind a semi-public API. The
  | Google BERT models are released pre-trained in full (as
  | mentioned in the article), so you can easily play around with
  | them on your own.
  |
  | BERT isn't a forward (left-to-right) LM like GPT, so it's a
  | little less easy to play with for the uninitiated, although
  | there are papers showing text generation with BERT.

  | tmabraham wrote:
  | To be fair, IIRC OpenAI did release GPT-2-large after some
  | community pressure, and it was somewhat doable for some people
  | to actually train from scratch. GPT-3 is too large, so even if
  | they released it, nobody apart from large companies like
  | Google could do anything with it. If anything, they've made
  | GPT-3 more open than it would have been if they had just
  | released the weights.
  |
  | At least that's my understanding. Feel free to correct me if
  | I'm wrong.

  | Der_Einzige wrote:
  | They only released GPT-2 large after other language models
  | came out that were even larger, like T5...

  | liuliu wrote:
  | I don't know. It has 175 billion parameters, thus about 650
  | GiB for 32-bit floating-point parameters; if these are
  | bfloat16, a mere ~325 GiB. We know a CPU is about 50 times
  | slower than a GPU for transformer models, hence we are looking
  | at 4 to 8 minutes per inference (loading parameters on demand
  | from an SSD takes 200 seconds or so, and is probably the
  | bottleneck here). If the parameters can be loaded into memory
  | (it seems you would need a machine with eight or more memory
  | channels), it will probably be 1 to 2 minutes per inference.
  |
  | Still in the realm of doing inference in homelab territory
  | (barely).

  | MiroF wrote:
  | I'm not really up in arms about OpenAI's decision, but I just
  | don't want people to frame this as "why can't more companies
  | release models like OpenAI does" when in reality pretty much
  | the opposite is true.
  |
  | GPT-2 wasn't really feasible for most actors to train from
  | scratch, otherwise it would have been released by a third
  | party. GPT-3 is technically feasible for a single actor to do
  | inference with, although only barely.

  | tmabraham wrote:
  | I agree with your comment here, but I think for the type and
  | size of model, OpenAI has done the best they could with such
  | large models. So I wouldn't say they've "regressed and refused
  | to release its GPT models"; they just had to take a different
  | route with the GPT models.

  | lettergram wrote:
  | I firmly believe what OpenAI is actually doing is building a
  | dataset.

  | chaoz_ wrote:
  | However, @lacker makes a valid point: most research releases
  | lack an interactive, well-designed API for people unfamiliar
  | with the domain.
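As a concrete illustration of MiroF's point that the released
BERT-family models are easy to play around with, here is a minimal
sketch that embeds sentences in two languages with LaBSE and
compares them by cosine similarity. It assumes the third-party
sentence-transformers library and its port of the model published
as "sentence-transformers/LaBSE"; the release described in the
article itself is a TF Hub SavedModel, which is loaded differently.

    # Minimal sketch: multilingual sentence similarity with LaBSE.
    # Assumes `pip install sentence-transformers` and the port of
    # the model published as "sentence-transformers/LaBSE".
    import numpy as np
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("sentence-transformers/LaBSE")

    sentences = [
        "The cat sits on the mat.",           # English
        "Die Katze sitzt auf der Matte.",     # German translation
        "Stock markets fell sharply today.",  # unrelated English
    ]
    # encode() returns one 768-dim vector per sentence.
    embeddings = model.encode(sentences)

    def cosine(a, b):
        # LaBSE embeddings in this port are already unit-length,
        # so this is nearly a dot product; normalize defensively.
        return float(np.dot(a, b) /
                     (np.linalg.norm(a) * np.linalg.norm(b)))

    print(cosine(embeddings[0], embeddings[1]))  # high: translations
    print(cosine(embeddings[0], embeddings[2]))  # low: unrelated

Translations should land close together in the shared embedding
space while unrelated sentences do not, which is the property the
paper evaluates with cross-lingual retrieval tasks.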
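liuliu's back-of-envelope figures above are easy to reproduce. In
the sketch below, only the 175-billion-parameter count comes from
the GPT-3 paper; the SSD and RAM bandwidth numbers are illustrative
round assumptions, not measurements.

    # Back-of-envelope memory and load-time arithmetic for GPT-3.
    PARAMS = 175e9           # parameter count, from the paper
    GIB = 2**30

    fp32_bytes = PARAMS * 4  # 4 bytes per float32 parameter
    bf16_bytes = PARAMS * 2  # 2 bytes per bfloat16 parameter
    print(f"fp32: {fp32_bytes / GIB:,.0f} GiB")   # ~652 GiB
    print(f"bf16: {bf16_bytes / GIB:,.0f} GiB")   # ~326 GiB

    # Streaming bfloat16 weights once per inference pass:
    ssd_bps = 2e9    # ~2 GB/s NVMe SSD (assumed)
    ram_bps = 100e9  # ~100 GB/s multi-channel server RAM (assumed)
    print(f"from SSD: {bf16_bytes / ssd_bps:,.0f} s")  # ~175 s
    print(f"from RAM: {bf16_bytes / ram_bps:,.1f} s")  # ~3.5 s

At a couple of GB/s from an NVMe SSD, one pass over the bfloat16
weights already costs on the order of three minutes, which is why
streaming parameters from storage is the bottleneck liuliu
describes.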
  | colordrops wrote:
  | It's really ironic that they are regressing and keeping their
  | research behind a curtain when the entire pretense for their
  | existence (and their name) is to open things up.

  | taneq wrote:
  | It's not like it's a new across-the-board policy for them.
  | They were just being cautious in this particular case because
  | they were worried about social ramifications. Arguably, some
  | other research should have been made less readily available in
  | this way (think deepfakes, for example).

  | est31 wrote:
  | It's easier to hire people for a (claimed) mission-driven
  | company than for an evil business corp that builds killer
  | machines for the US military.
___________________________________________________________________
(page generated 2020-08-18 23:00 UTC)