[HN Gopher] GPT-Neo - Building a GPT-3-sized model, open source ...
       ___________________________________________________________________
        
       GPT-Neo - Building a GPT-3-sized model, open source and free
        
       Author : sieste
       Score  : 638 points
       Date   : 2021-01-18 09:13 UTC (13 hours ago)
        
 (HTM) web link (www.eleuther.ai)
 (TXT) w3m dump (www.eleuther.ai)
        
       | rexreed wrote:
       | Once Microsoft became the sole and exclusive licensee of GPT-3, a
       | credible open source effort was bound to emerge, and I hope it
       | really is an effective alternative.
        
         | fakedang wrote:
          | Eleuther works with Google. I'd prefer the Microsoft demon to
          | the Google demon.
        
           | jne2356 wrote:
           | The resources for the GPT-3 replication will be provided by
           | the cloud company CoreWeave.
        
       | kashyapc wrote:
       | Incidentally, over the weekend I listened to a two-hour-long
       | presentation[1] by the cognitive scientist, Mark Turner[2]. The
       | talk's main goal was to explore the question _" Does cognitive
       | linguistics offer a way forward for second-language teaching?"_
       | 
       | During the Q&A, Turner explicitly mentions GPT-3 (that's when I
       | first heard of it) as a "futuristic (but possible) language-
       | learning environment" that is likely to be a great boon for
       | second-language learners. One of the appealing points seems to be
       | "conversation [with GPT-3] is not scripted; it keeps going on any
       | subject you like". Thus allowing you to simulate some bits of the
       | gold standard (immersive language learning in the real world).
       | 
        | As an advanced Dutch learner (it's my fourth language), I'm
        | curious about these approaches. And glad to see this open source
        | model. (It
       | is beyond ironic that the so-called "Open AI" turned out to have
       | dubious ethics.)
       | 
       | [1] https://www.youtube.com/watch?v=A4Q977p8PfQ
       | 
        | [2] Check out the _excellent_ book he co-authored, _" Clear and
        | Simple as the Truth"_--it has valuable insights on improving
       | writing based on some robust research.
        
       | Jack000 wrote:
        | Hope there will be a distilled or approximate-attention version
        | so it can be run on consumer GPUs.
        
       | wumms wrote:
       | https://github.com/EleutherAI/gpt-neo (Couldn't find it on the
       | website)
        
       | matteocapucci wrote:
        | Do you know how one can donate?
        
       | newswasboring wrote:
       | The intention behind it is pretty good. Best of luck to them.
       | 
        | I wonder if I can donate computing power to this remotely, like
        | the old SETI or protein-folding things: use idle CPU time to
        | compute for the network. Otherwise, the estimates I have seen of
        | how much it would take to train these models are enormous.
        
         | mryab wrote:
         | Not directly related, but the Learning@home [1] project aims to
         | achieve precisely that goal of public, volunteer-trained neural
         | networks. The idea is that you can host separate "experts," or
         | parts of your model (akin to Google's recent Switch
         | Transformers paper) on separate computers.
         | 
         | This way, you never have to synchronize the weights of the
         | entire model across the participants -- you only need to send
         | the gradients/activations to a set of peers. Slow connections
         | are mitigated with asynchronous SGD and unreliable/disconnected
         | experts can be discarded, which makes it more suitable for
         | Internet-like networks.
         | 
         | Disclaimer: I work on this project. We're currently
         | implementing a prototype, but it's not yet GPT-3 sized. Some
         | issues like LR scheduling (crucial for Transformer convergence)
         | and shared parameter averaging (for gating etc.) are tricky to
         | implement for decentralized training over the Internet.
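          | 
          | To make the expert idea concrete, here's a rough sketch in
          | PyTorch terms (hypothetical names, not our actual API; the
          | `peers` list stands in for what would really be remote expert
          | servers):
          | 
          |     import torch
          |     import torch.nn as nn
          |     
          |     # Each volunteer machine hosts one "expert", a small slice
          |     # of the full model.
          |     class Expert(nn.Module):
          |         def __init__(self, dim=1024):
          |             super().__init__()
          |             self.ff = nn.Sequential(
          |                 nn.Linear(dim, 4 * dim),
          |                 nn.GELU(),
          |                 nn.Linear(4 * dim, dim))
          |     
          |         def forward(self, x):
          |             return self.ff(x)
          |     
          |     # A trainer ships activations to a few peers and mixes the
          |     # results, so the full set of weights never has to live on
          |     # (or be synchronized to) any single host.
          |     def mixture_forward(x, peers, gate):
          |         weights = torch.softmax(gate(x), dim=-1)  # (batch, n)
          |         outs = torch.stack([p(x) for p in peers], dim=-1)
          |         return (outs * weights.unsqueeze(1)).sum(dim=-1)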
         | 
         | [1] https://learning-at-home.github.io/
        
           | echelon wrote:
           | Do you have a personal Twitter account I can follow? Your
           | career is one I'd like to follow.
        
         | teruakohatu wrote:
          | TensorFlow supports distributed training with a client-server
          | model.
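          | 
          | A minimal sketch of the parameter-server flavor (assuming the
          | TF 2.x distribute API; the cluster layout comes from the
          | TF_CONFIG environment variable, which each task must set):
          | 
          |     import tensorflow as tf
          |     
          |     # "ps" tasks hold the variables; "worker" tasks compute
          |     # gradients against them over the network.
          |     resolver = tf.distribute.cluster_resolver.TFConfigClusterResolver()
          |     strategy = tf.distribute.experimental.ParameterServerStrategy(
          |         resolver)
          |     
          |     with strategy.scope():
          |         # Variables created here are placed on the parameter
          |         # servers rather than on the local machine.
          |         model = tf.keras.Sequential([tf.keras.layers.Dense(1)])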
        
           | newswasboring wrote:
           | Does it also solve the problem of everyone having different
           | hardware?
        
             | londons_explore wrote:
             | It does.
             | 
             | For most models, your home broadband would be far too slow
             | though.
        
               | emteycz wrote:
                | What about some kind of sharding: parts of the
                | computation that could be executed in isolation for
                | longer periods of time?
        
               | Filligree wrote:
                | An ongoing research problem. OpenAI would certainly like
                | to be able to use smaller GPUs, instead of having to fit
                | the entire model into one.
        
               | jne2356 wrote:
               | GPT-3 does not fit in any one GPU that exists at present.
               | It's already spread out across multiple GPUs.
        
               | newswasboring wrote:
                | Is it because they have to communicate back errors
                | during training? I forgot that training these models is
                | more of a global task than protein folding. In that
                | sense it's less parallelizable over the internet.
        
               | londons_explore wrote:
               | Yes, and also activations if your GPU is too small to fit
               | the whole model. The minimum useful bandwidth for that
               | stuff is a few gigabits...
        
         | jne2356 wrote:
         | They get this suggestion a lot. There's a section in their FAQ
         | that explains why it's infeasible.
         | 
         | https://github.com/EleutherAI/info
        
         | chillee wrote:
          | The primary issue is that large-scale GPU training is
          | dominated by communication costs. Since, to some approximation,
          | everything needs to be synchronized after every gradient
          | update, increasing the communication cost very quickly becomes
          | infeasible.
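          | 
          | A back-of-the-envelope (illustrative numbers) shows the scale
          | of the problem for volunteer hardware:
          | 
          |     # One full fp16 gradient exchange for a GPT-3-sized model
          |     # over a typical home uplink.
          |     params = 175e9                 # GPT-3 parameter count
          |     payload_gb = params * 2 / 1e9  # fp16: ~350 GB per sync
          |     uplink_gbps = 0.01             # 10 Mbit/s home uplink
          |     hours = payload_gb * 8 / uplink_gbps / 3600
          |     print(f"{payload_gb:.0f} GB per step, ~{hours:.0f} h to send")
          |     # Datacenter interconnects move this in seconds; a home
          |     # connection would need days per training step.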
        
         | danwills wrote:
          | Yeah! Sounds great. If I could easily run a SETI-at-home-style
          | thing to contribute to training a (future) model similar to
          | GPT-x, but with the result freely available to play with, I
          | reckon I'd do it. It could even be made into a game, I'd say! I
          | am totally aware that GPT-3 itself can't be run on a regular
          | workstation, but maybe hosting for instances of models from the
          | best/most-interesting training runs could be worked out by
          | crowd-funding?
        
           | colobas wrote:
           | I was just gonna propose something like this. Democratizing
           | large ML models.
        
       | hooande wrote:
       | In my experience, the output from GPT-3, DALL-E, et al is similar
       | to what you get from googling the prompt and stitching together
        | snippets from the top results. These transformers are trained on
        | "what was visible to Google", which limits their utility.
       | 
       | I think of the value proposition of GPT-X as "what would you do
       | with a team of hundreds of people who can solve arbitrary
       | problems only by googling them?". And honestly, not a lot of
       | productive applications come to mind.
       | 
       | This is basically the description of a modern content farm. You
        | give N people a topic, e.g. "dating advice", and they'll use Google
       | to put together different ideas, sentences and paragraphs to
       | produce dozens of articles per day. You could also write very
       | basic code with this, similar to googling a code snippet and
       | combining the results from the first several stackoverflow pages
       | that come up (which, incidentally, is how I program now). After a
       | few more versions, you could probably use GPT to produce fiction
       | that matches the quality of the average self published ebook. And
       | DALL-E can come up with novel images in the same way that a
       | graphic designer can visually merge the google image results for
       | a given query.
       | 
        | One limitation of this theoretical "team of automated googlers"
        | is that the content they can search is frozen at the date of the
        | last GPT model update. Right now the big news story is the Jan
        | 6th, 2021 insurrection at the US Capitol. GPT-3 can produce
        | infinite bad articles about politics, but won't be able to say
        | anything about current events in real time.
       | 
       | I generally think that GPT-3 is awesome, and it's a damn shame
       | that "Open"AI couldn't find a way to actually be open. At this
        | point, it seems like a very interesting technology that is still
        | in desperate need of a killer app.
        
         | devonkim wrote:
          | The current problem is that we don't have a reliable, scalable
          | way to merge the features of knowledge engines, which model
          | ontological relationships between entities, with generative
          | engines, which are good at making natural-looking or -sounding
          | qualitative output. There's certainly research going on to join
          | them together, but it just doesn't get the kind of press that
          | the comparatively easier generative and pattern-recognition
          | work does. The whole "General AI Complete" class of problems
          | seems to consist of ones that combine multiple areas of more
          | specialized AI systems, but that's exactly where the more
          | practical problems for the average person arise.
        
           | glebshevchuk wrote:
            | Agreed, but that's because they're hard to integrate: one is
            | concerned with enumerating all the facts that humans know
            | about (a la Cyc), and the other with learning them directly
            | from data. Developing feedback systems that combine the two
            | would be quite exciting.
        
         | dnautics wrote:
         | > And honestly, not a lot of productive applications come to
         | mind
         | 
          | So, I can't go into too many details since I haven't started
          | yet, but I'm thinking about mixing a flavor of GPT with DETR
          | for OCR tasks where the model must then predict categorization
          | vectors, the chief difficulty being that it must identify and
          | classify arbitrary-length content in the OCR output.
        
         | visarga wrote:
         | > honestly, not a lot of productive applications come to mind
         | 
         | Not so convincing when you enumerate so many applications
         | yourself.
         | 
         | > but won't be able to say anything about current events
         | 
          | There are variants that use a transformer + retrieval, so they
          | have unlimited memory that can easily be extended.
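          | 
          | A toy sketch of the retrieval idea (TF-IDF standing in for a
          | learned retriever; the corpus and query are made up):
          | 
          |     from sklearn.feature_extraction.text import TfidfVectorizer
          |     from sklearn.metrics.pairwise import cosine_similarity
          |     
          |     # Fresh documents can be added here at any time, so the
          |     # model's "memory" isn't frozen at training time.
          |     corpus = [
          |         "The senate passed the bill on Tuesday.",
          |         "A new GPT-3 replication effort was announced.",
          |     ]
          |     vec = TfidfVectorizer().fit(corpus)
          |     
          |     def retrieve(query, k=1):
          |         sims = cosine_similarity(vec.transform([query]),
          |                                  vec.transform(corpus))[0]
          |         return [corpus[i] for i in sims.argsort()[::-1][:k]]
          |     
          |     # Retrieved text is prepended to the prompt that goes to
          |     # the language model.
          |     prompt = "\n".join(retrieve("gpt-3 replication")) + \
          |              "\nQ: What was announced?\nA:"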
        
           | gxqoz wrote:
           | I've mentioned this in another thread, but a GPT-3 that could
           | reliably generate quizbowl questions like the ones on
           | https://www.quizbowlpackets.com would be great in this
            | domain. My experience with it indicates it's nowhere near
            | being able to do this, though.
        
           | pydry wrote:
           | Content farms are hardly a productive application.
        
             | visarga wrote:
                | You missed the forest for the trees. If you've got a
                | tool that can use StackOverflow to solve simple
                | programming tasks, or generally solve any simple task
                | with Google, then you're sitting on a gold mine.
        
               | notahacker wrote:
               | That's a big _if_ though.
               | 
               | GPT-3 is much more _interesting autocomplete based on
               | most commonly used patterns_ than something which figures
               | out that Problem X has a lot of conceptual similarities
               | with Solved Problem Y so it can just reuse the code
               | example with some different variable names.
        
               | jokethrowaway wrote:
                | Yes and no.
                | 
                | It may be useful to hire fewer low-skilled employees and
                | keep a few senior ones who take input from the machine
                | and decide what to keep and what to throw away. I'm not
                | sure if a senior engineer would be more productive
                | patching up code written by a bot or writing it from
                | scratch. It's going to be a hard sell while you still
                | need human supervisors.
                | 
                | You can't trust a machine that can't reason with code
                | implementation, or even with content creation. You need a
                | human to supervise, or a better machine.
                | 
                | We already have AI-based auto-completion for code; GPT-3
                | can be useful for that (but at what cost? Storing a huge
                | model on your disk, or making a slow / unsafe HTTP
                | request to the cloud?)
        
               | polynomial wrote:
               | > if a senior engineer would be more productive patching
               | up code written by a bot or writing it from scratch.
               | 
               | I have no doubt writing from scratch would win hands
               | down. The main reason we patch wonky legacy code is
               | because it's already running and depended on. If you
                | remove that as a consideration, a senior engineer
                | writing the equivalent code (rather than debugging code
                | generated randomly from Google searches) would, IMO, be
                | more efficient and produce a higher-quality program.
        
         | schaefer wrote:
          | I feel your stance [1] is demonstrably false, and offer two
          | challenges.
          | 
          | 1) Please play a winning game of Go against Alpha Zero, just by
          | googling the topic.
          | 
          | 2) Next, please explain how Alpha Zero's games could forever
          | change Go opening theory[2] without any genuine creativity.
         | 
         | [1] that "the output from GPT-3, DALL-E, et al is similar to
         | what you get from googling the prompt and stitching together
         | snippets from the top results."
         | 
         | [2]"Rethinking Opening Strategy: AlphaGo's Impact on Pro Play"
         | by Yuan Zhou
        
           | ravi-delia wrote:
            | OP was clearly not talking about Alpha Zero, a different
            | technology made by different people for a different purpose.
            | Instead, they were noting that despite displaying some truly
            | excellent world modeling, GPT-3 is _trained_ on data that
            | encourages it to vomit up rehashes. It's very possible that
            | the next generation will overcome this and wind up completely
            | holding together long-run concepts and recursion, at least if
            | scaling parameters keeps working, but for now it is a real
            | limitation.
            | 
            | GPT-3 writes like a sleepy college student with 30 minutes
            | before the due date: a shockingly complete grasp of
            | _language_, but perhaps not a complete understanding of
            | content. That's not just an analogy; I am a sleepy college
            | student. When I write an essay without thinking too hard, it
            | displays exactly the errors that GPT-3 makes.
        
           | tbenst wrote:
           | GPT-3 can't play Go.
        
             | Tenoke wrote:
              | It almost definitely can, to some extent, given that GPT-2
              | could play chess [0].
             | 
             | 0. https://slatestarcodex.com/2020/01/06/a-very-unlikely-
             | chess-...
        
         | rich_sasha wrote:
         | I think this is exactly right, and indeed this is a lot of the
         | value. "Content-generation" is already a thing, and yes it
         | doesn't need to make much sense. Apparently people who read it
         | don't mind.
        
           | throwaway6e8f wrote:
           | People don't read it, search engines do.
        
             | masswerk wrote:
              | BTW, we should make it mandatory to tag generated content
              | for search engines, in order to exclude it from future
              | training sets.
        
               | hundchenkatze wrote:
               | Apart from that, hopefully the people building training
               | sets use gltr or something similar to prevent training on
               | generated text.
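                | 
                | The core idea behind gltr, roughly sketched (assumes the
                | HuggingFace transformers API; sampled model output tends
                | to sit at low ranks, human text less so):
                | 
                |     import torch
                |     from transformers import GPT2LMHeadModel, GPT2Tokenizer
                |     
                |     tok = GPT2Tokenizer.from_pretrained("gpt2")
                |     model = GPT2LMHeadModel.from_pretrained("gpt2")
                |     model.eval()
                |     
                |     def token_ranks(text):
                |         # Rank of each actual next token under the LM.
                |         ids = tok.encode(text, return_tensors="pt")
                |         with torch.no_grad():
                |             logits = model(ids).logits
                |         ranks = []
                |         for pos in range(ids.shape[1] - 1):
                |             order = torch.argsort(logits[0, pos],
                |                                   descending=True)
                |             ranks.append(
                |                 (order == ids[0, pos + 1]).nonzero().item())
                |         return ranks  # many high ranks -> likely human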
               | 
               | http://gltr.io/
        
         | ape4 wrote:
          | Or not even googling, but pre-googling: using its predictive
          | typing in the text box at google.com, because you are giving it
          | something to complete.
        
         | Lucasoato wrote:
         | > I think of the value proposition of GPT-X as "what would you
         | do with a team of hundreds of people who can solve arbitrary
         | problems only by googling them?". And honestly, not a lot of
         | productive applications come to mind.
         | 
         | Damn, this could replace so many programmers, we're doomed!
        
           | throwaway_6142 wrote:
            | *realizing GPT-3 was probably created by programmers whose
            | job really is mostly googling for stackoverflow answers*
           | 
           | #singularity
        
             | crdrost wrote:
              | We were worried that the singularity was going to involve
              | artificial intelligences that could make themselves
              | smarter, and we were underwhelmed when it turned out to be
              | neural networks summarizing Stack Overflow tips about
              | neural networks, to try to optimize themselves, instead.
             | 
             | GPT-[?]: still distinguishable from human advice, but it
             | contains a quadrillion parameters and nobody knows how
             | exactly it's able to tune them.
        
           | yowlingcat wrote:
           | Curses. We've been found out.
        
         | napier wrote:
         | >"what would you do with a team of hundreds of people who can
         | solve arbitrary problems only by googling them?"
         | 
         | What would you do with a team of hundreds of people who can
         | instantly access an archive comprising the sum total of
         | digitized human knowledge and use it to solve problems?
        
           | hooande wrote:
            | We have that now; it's called googling. You could easily
            | hire 100 people to do that job, but you'd have to pay them at
            | least $15/hr now in the US. Say equivalent GPT-3 servers cost
            | a fraction of that. How do you make money with that resource?
        
           | eru wrote:
           | Well, they can use it to write text. Not to solve problems
           | directly.
        
         | orm wrote:
          | GPT-3 is trained on text prediction, and there's been a lot of
          | commentary about the generation aspect, but some of the
          | applications that excite me most are not about the generation
          | of text. Instead, GPT-3 (and other language models) create very
          | useful vector representations of natural language as a side
          | effect, and those can then be used for other tasks with much
          | less data. Using the text-prediction task as a way to supervise
          | learning this representation, without having to create an
          | expensive labelled dataset, is very helpful, and not just for
          | language tasks. See for example the CLIP work that came out
          | recently for image classification, which uses natural-language
          | captions to supervise training. There is other work referred to
          | in that blog post that also exploits captions or descriptions
          | in natural language to help understand images better. More
          | speculatively, being able to use natural language to supervise
          | or give feedback to automated systems that have little to do
          | with NLP seems very very useful.
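          | 
          | A minimal sketch of the reuse-the-representation idea, with
          | GPT-2 standing in for GPT-3 (which isn't public) and assuming
          | the HuggingFace transformers API:
          | 
          |     import torch
          |     from transformers import GPT2Model, GPT2Tokenizer
          |     
          |     tok = GPT2Tokenizer.from_pretrained("gpt2")
          |     model = GPT2Model.from_pretrained("gpt2")
          |     
          |     def embed(text):
          |         # Mean-pool the hidden states into one feature vector.
          |         ids = tok(text, return_tensors="pt")
          |         with torch.no_grad():
          |             hidden = model(**ids).last_hidden_state
          |         return hidden.mean(dim=1).squeeze(0)
          |     
          |     # Such vectors can feed a small classifier trained on far
          |     # less labelled data than end-to-end training would need.
          |     a = embed("refund my order")
          |     b = embed("I want my money back")
          |     print(torch.cosine_similarity(a, b, dim=0))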
        
           | hooande wrote:
           | "More speculatively, being able to use natural language to
           | supervise or give feedback to automated systems that have
           | little to do with NLP seems very very useful."
           | 
           | I agree with this, and it isn't entirely speculative. One of
            | the most useful applications I have seen that goes beyond
            | googling is generating CSS from natural language, e.g.
            | "change the background to blue and put a star at the top of
            | the page". There are heavily sample-selected demos of this on
            | Twitter right now: https://twitter.com/sharifshameem/status/12
            | 82676454690451457...
            | 
            | This is definitely practical, though I wouldn't design my
            | corporate website using this. Could be useful if you need to
            | make 10 new sites a day for something with SEO or domains.
        
         | theonlybutlet wrote:
         | I work in a niche sector of the insurance industry. Based on
         | what it can already do, I can see it doing half my job with
         | basically no learning curve for the user. Based on this alone,
         | I could see it reducing headcount in the company and sector by
         | 5%. This is massive when you consider the low margins in the
         | industry and high costs of "skilled" staff.
        
         | mistermann wrote:
         | > I think of the value proposition of GPT-X as "what would you
         | do with a team of hundreds of people who can solve arbitrary
         | problems only by googling them?". And honestly, not a lot of
         | productive applications come to mind.
         | 
         | If I was Xi Jinping, I would use it to generate arbitrary
         | suggestions _for consideration by_ my advisory team, as I
         | develop my ongoing plan for managing The Matrix.
        
         | darepublic wrote:
          | It's not even commercially available AFAIK; it's only been
          | made available to _some_ developers who applied for access,
          | through a completely opaque process -- this after soliciting
          | application ideas from every applicant. So I heartily support
          | finding a way to truly open-source it.
        
         | comboy wrote:
          | Have you tried conversing with it, after a few lines of setting
          | a proper context? Like two scientists talking, or something
          | like that. It can provide very interesting outputs that are not
          | googlable.
          | 
          | Yes, every time you see something that for a human obviously
          | doesn't make sense, it makes you dismiss it. You would look at
          | that output differently, though, if you were talking with a
          | child. Just as a child can miss some information, making it say
          | something ridiculous, it may miss some pattern connections.
          | 
          | But have you ever observed carefully how we connect patterns
          | and make sentences? Our highly sophisticated discussions and
          | reasoning are just pattern matching. Then the most prominent
          | patterns, ordered in time, are what we call consciousness.
          | 
          | Watch Hacker News comments and look at how, after somebody uses
          | a rare adjective or cluster of words, more commenters tend to
          | use it without even paying conscious attention to it.
          | 
          | Long story short, give it a try and see examples of what people
          | have already done with it, even in its limited form.
          | 
          | To me you are looking at an early computer and saying that it's
          | not doing anything that a bunch of people with calculators
          | couldn't do.
        
         | tda wrote:
          | Did anyone try to ask GPT-3 something about the insurrection?
          | I can't find any free/open text prompts for GPT-3, so this is
          | what GPT-2 has to say:
         | 
         |  _After Trump lost the 2020 election his supporters stormed the
         | Capitol_
         | 
         | > They smashed windows, tore apart the Republican Party's
         | headquarters, destroyed a statue of Ronald Reagan and set fire
         | to the Texas statehouse.
         | 
         | > A month later, after the Democrats took control of the Senate
         | and the House, Trump's supporters stormed their own offices.
         | They attacked the offices of California Democratic Senator
         | Kamala Harris and New York Democratic Representative Joe
         | Crowley. They ransacked offices in the Minnesota and Texas
         | statehouses and burned down the Republican Party headquarters
         | in Tennessee. The damage was estimated at
        
           | throwaway_6142 wrote:
           | my god it should be called CNN-2
        
           | notahacker wrote:
            | You can play with GPT-3 in a custom world at AIdungeon.io.
            | The responses are biased towards giving you RPG second-person
            | narrative, but the corpus of data, the mastery of syntax, and
            | the more uncertain grasp of events and relationships are all
            | there.
            | 
            | Example with the prompt _You are Donald Trump. The recent
            | election results have been a disappointment to you_.
            | 
            | https://pastebin.com/dSYZypCw
            | 
            | Props for turns of phrase like "Your opponent is a typical
            | liberal. He hails from the right wing of the Democratic
            | Party, but has been trying to appeal to the left to gain more
            | support.", but poor marks for apparently not having grasped
            | how elections work. (There's a joke in there somewhere.)
           | 
           | If you don't pick a custom world and your own prompt, you get
           | something more like this:
           | 
           | > You are Donald Trump, a noble living in the kingdom of
           | Larion. You are awakened by a loud noise outside the gate.
           | You look out the window and see a large number of orcish
           | troops on the road outside the castle.
           | 
           | I'd like 'orcish troops' better if I thought it was inspired
           | by media reports of Capitol events rather than a corpus of
           | RPGs.
        
           | visarga wrote:
           | GPT-3
           | 
           | > The Trump supporters were armed with guns and knives. They
           | were also carrying torches. They were chanting "Trump 2020"
           | and "Build the Wall." The Trump supporters were also chanting
           | "Lock her up." But they were referring to Hillary Clinton.
        
             | danmur wrote:
             | This is hilarious, more please :)
        
         | dvfjsdhgfv wrote:
          | In science, some amazing discoveries are made years or even
          | centuries before practical applications for them are found. I
          | believe in humanity; sooner or later we'll find some actually
          | useful applications for GPT-X.
        
           | Filligree wrote:
           | It's a wonderful co-writer of fiction, for one. Maybe the
           | better authors wouldn't need it, but as for everyone else --
           | take a look at
           | https://forums.sufficientvelocity.com/threads/generic-
           | pawn-t..., and compare the first couple of story posts to the
           | last few.
           | 
            | One of the ways in which people get GPT-3 wrong is that they
            | give it a badly worded order and get disappointed when the
            | output is poor.
           | 
           | It doesn't work with orders. It takes a lot of practice to
           | work out what it does well with. It always imitates the
           | input, and it's great at matching style -- and it knows good
           | writing as well as bad, but it can't ever write any better
           | than the person using it. If you want to write a good story
           | with it, you need to already be a good writer.
           | 
           | But it's wonderful at busting writer's block, and at writing
           | _differently_ than the person using it.
        
         | tigerBL00D wrote:
          | I don't necessarily see the "team of automated googlers" as a
          | fundamental or damning problem with GPT-like approaches. First,
          | I think people may have a lot fewer truly original ideas than
          | they are willing to admit. Original thought is sought after and
          | celebrated in the arts as a rare commodity. But unlike in the
          | arts, where there are almost no constraints, when it comes to
          | science or engineering almost every incremental step is of the
          | form Y = Fn(X0,..,Xn), where X0..Xn are widely known and proven
          | to be true. With sufficient logical reasoning and/or
          | experimental data, after numerous peer reviews, we can accept
          | Fn(...) to be a valid transform, and Y becomes Xn+1, etc.
          | Before the internet or Google, one had to go to a library and
          | read books and magazines, or ask other people, to find inputs
          | from which new ideas could be synthesized. I think GPT-like
          | stuff is a small step towards automating and speeding up this
          | general synthesis process in the post-Google world.
          | 
          | But if we are looking to replace end-to-end intelligence at
          | scale, it's not just about synthesis. We need to also automate
          | the peer-review process so that its bandwidth is matched to the
          | increased rate of synthesis. Most good researchers and
          | engineers are able to self-critique their work (and the degree
          | to which they can do that well is really what makes one good,
          | IMHO). And then we rely on our colleagues and peers to review
          | our work and form a consensus on its quality. Currently, GPT-
          | like systems can easily overwhelm humans with such peer-review
          | requests. Even if a model is capable of writing the next great
          | literary work, predicting exactly what happened on Jan 6, or
          | formulating new laws of physics, the sheer amount of crap it
          | will produce alongside makes it very unlikely that anyone will
          | notice.
        
           | Keyframe wrote:
           | "team of automated googlers" where google is baked-in. Google
           | results, and content behind it, changes. Meaning, GPT would
           | have to be updated as well. Could be a cool google feature, a
           | service.
        
           | breck wrote:
            | I call it the "Prior-Units" theorem. Given that you are able
            | to articulate an idea useful to many people, there exist
            | prior units of that idea. The only way, then, to come up with
            | a "new idea" is to come up with an idea useful only to
            | yourself (plenty of those) or to small groups, or to
            | translate an old idea into a new language.
            | 
            | The reason for this is that your adult life consists of just
            | a tiny, tiny, tiny fraction of the total time of all adults,
            | so the more people an idea is relevant to, the more the odds
            | decrease exponentially that no one thought of it before.
            | 
            | There are always new languages though, so a great strategy is
            | to take old ideas and bring them to new languages. I count
            | new high-level, non-programming languages as new languages as
            | well.
        
           | PaulHoule wrote:
           | Art (music, literature, ...) involves satisfaction of
           | constraints. For instance you need to tune your guitar like
           | the rest of the band, write 800 words like the editor told
           | you, tell a story with beginning, middle, and end and
           | hopefully not use the cheap red pigments that were
           | responsible for so many white, blue, and gray flags I saw in
           | December 2001.
        
       | anticristi wrote:
        | I love the initiative, but I'm starting to get scared of what a
        | post-GPT-3 world will look like. We are already struggling to
        | distinguish fake news from real news, automated customer-service
        | replies from genuine ones, etc. How will I know that I'm having a
        | conversation with a real human in the future?
        | 
        | On the other hand, the prospect of having an oracle that answers
        | all trivia, fixes spelling and grammar, and allows humans to
        | focus on higher-level information processing is interesting.
        
         | visarga wrote:
          | > How will I know that I'm having a conversation with a real
          | human in the future?
         | 
         | This problem should be solved with cryptography, not by banning
         | large neural nets.
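          | 
          | A sketch of what that could look like (assuming the
          | pyca/cryptography package; the hard, unsolved part is binding a
          | key to an actual human, not the signing itself):
          | 
          |     from cryptography.hazmat.primitives.asymmetric.ed25519 \
          |         import Ed25519PrivateKey
          |     
          |     key = Ed25519PrivateKey.generate()
          |     message = b"this comment was written by a verified human"
          |     signature = key.sign(message)
          |     
          |     # Anyone with the public key can check the message wasn't
          |     # forged; verify() raises InvalidSignature otherwise.
          |     key.public_key().verify(signature, message)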
        
         | namelosw wrote:
          | It's likely to be bad. For example:
          | 
          | Massively plagiarized articles, where the search engine
          | probably has no way to identify which is the original content.
          | It's like rewriting everything on the internet in your own
          | words; this may leave the internet filled with this kind of
          | garbage.
          | 
          | Reddit and similar platforms filled with bots that say bullshit
          | all the time but are hard for a human to identify in the first
          | place (the current model is pretty good at generating
          | metaphysical bullshit, but rarely insightful content). People
          | may be surrounded by bot bullshitters and trolls, with very few
          | real users among them.
          | 
          | Scams at larger scales. The skillset is essentially customer
          | service plus bad intentions. With new models, scammers can do
          | their thing at scale and find qualified victims more
          | efficiently.
        
           | pixl97 wrote:
            | >(the current model is pretty good at generating
            | metaphysical bullshit, but rarely insightful content)
           | 
           | Wait, are we talking about bots posting crap, or the average
           | political discussion?
        
         | gianlucahmd wrote:
          | We already live in a post-GPT-3 world, but one where all its
          | power is in the hands of a private company.
          | 
          | The conversation needs to move on to whether making it open and
          | democratic is a good idea, but the tech itself is here to stay.
        
         | dkjaudyeqooe wrote:
         | I know! One day it's going to get so bad people are going to
         | have to deploy critical thinking instead of accepting what they
         | read at face value and suffer the indignity of having to think
         | for themselves.
        
           | _underfl0w_ wrote:
           | Maybe that'll also be the Year of the Linux desktop I keep
           | hearing so much about.
        
           | Lewton wrote:
           | Will they also learn to avoid magical thinking like "People
           | en masse will all of a sudden develop abilities out of the
           | blue"?
        
           | Kuinox wrote:
            | Sadly, they don't. They start to believe random things and
            | reject what they don't like.
        
           | gmueckl wrote:
           | Critical thinking won't help you when the majority (or all)
           | of your sources are tainted and contradictory. At some point,
           | the actual truth just gets swamped.
        
             | inglor_cz wrote:
             | This. Robots can spout 1000x more content than humans, if
             | not more.
        
             | spion wrote:
             | This is already happening, just with humans networked into
             | social networks that favor quick reshare over deep review
        
             | thelastwave wrote:
             | "the actual truth"
        
           | hooande wrote:
           | 10,000 years of human civilization and this hasn't happened
           | yet, huh? Any day now, I'm sure
        
           | emphatizer2000 wrote:
            | Maybe somebody could create an AI that evaluates the
            | factuality of articles.
        
             | visarga wrote:
             | This is possible, especially with human in the loop.
        
             | lrossi wrote:
             | Or have the AI generate only fact-based, polite and
             | relevant comments.
             | 
             | Related xkcd: https://xkcd.com/810/
        
           | realusername wrote:
           | I'm going to throw some wild guess here and say that this
           | sudden increase in critical thinking won't happen.
        
         | Erlich_Bachman wrote:
         | Photoshop has existed for decades. Is it really that big of a
         | problem for photo news?
        
           | technocratius wrote:
            | The difference between Photoshop and generative models is
            | not in what they can technically achieve, but in the cost of
            | achieving the desired result. Fake-news photos or text can be
            | produced by humans, but that scales poorly (more humans)
            | compared to an algorithmically automated process (a bit more
            | compute).
        
           | draugadrotten wrote:
           | Yes!
           | 
           | https://en.wikipedia.org/wiki/Adnan_Hajj_photographs_controv.
           | .. https://www.bbc.com/news/world-asia-china-55140848
           | 
           | and many more
        
             | Agentlien wrote:
             | Wow, that first Adnan Hajj photograph looks absolutely
             | terrible.
        
           | danielscrubs wrote:
            | Touch-ups were done before Photoshop, but now it's ALWAYS
            | done. The issues this has created in society might have a
            | bigger emotional impact than we give them credit for.
            | 
            | Regarding photo news, there have been quite a lot of
            | scandals, to the point that I'd guess the touch-ups are more
            | or less accepted.
        
             | Pyramus wrote:
              | I conducted a workshop in media competency for teenage
              | girls, and one of the key learnings was that _every_ image
             | of a female subject they encounter in printed media (this
             | was before Instagram) has been retouched.
             | 
             | To hammer the point home I let them retouch a picture by
             | themselves to see what is possible even for a completely
             | untrained manipulator.
             | 
             | It was eye-opening - one of the things that should
             | absolutely be taught in school but isn't.
        
               | IndySun wrote:
               | "one of the things that should absolutely be taught in
               | school but isn't."
               | 
               | Namely, critical thinking?
        
               | darkwater wrote:
                | I don't think "critical thinking" is the point here,
                | because first you need to know that such modifications
                | CAN be done. Not everybody knows what can be retouched
                | with PS or similar programs. So yeah, if you see some
                | supermodel on a magazine cover and you don't know PS can
                | edit photos easily, it isn't that immediate to think
                | "hey, maybe that's not real!".
                | 
                | As an extreme example: would you, 20 years ago, have
                | checked a newspaper text to know if it was generated by
                | an AI or by a human? Obviously not, because you didn't
                | know of any AI that could do that.
        
               | Pyramus wrote:
               | Exactly this.
               | 
               | There is a secondary aspect of becoming aware that
               | society has agreed on beauty standards (different for
               | different societies) and PS being used as a means to
               | adhere to these standards.
        
               | IndySun wrote:
                | I think I made my point badly, because I also agree.
                | 
                | I am lamenting that teenagers were, in this day and age,
                | surprised at what can be done with Photoshop, and that,
                | let loose on the appropriate software, they were
                | surprised at what can be altered and how easily.
                | 
                | My point is that this may be so because people have not
                | been taught how to think for themselves, and accept
                | things (in this case female images) 'as is', without a
                | hint of curiosity. There is also a problem at the other
                | end of the stick: many young people I work with consider
                | Wikipedia to be 100% full of misinformation and fake
                | news.
        
           | Rastonbury wrote:
            | The concern is not so much AI-generated news, but malicious
            | actors misleading, influencing or scamming people online at
            | scale with realistic conversations. Today we already have
            | large-scale automated scams via email and robocalls. Less
            | scalable scams, like Tinder catfishing or Russian/Chinese
            | trolls on Reddit, are currently run by real people; imagine
            | them being automated. If human moderators cannot distinguish
            | these bots from real humans, that is a scary thought. Imagine
            | not being able to tell if this comment was written by a human
            | or robot.
        
             | hooande wrote:
              | Why does this matter? The internet is filled with millions
              | of very low-quality, human-generated discussions right now.
              | There might not be much of a difference between thousands
              | of humans generating comment spam and thousands of GPT-3
              | instances doing the same.
        
               | mekkkkkk wrote:
                | It does matter. The nice feeling of being one of many is
                | a feature of echo chambers. If you can create that
                | artificially for anything at the push of a button, it's a
                | powerful tool to steer discourse or radicalize people.
                | 
                | Have a look at Russia's interference in the previous US
                | election. This is what they did, but manually. To be able
                | to scale and automate it is huge.
        
               | ccozan wrote:
                | But careful: the human psyche has some kind of tipping
                | point. Too much fake news, and it will flip. Too little,
                | and no real influence is made.
                | 
                | The exact balance must be orchestrated by a human.
        
             | visarga wrote:
             | > imagine not being able to tell if this comment was
             | written by a human or robot
             | 
              | I think neural nets could help find fake news and factual
              | mistakes. Then it wouldn't matter who wrote something, as
              | long as it is helpful and true.
        
           | Yetanfou wrote:
            | That is like saying "black powder has existed for centuries,
            | are nuclear weapons really that big a problem?". The
            | difference between an image editor like Photoshop and an
            | automated image-generation program is the far greater
            | production capacity, the speed, the lower cost, and the fact
            | that anyone with the right equipment can use it, whereas the
            | end result of an image editor is only as good as the person
            | using it.
        
         | turing_complete wrote:
         | Don't read news. Go to original sources and scientific papers.
         | If you really want to understand something, a news website
        | should only be your starting point to look for keywords. That is
        | as true today as it will be "post-GPT-3".
        
           | wizzwizz4 wrote:
           | > _Go to original sources and scientific papers._
           | 
            | Given how much bunk "science" (and I'm talking about things
            | completely transparent to someone remotely competent in the
           | field) gets published, especially in psychology, it's
           | difficult to do even that.
        
             | turing_complete wrote:
             | You are right. You still have to read critically or find
             | trusted sources, of course.
        
           | yread wrote:
           | Primary sources need to be approached with caution
           | https://clas.uiowa.edu/history/teaching-and-writing-
           | center/g...
        
             | gmueckl wrote:
             | And so does every other source. You can play that analysis
             | game with any source material. The problem is that the
             | accuracy and detail of the reporting usually fades with
             | each step towards mass media content.
        
           | savolai wrote:
            | This scales badly today and will scale even worse in the
            | future. Those without education or time resources will _at
            | best_ manage to read the news. Humanity will need a
            | low-effort way to relay reliable information to more of its
            | members.
        
           | anticristi wrote:
            | Besides the time-scalability aspect highlighted by someone
            | else, I am worried that GPT-3 will have the potential to
            | produce even "fake scientific papers".
            | 
            | Our trust fabric is already quite fragile in the post-truth
            | era. GPT-3 might make it even more fragile.
        
             | op03 wrote:
              | I wouldn't worry about the science bit. No one worries
              | about the university library getting larger and larger, or
              | how it's going to confuse or misguide people, even though
              | everyone knows there are many books in there full of
              | errors, outdated information, badly written, boring, etc.
              | 
              | Why? Because there is always someone on campus who knows
              | something about a subject to guide you to the right stuff.
        
           | cbozeman wrote:
           | Most people are not smart enough to do this, and even if they
           | are, they don't have enough time in their day.
        
           | hntrader wrote:
            | That's what people _should_ do, and that's what you and I
            | will do, but many won't, especially the less educated (no
            | condescension intended). They'll buy into the increased
            | proliferation of fake info. It's because of these people that
            | I think the concerns are valid.
        
             | anticristi wrote:
             | Honestly, I consider myself fairly educated (I have a PhD
             | in CS), but if the topic at hand is sufficiently far from
             | my core competence, then reading the scientific article
             | won't help. I keep reading about p-value hacking, subtle
             | ways of biasing research, etc., and I realize that, to
             | validate a scientific article, you have to be a domain
             | expert and constantly keep up-to-date with today's best
             | standards. Given the increasing number of domains to be an
             | expert in, I fail to see how any single human can achieve
             | that without going insane. :D
             | 
              | I mean, Pfizer could dump their clinical trial reports on
              | me, and I would probably be unable to compute their
              | vaccine's efficacy, let alone find any flaws.
        
         | taneq wrote:
         | The fake news thing is a real problem (and may become worse
         | under GPT3 but certainly exists already). As for the others -
         | to quote Westworld, "if you can't tell the difference, does it
         | really matter?"
        
           | phreeza wrote:
           | Genuine question, why is this a problem? Sure, someone may be
           | able to generate thousands of real-sounding fake news
           | articles, but it's not like they will also be able to flood
           | the New York Times with these articles. How do you worry you
           | will be exposed to these articles?
        
             | chimprich wrote:
             | It's not me I'm worried about - it's the 50% [1] of people
             | who get their news from social media and "entertainment"
             | news platforms. These people vote, and can get manipulated
             | into performing quite extreme acts.
             | 
             | At the moment a lot of people seem to have trouble engaging
             | with reality, and that seems to be caused by relatively
             | small disinformation campaigns and viral rumours. How much
             | worse could it get when there's a vast number of realistic-
             | sounding news articles appearing, accompanied by realistic
             | AI-generated photos and videos?
             | 
             | And that might not even be the biggest problem. If these
             | things can be generated automatically and easily, it's
             | going to be very easy to dismiss real information as fake.
             | The labelling of real news as "fake news" phenomenon is
             | going to get bigger.
             | 
              | It's going to be more work to distinguish what is real
              | from what is fake. If it's possible to find articles
              | supporting any position, and there's a suspicion that any
              | contrary news is fake, then a lot of people are going to
              | find it easier to just believe what they prefer to
              | believe... even more than they do now.
             | 
             | [1] made-up number, but doesn't feel far off.
        
               | profunctor wrote:
                | The fact that you made up that number is extremely funny
                | in this context.
        
               | chimprich wrote:
               | I don't think so - I was aware that it was a made-up
               | number, and highlighted the fact that it was. It's the
               | lack of awareness of what is backed up by data that is
               | the problem I think.
               | 
               | Or am I missing your point?
        
               | _underfl0w_ wrote:
                | Right, it's definitely good that you flagged it as made
                | up, but I think the parent was pointing out the subtle
                | (and likely unintentional) irony of discussing fake news
                | while providing _fake_ numbers to support your opinion.
        
               | jokethrowaway wrote:
               | The majority of "fake news" are factual news described
               | from a partial point of view and with a political spin.
               | 
               | Even fact checkers are not immune to this and brand other
               | news as true or false not based on facts but based on the
               | political spin they favour.
               | 
                | Fake news is a vastly overstated problem. Thanks to the
                | internet, we now have a wider breadth of political news
               | and opinions and it's easy to label everything-but-your-
               | side as fake news.
               | 
               | There are a few patently false lies on the internet which
               | are taken as examples of fake news - but they have very
               | few supporters.
        
               | mistermann wrote:
               | > There are a few patently false lies on the internet
               | which are taken as examples of fake news - but they have
               | very few supporters.
               | 
                | Very true. What's interesting, though, is how many
                | supporters there are of the idea that extremely large
                | numbers of people buy into such fake news stories 100%,
                | "hook, line, and sinker" - based, ironically, on
                | _fake-news-like_ articles assuring us (with specious
                | evidence, if any) that this is the true state of reality.
               | 
               | The world is amazingly paradoxical if you look at it from
               | the proper abstract angle.
        
               | chimprich wrote:
               | > Even fact checkers are not immune to this and brand
               | other news as true or false not based on facts but based
               | on the political spin they favour.
               | 
               | Could you give an example?
               | 
               | > There are a few patently false lies on the internet
               | which are taken as examples of fake news - but they have
               | very few supporters.
               | 
               | How many do you consider "few"?
               | 
               | I can go to my local news site and read a story about the
               | novel coronavirus and the _majority_ of comments below
               | the article are stating objectively false facts.
               | 
               | "It's just a flu" "Hospitals are empty" "The survival
               | rate is 99.9%" "Vaccines alter your DNA"
               | 
               | ...and so on.
               | 
               | There is the conspiracy theory or cult called QAnon,
               | which "includes in its belief system that President Trump
               | is waging a secret war against elite Satan-worshipping
               | paedophiles in government, business and the media."
               | 
               | One QAnon Gab group has more than 165,000 users. I don't
               | think these are small numbers.
        
               | perpetualpatzer wrote:
               | > made-up number, but doesn't feel far off.
               | 
               | Pew Research says 18% report getting news primarily from
               | social media (fielded 10/19-6/20)[0]. November 2019
               | research said 41% among 18-29 year olds, which was the
               | peak age group. Older folks largely watch news on TV[1].
               | 
               | [0] https://www.journalism.org/2020/07/30/americans-who-
               | mainly-g... [1] https://www.pewresearch.org/pathways-2020
               | /NEWS_MOST/age/us_a...
        
               | chimprich wrote:
               | Thanks for providing data. Evidence is better than making
               | up numbers.
        
             | hnlmorg wrote:
             | If recent times have told us anything, it's that the
             | biggest distributor of "news" is social media. And worse
             | still, people generally have no interest in researching the
             | items they read. If "fake news" confirms their pre-existing
             | bias then they will automatically believe it. If real news
             | disagrees with their biases then it is considered fake.
             | 
             | So in theory, the rise of deep fakes could lead to more
             | people getting suckered into conspiracy theories and other
             | such extreme opinions. We've already seen a small trend
              | this way with low-resolution images of different people
              | with vaguely similar physical features being used as
             | "evidence" of actors in hospitals / shootings / terrorist
             | scenes / etc.
             | 
             | That all said, I don't see this as a reason not to pursue
              | GPT-3. In that regard, the proverbial genie is already out
             | of the bottle. What we need to work on is a better
             | framework for distributing knowledge.
        
             | xerxespoy wrote:
             | Journalists are paid by the word.
        
           | qayxc wrote:
           | > "if you can't tell the difference, does it really matter?"
           | 
           | It indeed does. The problem is that societies and cultures
           | are heavily influenced and changed by communication, media,
           | and art.
           | 
           | By replacing big portions of these components with artificial
           | content, generated from previously created content, you run
           | the risk of creating feedback cycles (e.g. train future
           | systems from output of their predecessors) and forming
           | standards (beauty, aesthetics, morality, etc.) controlled by
           | the entities that build, train, and filter the output of the
           | AIs.
           | 
           | You'll basically run the risk of killing individuality and
           | diversity in culture and expression; consequences on society
           | as a whole and individual behaviour are difficult to predict,
           | but seeing how much power social media (an unprecedented
           | phenomenon in human culture) have, there's reason to at the
           | very least be cautious about this.
        
             | visarga wrote:
              | This problem affects all types of agents - natural or
              | artificial. An agent acts in the environment; this
              | produces experience and learning, which in turn
              | conditions the future. The agent has no idea what other
              | opportunities were lost behind past choices.
        
           | notahacker wrote:
            | Most communication between humans has some physical-world
            | purpose, and so an algorithm which is trained to
           | _create the impression_ that a purpose has been fulfilled
           | whilst not actually having any capabilities beyond text
           | generation is going to have negative effects except where the
           | sole purpose of interacting is receiving satisfactory text.
           | 
           | Reviews that look just like real reviews but are actually a
           | weighted average of comments on a different product are
           | negative. Customer service bots that go beyond FAQ to do a
           | very convincing impression of a human service rep promising
           | an investigation into an incident but can't actually start an
           | investigation into the incident are negative. An information
           | retrieval tool which has no information on a subject but can
           | spin a very plausible explanation based on data on a
           | different subject is negative.
           | 
           | Of course, it's entirely possible for humans to bullshit, but
           | unlike text generation algorithms it isn't our default
           | response to everything.
        
           | skybrian wrote:
           | If you ask GPT-3 for three Lord of the Rings quotes it might
           | give you two real ones and one fake one, because it doesn't
           | know what truth is and just wants to give you something
           | plausible.
           | 
           | There are creative applications for bullshit, but something
           | that cites its sources (so you can check) and doesn't
           | hallucinate things would be much more useful. Like a search
           | engine.
        
           | Drakim wrote:
           | What scares me personally is the idea that I might be
           | floating in a sea of uncanny valley content. Content that's
           | 98% human-like, but then that 2% sticks out like a nail and
           | snaps me out of it.
           | 
           | Sure, I might not be able to tell the difference the majority
           | of the time, but when I can tell the difference it's gonna
           | bother me a lot.
        
             | fakedang wrote:
             | To me, a lot of content seems to be digital marketing
             | horseshit tbh.
        
             | mistermann wrote:
             | Do you not already have this feeling on a fairly regular
             | basis? (Serious question)
        
         | normanmatrix wrote:
         | You will not. Welcome to the scary generative future.
        
           | anticristi wrote:
           | I was hoping for a "yes, we can" attitude here. :D
        
         | Agentlien wrote:
         | Deep fakes still feel quite uncanny valley to me. Even if they
          | move beyond that, convincing fake images have existed for a long
         | while.
         | 
         | As for support, I don't really see why it matters if I'm
         | talking to a clever script or an unmotivated human.
        
         | falcor84 wrote:
         | > already struggling to distinguish ... automated customer
         | request replies from genuine replies
         | 
         | I hope it's not only due to a decline in the quality of human
         | support. If we could have really useful automated support
         | agents, I for one would applaud that.
        
           | anticristi wrote:
           | I agree. As long as it is transparent that I am speaking to
           | an automated agent and I can easily escalate the issue to a
           | human that can solve my problem when the agent gets stuck.
        
         | bencollier49 wrote:
         | We'll go full circle and you'll be forced to meet people in
         | person again.
        
       | Erlich_Bachman wrote:
       | It's a shame that it has turned out to be necessary to externally
        | re-make and re-train a model that has come out of a company called
       | `OPEN`AI. Wasn't one of the founding principles of it that all of
       | the research would be available to the public? Isn't that the
       | premise on which the initial funding was secured? Best of luck to
       | Eleuther.
        
         | mhuffman wrote:
         | But I was told GPT-3 was too powerful for mere mortal hands
         | (unless you have an account!) and that it would be used for
         | hate speech and to bring about skynet.
         | 
         | How will this project avoid those terrible outcomes?
        
           | visarga wrote:
           | By putting the cat back in the bag. Oh, it's too late ...
           | useless to think about it - we can't de-invent an invention
           | or stop people from replicating. Its like that time when NSA
           | wanted to restrict crypto.
        
           | dvfjsdhgfv wrote:
           | I don't know a single intelligent person who believed this
           | argument, it simply doesn't hold up.
        
             | thelastwave wrote:
             | Lots of people "believe" that, they just prefer to downvote
             | anonymously rather than try to defend their position.
        
           | ForHackernews wrote:
           | New research has revealed that intelligence is not a
           | prerequisite for generating hate speech on social media
           | platforms.
        
         | visarga wrote:
          | It was probably a bait and switch to hire top researchers and get
         | initial funding. Now that OpenAI is a household name, they
         | don't have to pretend anymore.
        
           | b3kart wrote:
            | I buy the former; researchers might be happier knowing their
           | work potentially benefits all of humanity, not just a bunch
           | of investors. But wouldn't it be _more_ difficult to get
           | funding as a non-profit?
        
             | littlestymaar wrote:
             | It's just never going to be difficult to get funding when
             | you have Elon Musk and Sam Altman as founders (and even
             | more so when founders put one billion of their own money
             | into it).
        
               | b3kart wrote:
               | Sure, but that's OpenAI's particular set of
               | circumstances. Generally speaking I struggle to see
               | investors preferring a nebulous non-profit over a for-
               | profit with a clear path to market.
        
               | littlestymaar wrote:
               | Sure, but we're explicitly talking about OpenAI here.
        
               | b3kart wrote:
               | Of course. It's just that the comment I've been
               | responding to suggested OpenAI going the "open"/non-
               | profit route was to 1) get top researchers and 2) get
               | investment. I was arguing that this doesn't seem to
                | (generally) be a good way to get investment, but I
                | agree with you that in their case investment just
                | wasn't a consideration at all.
        
         | spiderfarmer wrote:
         | I don't really care if OpenAI offers commercial licenses as
         | long as the underlying research is truly open. This way
         | alternative options will become available eventually.
        
           | querez wrote:
            | Arguably OpenAI is one of the most closed industry AI
            | labs (among those that are still participating in the
            | research community), on par only with DeepMind (though
            | DeepMind at least publishes way more). Funnily enough,
            | FAIR and Google Brain have a vastly better track record
            | wrt. publishing not only papers but also code and models.
        
         | dave_sullivan wrote:
         | Really. OpenAI assembled some of the best minds from the deep
          | learning community. The problem isn't that they are a
          | for-profit SaaS; the problem is they lied.
        
           | thelastwave wrote:
           | And ended up making an AI service that's really good at...
           | lying.
        
         | Sambdala wrote:
         | Wild-Ass Guess (Ass-Guess) incoming:
         | 
         | OpenAI was built to influence the eventual _value chain_ of AI
         | in directions that would give the funding parties more
         | confidence that their AI bets would pay off.
         | 
          | This value chain basically revolves around AI substituting
          | for predictions and human judgement in a business process,
          | much like cloud can be (over-simply) modeled as moving
          | Capex to Opex in IT procurement.
         | 
         | They saw that, like any primarily B2B sector, the value chain
         | was necessarily going to be vertically stratified. The output
          | of the AI value chain is an input to another value chain;
          | it's not a standalone consumer-facing proposition.
         | 
         | The point of OpenAI is to invest/incubate a Microsoft or Intel,
         | not a Compaq or Sun.
         | 
         | They wanted to spend a comparatively small amount of money to
         | get a feel for a likely vision of the long-term AI value chain,
          | and weaponize selective openness to: 1) establish moats, 2)
          | encourage commodification of complementary layers which add
          | value to, or create an ecosystem around, 'their' layer(s),
          | and 3) get insider insight into who their true substitutes
          | are by subsidizing companies to use their APIs.
         | 
         | As AI is a technology that largely provides benefit by
         | modifying business processes, rather than by improving existing
         | technology behind the scenes, your blue ocean strategy will
         | largely involve replacing substitutes instead of displacing
         | direct competitors, so points 2 and 3 are most important when
         | deciding where to funnel the largest slice of the funding pie.
         | 
         | _Side Note: Becoming an Apple (end-to-end vertical integration)
         | is much harder to predict ahead of time, relies on the 'taste'
         | and curation of key individuals giving them much of the
         | economic leverage, and is more likely to derail along the way._
         | 
         | They went non-profit to for-profit after they confirmed the
          | hypothesis that they can create generalizable base models that
         | others can add business logic and constraints to and generate
         | "magic" without having to share the underlying model.
         | 
         | In turn, a future AI SaaS provider can specialize in tuning the
         | "base+1" model, then selling that value-add service to the
         | companies who are actually incorporating AI into their business
         | processes.
         | 
          | It turned out that a key advantage at the base layer is
          | just brute force and money, and further outcomes have shown
          | there doesn't seem to be an inherent ceiling to this; you
          | can just spend more money to get a model which is strictly
          | better than the last one.
         | 
         | There is likely so much more pricing power here than cloud.
         | 
         | In cloud, your substitute (for the category) is buying and
         | managing commodity hardware. This introduces a large-ish
         | baseline cost, but then can give you more favorable unit costs
         | if your compute load is somewhat predictable in the long term.
         | 
         | More importantly, projects like OpenStack and Kubernetes have
          | been desperately doing everything to commoditize the base layer
         | of cloud, largely to minimize switching costs and/or move the
         | competition over profits up to a higher layer. You also have
         | category buyers like Facebook, BackBlaze, and Netflix investing
         | heavily into areas aimed at minimizing the economic power of
         | cloud as a category, so they have leverage to protect their own
         | margins.
         | 
         | It's possible the key "layer battle" will be between the
         | hardware (Nvidia/TPUs) and base model (OpenAI) layers.
         | 
         | It's very likely hardware will win this for as long as they're
         | the bottleneck. If value creation is a direct function of how
         | much hardware is being utilized for how long, and the value
         | creation is linear-ish as the amount of total hardware scales,
         | the hardware layer just needs to let a bidding war happen, and
         | they'll be capturing much of the economic profit for as long as
         | that continues to be the case.
         | 
          | However, the hardware appears (I'm no expert though) to be
          | something that is easier to design and manufacture; it's
          | mostly a capacity problem at this point, so over time this
          | likely gets
         | commoditized (still highly profitable, but with less pricing
         | power) to a level where the economic leverage goes to the Base
         | model layer, and then the base layer becomes the oligopsony
         | buyer, and the high fixed investment the hardware layer made
         | then becomes a problem.
         | 
         | The 'Base+1' layer will have a large boom of startups and
         | incumbent entrants, and much of the attention and excitement in
          | the press will be equal parts gushing and mining schadenfreude
         | about that layer, but they'll be wholly dependent on their
         | access to base models, who will slowly (and deliberately) look
         | more and more boring apart from the occasional handwringing
         | over their monopoly power over our economy and society.
         | 
          | There will be exceptions: companies that can leverage
          | proprietary data and are large enough to build their own
          | base models in-house on that data. Those models are likely
          | to be valuable for internal AI services, preventing an
          | 'OpenAI' from having as much leverage over them and being
          | much better matched to their process needs, but they will
          | not be as generalized as the models coming from the arms
          | race of companies who see that as their primary competitive
          | advantage.
         | Facebook and Twitter are two obvious ones in this category, and
         | they will primarily consume their own models, rather than
         | expose them as model-as-a-service directly.
         | 
         | The biggest question to me is whether there's a feedback loop
         | here which leads to one clear winning base layer company
         | (probably the world's most well-funded startup to date due to
         | the inherent upfront costs and potential long-term income), or
         | if multiple large, incumbent tech companies see this as an
         | existential enough question that they more or less keep pace
         | with each other, and we have a long-term stable oligopoly of
         | mostly interchangeable base layers, like we do in cloud at the
         | moment.
         | 
         | Things get more complex when you look to other large investment
         | efforts such as in China, but this feels like a plausible
         | scenario for the SV-focused upcoming AI wars.
        
           | visarga wrote:
           | Apparently you don't need to be a large company to train
           | GPT-3. EleutherAI is using free GPU from CoreWeave, the
           | largest North American GPU miner, who agreed to this deal to
           | get the final model open sourced and have their name on it.
           | They are also looking at offering it as an API.
        
             | Sambdala wrote:
              | I think it's great they're doing this, but GPT-3 is the
              | bellwether, not the end state.
             | 
             | Open models will function a lot like Open Source does
             | today, where there are hobby projects, charitable projects,
             | and companies making bad strategic decisions (Sun open
             | sourcing Java), but the bulk of Open AI (open research and
             | models, not the company) will be funded and released
             | strategically by large companies trying to maintain market
             | power.
             | 
             | I'm thinking of models that will take $100 million to $1
             | billion to create, or even more.
             | 
             | We spend billions on chip fabs because we can project out
             | long term profitability of a huge upfront investment that
             | gives you ongoing high-margin capacity. The current
             | (admittedly early and noisy) data we have about AI models
             | looks very similar IMO.
             | 
             | The other parallel is that the initial computing revolution
             | allowed a large scale shift of business activities from
             | requiring teams of people doing manual activities,
             | coordinated by a supervisor towards having those functions
             | live inside a spreadsheet, word processor, or email.
             | 
             | This replaces a team of people with (outdated)
             | specializations with fewer people accomplishing the same
             | admin/clerical work by letting the computer do what it's
             | good at doing.
             | 
             | I think a similar shift will happen with AI (and other
             | technologies) where work done by humans in cost centers is
             | retooled to allow fewer people to do a better job at less
             | cost. Think compliance, customer support, business
             | intelligence, HR, etc.
             | 
             | If that ends up being the case, donating a few million
             | dollars worth of GPU time doesn't change the larger trends,
             | and likely ends up being useful cover as to why we
             | shouldn't be worried about what the large companies are up
             | to in AI because we have access to crowdsourced and donated
             | models.
        
           | jariel wrote:
           | This is neat, but almost no startups of any kind, even mid
           | size corps, have such complicated and intricate plans.
           | 
           | More likely: OpenAI was a legit premise, they started to run
           | out of money, MS wanted to license and it wasn't going to
           | work otherwise, so they just took the temperature with their
           | initial sponsors and staff and went commercial.
           | 
           | And that's it.
        
           | ccostes wrote:
            | I think calling this a "wild-ass guess" undersells it a
            | bit (either that or we have very different definitions of
            | a WAG). Very well thought-through and compelling case.
           | 
           | My biggest question is whether composable models are indeed
           | the general case, which you say they confirmed as evidenced
           | by the shift away from non-profit. It's certainly true for
           | some domains, but I wonder if it's universal enough to enable
           | the ecosystem you describe.
        
         | wraptile wrote:
          | OpenAI is turning out to be a total bait and switch,
          | especially when your co-founder is actively calling you out
          | on it [1].
         | 
         | Remember kids: if it's not a non-profit organization it is a
         | _for_ profit one! It was silly to expect anything else:
         | 
         | > In 2019, OpenAI transitioned from non-profit to for-profit.
         | The company distributed equity to its employees and partnered
         | with Microsoft Corporation, who announced an investment package
         | of US$1 billion into the company. OpenAI then announced its
         | intention to commercially license its technologies, with
         | Microsoft as its preferred partner [2]
         | 
         | 1 - https://edition.cnn.com/2020/09/27/tech/elon-musk-tesla-
         | bill...
         | 
         | 2 - https://en.wikipedia.org/wiki/OpenAI
        
           | person_of_color wrote:
           | So OpenAI employees get Microsoft RSUs?
        
             | unixhero wrote:
             | What is an RSU?
        
               | ourcat wrote:
               | Restricted Stock Units
        
               | agravier wrote:
               | It means restricted stock unit, and it's a kind of
               | company stock unit that may be distributed to some
               | "valued" employees. There is usually a vesting schedule,
               | and you can't do whatever you want with it.
        
             | garmaine wrote:
             | Why would they? It's a separate company.
        
           | dvfjsdhgfv wrote:
           | It will be interesting to see the attitude of Microsoft
           | towards this project in the light of their "Microsoft loves
           | open source" propaganda.
        
             | eeZah7Ux wrote:
             | Like many other companies, Microsoft loves unpaid labor.
             | 
             | Free Software is about giving freedom and security all the
             | way to the end users - rather than SaaS providers.
             | 
             | If you remove this goal and only focus on open source as a
             | development methodology you end up with something very
             | similar to volunteering for free for some large
             | corporation.
        
             | Closi wrote:
             | I don't know where people got this idea that Microsoft
             | can't participate positively in Open Source, and do that
             | sincerely, without open sourcing absolutely everything.
             | 
             | Of course they can - just because you contribute to open
             | source, and do that because you also benefit from open
             | source projects, doesn't mean you have to do absolutely
             | everything under open source.
             | 
             | Especially considering OpenAI isn't even Microsoft's IP or
             | codebase.
        
               | taf2 wrote:
               | How about when Steve Ballmer said something along the
               | lines of
               | 
               | "Linux is a cancer that attaches itself in an
               | intellectual property sense to everything it touches"
               | 
               | Pretty sure that is hostile towards open source? Linux
               | being one of the flagship projects of open source.
               | 
               | [edit] source https://www.zdnet.com/article/ex-windows-
               | chief-heres-why-mic...
        
               | Closi wrote:
                | It's hostile to the GPL licence, which means anything
                | licensed under the GPL can't be used in Microsoft's
                | proprietary products.
                | 
                | I would personally say Microsoft wasn't driven by
                | anti-open-source hate so much as being very
                | anti-competitor. Microsoft tried to compete with their
                | biggest competitor? Colour me shocked.
        
               | Daho0n wrote:
               | I don't think this should be seen in the light of "open
               | source everything" but more that many see Microsoft doing
               | open source not as part of "being good" but part of their
                | age-old "embrace, extend, extinguish" policy.
        
               | dvfjsdhgfv wrote:
               | > I don't know where people got this idea that Microsoft
               | can't participate positively in Open Source, and do that
               | sincerely, without open sourcing absolutely everything.
               | 
                | I'm not claiming that. Of course there is a place for
                | closed and open elements in their offerings. Let me
                | clarify.
               | 
                | In the past, Microsoft was very aggressive towards
                | open source. When they realized this strategy of FUD
                | brought little result, they changed their attitude
                | 180 degrees and decided to embrace it, putting
                | literal hearts everywhere.
               | 
               | Personally, I find it hypocritical. There is no
               | love/hate, just business. They will use whatever strategy
               | works to get their advantage. What I find strange is that
               | people fell for it.
        
               | Closi wrote:
               | But why on this thread then, about GPT-3? It's not even
               | their own company, IP or source to give away.
               | 
                | But even when Microsoft _can't_ open source it because
                | it's _not theirs_, we _still_ have people posting in
                | this thread that this is further evidence that
                | Microsoft is hypocritical. It sounds a lot like a form
                | of confirmation bias to me, where any evidence is used
                | as proof that Microsoft is 'anti-open-source'.
        
               | taf2 wrote:
                | I think it is because each model from OpenAI was
                | public until Microsoft became an investor.
        
               | [deleted]
        
               | pessimizer wrote:
               | I don't know where people got the idea that companies can
               | be "sincere." Sincerity is faithfully communicating your
               | mental state to others. A company's mental state can
               | change on a dime based on the decisionmaking of people
               | who rely on the company for the degree of profit it
               | generates. Any analog to sincerity that you think you see
               | can probably be eliminated by firing one person after an
               | exceptionally bad quarter (or an exceptionally good one.)
        
               | Closi wrote:
               | Sincere to me just means that you are being truthful, or
               | not trying to be deceptive.
               | 
               | And I think companies can be sincere - because companies
               | are really just groups of people and assets when you get
               | down to the nuts and bolts of it.
        
               | eeZah7Ux wrote:
               | > companies can be sincere
               | 
               | "sincere", "honest", "hypocritical" usually refers to a
                | long-term pattern. Being able to be sincere from time
                | to time is beside the point.
               | 
               | > companies are really just groups of people
               | 
               | ...with profit as their first priority.
               | 
               | For-profit companies "can be sincere" only as long as
               | it's the most profitable strategy.
        
         | JacobiX wrote:
          | It's a recurring theme in OpenAI research: they become more
          | and more closed. For instance, their latest model, DALL-E,
          | hit the headlines before the release of the paper. Needless
          | to say, the model is not available and no online demo has
          | been published so far.
        
           | cbozeman wrote:
            | Because it's winner-take-all in this research, not
            | "winner-take-some".
           | 
           | Andrew Yang talked about this and why breaking up Big Tech
           | won't work. No one wants to use the second best search
           | engine. The second best search engine is Bing and I almost
           | never go there.
           | 
           | Tech isn't like automobiles, where you might prefer a Honda
           | over a Toyota, but ultimately they're interchangeable. A
           | Camry isn't dramatically different and doesn't perform
           | dramatically better than an Accord. Whoever builds the best
           | AI "wins" and wins totally.
        
           | visarga wrote:
           | But they still released the CLIP model which is the
           | complement of DALL-E and used in the DALL-E pipeline as a
            | final filter. There are Colab notebooks with CLIP
            | floating around and even a web demo.
        
             | JacobiX wrote:
              | Thank you for this info. As you mentioned, CLIP is used
              | for re-ranking DALL-E outputs; by itself it is just a
              | classifier over (image, text) pairs.
        
         | Tenoke wrote:
         | The research is open to the public. Here's the gpt3 paper
         | https://arxiv.org/abs/2005.14165
         | 
         | Also gpt2 models and code at least were publicly released and
         | so has a lot of their work.
         | 
          | And yes, they realized they can achieve more by turning
          | for-profit and partnering with Microsoft. So true, they are
          | not fully 'open', but pretending they don't release things
          | to the public and making the constant 'more like ClosedAI,
          | amirite' comments is getting old.
        
       | avivo wrote:
       | I'd love to see an equal amount of the effort put toward
       | initiatives like this, also being put toward mitigating their
       | _extremely likely_ negative societal impacts (and putting in
       | safeguards).
       | 
       | Of course, that's not nearly as sexy.
       | 
       | Yes, there are lots of incredible positive impacts of such
       | technology, just like there was with fire, or nuclear physics.
       | But that doesn't mean that safeguards aren't _absolutely
        | critical_ if you want it to be a net win for society.
       | 
       | These negative impacts are not theoretical. They are obvious and
       | already a problem for anyone who works in the right parts of the
       | security and disinformation world.
       | 
       | We've been through all this before...
       | https://aviv.medium.com/the-path-to-deepfake-harm-da4effb541...
       | 
       | Of course, some of the same people who ignored recommendations[1]
       | for harm mitigations in visual deepfake synthesis tools (which
       | ended up being used for espionage and botnets) seem to be working
       | on this.
       | 
       | [1] e.g.
       | https://www.technologyreview.com/2019/12/12/131605/ethical-d...
        
       | mrfusion wrote:
       | It still baffles me that GPT turned out to be more than a
        | glorified Markov chain text generator. It seems we've actually
       | made it create a model of the world to some degree.
       | 
       | And we kind of just stumbled on the design by throwing massive
       | data and neural networks together?
        
         | nullc wrote:
         | You're made of _meat_ and yet you manage to be more than a
          | glorified Markov chain generator. :)
         | 
         | (I hope)
        
         | Filligree wrote:
         | It turns out that brute-force works, and the scaling curve is
         | _still_ not bending.
         | 
         | I doubt we'll ever see a GPT-4, because there are known
         | improvements they could make besides just upsizing it further,
          | but that's beside the point. If that curve doesn't bend soon
         | then a 10x larger network would be human-level in many ways.
         | 
         | (Well, that is to say. It's actually bending. Upwards.)
        
           | hntrader wrote:
           | What % of all digitized and reasonably easy-to-access text
           | data did they use to train GPT-3? I'm wondering whether the
           | current limits on GPT-n are computation or data.
        
             | kortex wrote:
             | > As per the creators, the OpenAI GPT-3 model has been
             | trained about 45 TB text data from multiple sources which
             | include Wikipedia and books.
             | 
              | It's about 400B tokens. The Library of Congress is
              | about 40M books; say 50K tokens per book, or about 2T
              | tokens, not necessarily unique.
             | 
              | I would say it's plausible that it was a decent percent
              | of the indexed text available, and even more of the
              | unique content. GPT-2 was trained on 10B tokens. Do we
              | have 20T tokens available for GPT-4? Maybe. But the
              | low-hanging fruit is definitely plucked.
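              | 
              | Rough math, as a sanity check (a sketch; the 50K
              | tokens-per-book average is an assumption):
              | 
              |   gpt3_tokens = 400e9        # ~400B training tokens
              |   books = 40e6               # ~40M Library of Congress items
              |   tokens_per_book = 50_000   # assumed average
              |   loc_tokens = books * tokens_per_book
              |   print(f"{loc_tokens:.0e}")        # 2e+12, i.e. ~2T tokens
              |   print(gpt3_tokens / loc_tokens)   # 0.2 -> ~20% of that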
        
           | mrfusion wrote:
           | So fascinating. I'd love to understand why it's working so
           | well. I guess no one knows.
           | 
            | Wouldn't GPT-4 just be more data and more parameters?
        
       | nemoniac wrote:
        | Good initiative, but tell us more about the governance. After
        | all, OpenAI was "open" until it was bought by Microsoft.
        
         | wizzwizz4 wrote:
          | No, it wasn't bought. And iirc, only GPT-3 was exclusively
          | licensed.
        
       | joshlk wrote:
       | How does the outfit intend to fund the project? OpenAI spends
       | millions on computing resources to train the models.
        
         | jne2356 wrote:
         | The cloud company CoreWeave has agreed to provide the GPU
         | resources necessary.
        
         | stellaathena wrote:
         | Hey! One of the lead devs here. A cloud computing company
         | called CoreWeave is giving us the compute for free in exchange
         | for us releasing it. We're currently at the ~10B scale and are
          | working on understanding datacenter-scale parallelized training
         | better, but we expect to train the model on 300-500 V100s for
         | 4-6 months.
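          | 
          | For a rough sense of the timescale (a sketch using the
          | common ~6*N*D FLOPs rule of thumb; the token count and
          | utilization figures below are assumptions, not our actual
          | numbers):
          | 
          |   params = 175e9                 # GPT-3-sized model
          |   tokens = 300e9                 # ~300B training tokens
          |   total_flops = 6 * params * tokens   # ~3.15e23 FLOPs
          |   gpus = 500
          |   peak = 125e12                  # V100 fp16 tensor peak, FLOPS
          |   util = 0.4                     # assumed sustained utilization
          |   days = total_flops / (gpus * peak * util) / 86400
          |   print(days)                    # ~146 days, i.e. ~5 months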
        
         | tmalsburg2 wrote:
         | I imagine recreating the model will be computationally cheaper
         | because they will not have to sift through the same huge
         | hyperparameter space as the initial GPT-3 team had to.
        
           | thelastwave wrote:
           | Why is that?
        
           | jne2356 wrote:
           | This is not true. The OpenAI team only trained one full-sized
           | GPT-3, and conducted their hyperparameter sweep on
           | significantly smaller models (see:
           | https://arxiv.org/abs/2001.08361). The compute savings from
           | not having to do the hyperparameter sweep are negligible and
           | do not significantly change the feasibility of the project.
        
       | 2Gkashmiri wrote:
        | So how much money would it take to rebuild this FOSS
        | alternative? And could distributed compute like SETI@home
        | help? If it can be done, and I hope it is, what benefit would
        | the original proprietary one have over this? Licensing?
        
         | astrange wrote:
         | OpenAI will execute the original one for you. If you can get an
         | account, anyway.
        
         | jne2356 wrote:
         | EleutherAI has already secured the resources necessary.
         | 
         | They get the seti@home suggestion a lot. There's a section in
         | their FAQ that explains why it's infeasible.
         | 
         | https://github.com/EleutherAI/info
        
       | pjfin123 wrote:
       | What does the future of open-source large neural nets look like?
       | My understanding is GPT-3 takes ~600GB of GPU memory to run
       | inference. Does an open source model just allow you a choice of a
       | handful of cloud providers instead of one?
        
         | aabhay wrote:
         | Open source doesn't mean that everyone will be rolling their
         | own. It means that lots of players will start to offer
         | endpoints with GPT-X, perhaps bundled with other services. It
         | is good for the market.
        
       | mirekrusin wrote:
        | I'd gladly contribute (power and) a few of the idle GTX cards
        | I have to a public peer/volunteer/SETI@home-like project if
        | the resulting snapshot(s) are available publicly or to
        | registered, active contributors.
        
         | Voloskaya wrote:
          | SETI@home-style distributed computation is not suitable for
          | training something like GPT-3. Unlike SETI, the unit of work
          | a node can do before needing to share its output with the
          | next node is really small, so very fast interconnect between
          | the nodes is needed (InfiniBand and NVLink are used in the
          | clusters that train it). It would probably take a decade to
          | train such a model over the regular internet.
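          | 
          | Back-of-envelope (a sketch; assumes naive data parallelism
          | with fp16 gradients and a 100 Mbit/s home connection):
          | 
          |   params = 175e9
          |   grad_bytes = params * 2        # fp16 gradients: ~350 GB/step
          |   home_bw = 100e6 / 8            # 100 Mbit/s -> 12.5 MB/s
          |   print(grad_bytes / home_bw / 3600)   # ~7.8 hours per step
          |   nvlink_bw = 300e9              # ~300 GB/s NVLink-class link
          |   print(grad_bytes / nvlink_bw)  # ~1.2 s per step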
        
           | mirekrusin wrote:
            | Are there any models/research optimised for working on
            | these kinds of small, distributed batches that would fit
            | in, e.g., ~10GB of commodity GPU memory?
        
           | mitjam wrote:
            | Maybe a case for a community colocation cloud, where I, a
            | consumer, can buy a system and colocate it in a large data
            | center with great internal networking. Edit: typo
        
             | leogao wrote:
              | Handling heterogeneous (and potentially untrustworthy)
             | systems also adds overhead, not to mention that buying
             | hardware in bulk is cheaper, so it makes the most sense
             | just to raise the money and buy the hardware.
        
               | mirekrusin wrote:
                | The problem is potentially solvable, as generating
                | solutions takes a lot of GPU time and verifying them
                | is very fast. Acquiring input data may be a problem,
                | but should be possible with dedicated models for this
                | type of computation.
        
       | dmingod666 wrote:
        | With OpenAI being corporate-controlled and not really 'open',
        | is Neo a nod to 'The Matrix'?
        
       | [deleted]
        
       | habitue wrote:
       | Is it standard to prune these kinds of large language models once
       | they've been trained to speed them up?
        
       | dvfjsdhgfv wrote:
       | If they succeed, Eleuther should change their name to
       | ReallyOpenAI.
        
         | stellaathena wrote:
         | Or for extra irony, ClosedAI
        
       | techlatest_net wrote:
        | Is there any real justification behind this fear of the closed
        | nature of OpenAI, or is this just frustration coming out? We
        | had this debate of closed vs. open source 20 years back, and
        | eventually open source won for various reasons. Won't those
        | same reasons apply to this situation? If so, then why are
        | people worried? What is different this time?
        
         | pmontra wrote:
         | The cost.
         | 
         | Closed source and open source developers use the same
         | $300-3,000 laptops / desktops. Everybody can afford them.
         | 
         | Training a large model in a reasonable time costs much more.
         | According to https://lambdalabs.com/blog/demystifying-gpt-3/
         | the cost of training GPT-3 was $4.6 million. Multiply it by the
         | number of trial and errors.
         | 
          | Of course we can't expect that something that costs tens or
          | hundreds of millions will be given away for free, or to be
          | able to rebuild it without some collective training effort
          | that distributes the cost across at least thousands of
          | volunteers.
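          | 
          | For scale, a sketch of where such estimates come from (the
          | throughput and $/GPU-hour figures are assumptions, and
          | published estimates vary):
          | 
          |   total_flops = 6 * 175e9 * 300e9   # ~3.15e23 training FLOPs
          |   per_gpu = 125e12 * 0.4            # assumed sustained FLOPS
          |   gpu_hours = total_flops / per_gpu / 3600
          |   print(gpu_hours * 1.5 / 1e6)      # ~$2.6M at $1.50/GPU-hour
          | 
          | Lower assumed utilization or pricier instances push this
          | toward the $4.6 million figure above.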
        
           | qayxc wrote:
            | This. Plus results are increasingly opaque. Training data
            | is private, so it's impossible to even try to recreate
            | results, validate methods, or find biases/failure cases.
        
           | jne2356 wrote:
            | OpenAI only trained the full-sized GPT-3 once. The
            | hyperparameter sweep was conducted on significantly
            | smaller models (see: https://arxiv.org/abs/2001.08361)
        
       | ttctciyf wrote:
        | I love the name's play on Greek _Eleutheria_ - freedom,
        | liberty!
        
       | Havoc wrote:
        | Would be good if this could be decentralized bittorrent/BOINC
        | style somehow.
       | 
       | Wouldn't mind contributing some horsepower
        
         | jne2356 wrote:
         | They get this suggestion a lot. There's a section in their FAQ
         | that explains why it's infeasible.
         | 
         | https://github.com/EleutherAI/info
        
       | onenightnine wrote:
       | this is beautiful. why not? maybe we can make something
       | eventually better than the now closed source version
        
       | Mizza wrote:
       | Serious question: is there a warez scene for trained models yet?
       | 
        | (I don't know how the model is accessed - are users of
        | mainline GPT-3 given a .pb and a stack of NDAs, or do they
        | have to access it through an access-controlled API?)
       | 
       | Wherever data is desired by many but held by a few, a pirate crew
       | inevitably emerges.
        
         | jokowueu wrote:
         | I think this also might be an interest to you
         | 
         | https://the-eye.eu/public/AI/pile_preliminary_components/
        
           | MasterScrat wrote:
           | Those are datasets though, not models.
        
           | notretarded wrote:
           | Not really
        
         | Voloskaya wrote:
         | Checkpoint is not shared with customers, you only get access to
         | an API endpoint.
        
         | vessenes wrote:
          | GPT-3 users are given an API link which routes to Azure;
          | it's a full black box.
        
         | exhilaration wrote:
         | It's via API https://openai.com/blog/openai-api/
        
         | kordlessagain wrote:
         | The model is huge and is currently run in the cloud on many
         | machines.
        
           | mortehu wrote:
           | It's only 175 billion parameters, so presumably it can fit on
           | a single computer with 1024 GB RAM.
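            | 
            | The arithmetic (a sketch; bytes per parameter depend on
            | the precision actually used):
            | 
            |   params = 175e9
            |   print(params * 4 / 1e9)   # fp32 weights: ~700 GB
            |   print(params * 2 / 1e9)   # fp16 weights: ~350 GB
            |   # either fits in 1024 GB, before activations and overhead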
        
             | Voloskaya wrote:
                | On CPU the latency would be absolutely prohibitive,
                | to the point of being useless.
        
               | typon wrote:
               | For training yes, but not for inference.
        
               | leogao wrote:
               | The inference latency would also be prohibitive.
        
               | kordlessagain wrote:
               | From 2019: https://heartbeat.fritz.ai/deep-learning-has-
               | a-size-problem-...
               | 
               | > Earlier this year, researchers at NVIDIA announced
               | MegatronLM, a massive transformer model with 8.3 billion
               | parameters (24 times larger than BERT)
               | 
               | > The parameters alone weigh in at just over 33 GB on
               | disk. Training the final model took 512 V100 GPUs running
               | continuously for 9.2 days.
               | 
               | Running this model on a "regular" machine at some useful
               | rate is probably not possible at this time.
        
               | Voloskaya wrote:
                | Inference on GPU is already very slow on the
                | full-scale non-distilled model (in the 1-2 sec range
                | iirc); on CPU it would be an order of magnitude
                | slower.
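                | 
                | One way to see it (a sketch; assumes autoregressive
                | inference is memory-bandwidth-bound, i.e. each new
                | token needs a full pass over the weights):
                | 
                |   weights_gb = 350       # fp16 weights, 175B params
                |   cpu_bw = 100           # GB/s, typical server DRAM
                |   gpu_bw = 900           # GB/s, V100 HBM2
                |   print(weights_gb / cpu_bw)   # ~3.5 s per token, CPU
                |   print(weights_gb / gpu_bw)   # ~0.4 s per token, GPU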
        
             | stingraycharles wrote:
             | Wouldn't you need this model to be in GPU RAM instead of
             | regular RAM, though?
        
       ___________________________________________________________________
       (page generated 2021-01-18 23:00 UTC)