[HN Gopher] GPT-Neo - Building a GPT-3-sized model, open source ...
___________________________________________________________________
 
GPT-Neo - Building a GPT-3-sized model, open source and free
 
Author : sieste
Score  : 638 points
Date   : 2021-01-18 09:13 UTC (13 hours ago)
 
(HTM) web link (www.eleuther.ai)
(TXT) w3m dump (www.eleuther.ai)
 
| rexreed wrote:
| Once Microsoft became the sole and exclusive licensee of GPT-3, a
| credible open source effort was bound to emerge, and I hope it
| really is an effective alternative.
| fakedang wrote:
| Eleuther works with Google. I'd prefer the Microsoft demon to the
| Google demon.
| jne2356 wrote:
| The resources for the GPT-3 replication will be provided by the
| cloud company CoreWeave.
| kashyapc wrote:
| Incidentally, over the weekend I listened to a two-hour-long
| presentation[1] by the cognitive scientist Mark Turner[2]. The
| talk's main goal was to explore the question _"Does cognitive
| linguistics offer a way forward for second-language teaching?"_
|
| During the Q&A, Turner explicitly mentions GPT-3 (that's when I
| first heard of it) as a "futuristic (but possible) language-
| learning environment" that is likely to be a great boon for
| second-language learners. One of the appealing points seems to be
| that "conversation [with GPT-3] is not scripted; it keeps going
| on any subject you like", thus allowing you to simulate some bits
| of the gold standard (immersive language learning in the real
| world).
|
| As an advanced Dutch learner (my fourth language), I'm curious
| about these approaches. And glad to see this open source model.
| (It is beyond ironic that the so-called "Open AI" turned out to
| have dubious ethics.)
|
| [1] https://www.youtube.com/watch?v=A4Q977p8PfQ
|
| [2] Check out the _excellent_ book he co-authored, _"Clear and
| Simple as the Truth"_ -- it has valuable insights on improving
| writing, based on some robust research.
| Jack000 wrote:
| Hope there will be a distilled or approximate-attention version
| so it can be run on consumer GPUs.
| wumms wrote:
| https://github.com/EleutherAI/gpt-neo (Couldn't find it on the
| website)
| matteocapucci wrote:
| Do you know how one can donate?
| newswasboring wrote:
| The intention behind it is pretty good. Best of luck to them.
|
| I wonder if I can donate computing power to this remotely, like
| the old SETI or protein-folding things: use idle CPU to calculate
| for the network. Otherwise the estimates I have seen of how much
| it would take to train these models are enormous.
| mryab wrote:
| Not directly related, but the Learning@home [1] project aims to
| achieve precisely that goal of public, volunteer-trained neural
| networks. The idea is that you can host separate "experts," or
| parts of your model (akin to Google's recent Switch Transformers
| paper), on separate computers.
|
| This way, you never have to synchronize the weights of the
| entire model across the participants -- you only need to send
| the gradients/activations to a set of peers. Slow connections
| are mitigated with asynchronous SGD, and unreliable/disconnected
| experts can be discarded, which makes it more suitable for
| Internet-like networks.
|
| Disclaimer: I work on this project. We're currently implementing
| a prototype, but it's not yet GPT-3-sized. Some issues like LR
| scheduling (crucial for Transformer convergence) and shared
| parameter averaging (for gating etc.) are tricky to implement
| for decentralized training over the Internet.
|
| [1] https://learning-at-home.github.io/
| echelon wrote:
| Do you have a personal Twitter account I can follow? Your career
| is one I'd like to follow.
| teruakohatu wrote:
| TensorFlow supports distributed training with a client-server
| model.
| newswasboring wrote:
| Does it also solve the problem of everyone having different
| hardware?
| londons_explore wrote:
| It does.
|
| For most models, your home broadband would be far too slow,
| though.
| emteycz wrote:
| What about some kind of sharding -- parts of the computation
| that could be executed in isolation for a longer period of time?
| Filligree wrote:
| An ongoing research problem. OpenAI would certainly like being
| able to use smaller GPUs, instead of having to fit the entire
| model into one.
| jne2356 wrote:
| GPT-3 does not fit in any one GPU that exists at present. It's
| already spread out across multiple GPUs.
| newswasboring wrote:
| Is it because they will have to communicate back errors during
| training? I forgot that training these models is more of a
| global task than protein folding. In that sense this is less
| parallelizable over the internet.
| londons_explore wrote:
| Yes, and also activations if your GPU is too small to fit the
| whole model. The minimum useful bandwidth for that stuff is a
| few gigabits...
| jne2356 wrote:
| They get this suggestion a lot. There's a section in their FAQ
| that explains why it's infeasible.
|
| https://github.com/EleutherAI/info
| chillee wrote:
| The primary issue is that large-scale GPU training is dominated
| by communication costs. Since, to some approximation, things
| need to be synchronized after every gradient update, it quickly
| becomes infeasible to increase the communication cost.
| danwills wrote:
| Yeah! Sounds great. If I could easily run a SETI-at-home-style
| thing to contribute to training a (future) model similar to
| GPT-x, but with the result freely available to play with, I
| reckon I'd do it. Could even be made into a game, I'd say! I am
| totally aware that GPT-3 itself can't be run on a regular
| workstation, but maybe hosting for instances of models from the
| best/most-interesting training runs could be worked out by
| crowd-funding?
| colobas wrote:
| I was just gonna propose something like this. Democratizing
| large ML models.
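[Editor's note] The bandwidth objection raised above can be sanity-checked with back-of-the-envelope arithmetic. A minimal sketch, assuming a GPT-3-scale parameter count, fp16 gradients (2 bytes each), and an illustrative 20 Mbit/s home uplink -- all numbers are assumptions for illustration, not EleutherAI's figures:

```python
def sync_time_seconds(n_params, bytes_per_param, uplink_mbps):
    """Time to ship one full set of gradients over a given uplink."""
    total_bytes = n_params * bytes_per_param
    uplink_bytes_per_s = uplink_mbps * 1e6 / 8
    return total_bytes / uplink_bytes_per_s

# GPT-3-sized model: 175e9 parameters, fp16 gradients,
# over a hypothetical 20 Mbit/s home uplink.
t = sync_time_seconds(175e9, 2, 20)
print(f"{t / 3600:.1f} hours per gradient sync")  # -> 38.9 hours
```

With synchronization needed after (approximately) every update, a single full gradient exchange already takes over a day and a half on that link, which is why volunteer training needs schemes like the expert-sharding approach mentioned above rather than naive data parallelism.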
| hooande wrote:
| In my experience, the output from GPT-3, DALL-E, et al. is
| similar to what you get from googling the prompt and stitching
| together snippets from the top results. These transformers are
| trained on "what was visible to Google", which limits their
| utility.
|
| I think of the value proposition of GPT-X as "what would you do
| with a team of hundreds of people who can solve arbitrary
| problems only by googling them?". And honestly, not a lot of
| productive applications come to mind.
|
| This is basically the description of a modern content farm. You
| give N people a topic, e.g. "dating advice", and they'll use
| Google to put together different ideas, sentences and paragraphs
| to produce dozens of articles per day. You could also write very
| basic code with this, similar to googling a code snippet and
| combining the results from the first several Stack Overflow
| pages that come up (which, incidentally, is how I program now).
| After a few more versions, you could probably use GPT to produce
| fiction that matches the quality of the average self-published
| ebook. And DALL-E can come up with novel images in the same way
| that a graphic designer can visually merge the Google image
| results for a given query.
|
| One limitation of this theoretical "team of automated googlers"
| is that the content they can "search" is cached as of the date
| of the last GPT model update. Right now the big news story is
| the Jan 6th, 2021 insurrection at the US Capitol. GPT-3 can
| produce infinite bad articles about politics, but won't be able
| to say anything about current events in real time.
|
| I generally think that GPT-3 is awesome, and it's a damn shame
| that "Open"AI couldn't find a way to actually be open.
| At this point, it seems like a very interesting technology that
| is still in desperate need of a killer app.
| devonkim wrote:
| The current problem is that we don't have a reliable, scalable
| way to merge the features of knowledge engines, which have
| ontological relationships between entities, with generative
| engines, which are good at producing natural-looking or
| -sounding qualitative output. There's certainly research going
| on to join them together, but it's just not getting the kind of
| press that the comparatively much easier generative and
| pattern-recognition work gets. The whole "General AI Complete"
| class of problems seems to be the ones that try to combine
| multiple areas of more specific AI systems, but that's exactly
| where more practical problems for the average person arise.
| glebshevchuk wrote:
| Agreed, but that's because they're hard to integrate: one is
| concerned with enumerating all the facts that humans know about
| (a la Cyc) and the other is concerned with learning those
| directly from data. Developing feedback systems that combine the
| two would be quite exciting.
| dnautics wrote:
| > And honestly, not a lot of productive applications come to
| mind
|
| So, I can't go into too many details, since I haven't started
| yet, but I'm thinking about mixing a flavor of GPT with DETR for
| OCR tasks where the model must then predict categorization
| vectors, the chief difficulty being that it must identify and
| classify arbitrary-length content in the OCR.
| visarga wrote:
| > honestly, not a lot of productive applications come to mind
|
| Not so convincing when you enumerate so many applications
| yourself.
|
| > but won't be able to say anything about current events
|
| There are variants that use transformer + retrieval, so they get
| unlimited memory that can be easily extended.
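[Editor's note] The transformer + retrieval idea is easy to sketch: before generating, look up relevant documents in an external store and prepend them to the prompt, so the model's "memory" can be extended simply by adding documents, without retraining. A toy sketch, with word overlap standing in for a real learned retriever -- the documents and query are purely illustrative:

```python
def tokenize(text):
    """Lowercased word set, stripped of trailing punctuation."""
    return {w.strip(".,?!").lower() for w in text.split()}

def retrieve(query, documents, k=1):
    """Rank documents by Jaccard word overlap with the query."""
    q = tokenize(query)

    def score(doc):
        d = tokenize(doc)
        return len(q & d) / len(q | d)

    return sorted(documents, key=score, reverse=True)[:k]

# An "external memory" that can be extended at any time,
# independently of the frozen model weights:
docs = [
    "GPT-3 has 175 billion parameters.",
    "The Capitol riot took place on January 6, 2021.",
    "The Eiffel Tower is in Paris.",
]

context = retrieve("What happened at the Capitol?", docs, k=1)
prompt = "\n".join(context) + "\nQ: What happened at the Capitol?\nA:"
# `prompt` would then be fed to the language model for generation.
```

Real retrieval-augmented systems replace the word-overlap scorer with dense embeddings and nearest-neighbor search, but the control flow is the same: retrieve, concatenate, generate.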
| gxqoz wrote:
| I've mentioned this in another thread, but a GPT-3 that could
| reliably generate quizbowl questions like the ones on
| https://www.quizbowlpackets.com would be great in this domain.
| My experience with it indicates it's nowhere near being able to
| do this, though.
| pydry wrote:
| Content farms are hardly a productive application.
| visarga wrote:
| You missed the forest for the trees. If you got a tool that can
| use Stack Overflow to solve simple programming tasks, or
| generally solve any simple task with Google, then you're sitting
| on a gold mine.
| notahacker wrote:
| That's a big _if_ though.
|
| GPT-3 is much more _interesting autocomplete based on most
| commonly used patterns_ than something which figures out that
| Problem X has a lot of conceptual similarities with Solved
| Problem Y, so it can just reuse the code example with some
| different variable names.
| jokethrowaway wrote:
| Yes and no.
|
| It may be useful for hiring fewer low-skilled employees and
| keeping a few senior ones who take input from the machine and
| decide what to keep and what to throw away. I'm not sure whether
| a senior engineer would be more productive patching up code
| written by a bot or writing it from scratch. It's going to be a
| hard sell while you still need human supervisors.
|
| You can't trust a machine that can't reason about code
| implementation, or even content creation. You need a human to
| supervise, or a better machine.
|
| We already have AI-based auto-completion for code; GPT-3 can be
| useful for that (but at what cost? Storing a huge model on your
| disk, or making a slow / unsafe HTTP request to the cloud?)
| polynomial wrote:
| > if a senior engineer would be more productive patching up code
| written by a bot or writing it from scratch.
|
| I have no doubt writing from scratch would win hands down. The
| main reason we patch wonky legacy code is because it's already
| running and depended on.
| If you remove that as a consideration, a senior engineer writing
| the equivalent code (rather than debugging code generated
| randomly from Google searches) would, IMO, be more efficient and
| produce a higher-quality program.
| schaefer wrote:
| I feel your stance [1] is demonstrably false, via two
| challenges:
|
| 1) Please play a winning game of Go against AlphaZero, just by
| googling the topic.
|
| 2) Next, please explain how AlphaZero's games could forever
| change Go opening theory [2] without any genuine creativity.
|
| [1] That "the output from GPT-3, DALL-E, et al is similar to
| what you get from googling the prompt and stitching together
| snippets from the top results."
|
| [2] "Rethinking Opening Strategy: AlphaGo's Impact on Pro Play"
| by Yuan Zhou
| ravi-delia wrote:
| OP was clearly not talking about AlphaZero, a different
| technology made by different people for a different purpose.
| Instead, they were noting that despite displaying some truly
| excellent world modeling, GPT-3 is _trained_ on data that
| encourages it to vomit up rehashes. It's very possible that the
| next generation will overcome this and wind up completely
| holding together long-run concepts and recursion, at least if
| scaling parameters keeps working, but for now it is a real
| limitation.
|
| GPT-3 writes like a sleepy college student with 30 minutes
| before the due date: a shockingly complete grasp of _language_,
| but perhaps not a complete understanding of content. That's not
| just an analogy; I am a sleepy college student. When I write an
| essay without thinking too hard, it displays exactly the errors
| that GPT-3 makes.
| tbenst wrote:
| GPT-3 can't play Go.
| Tenoke wrote:
| It almost definitely can to some extent, given that GPT-2 could
| play chess [0].
|
| 0. https://slatestarcodex.com/2020/01/06/a-very-unlikely-
| chess-...
| rich_sasha wrote:
| I think this is exactly right, and indeed this is a lot of the
| value.
"Content-generation" is already a thing, and yes it | doesn't need to make much sense. Apparently people who read it | don't mind. | throwaway6e8f wrote: | People don't read it, search engines do. | masswerk wrote: | BTW, we should mandatorily tag generated content for search | engines in order to exclude it from future training sets. | hundchenkatze wrote: | Apart from that, hopefully the people building training | sets use gltr or something similar to prevent training on | generated text. | | http://gltr.io/ | ape4 wrote: | Or not even googling, but pre-googling. Using its predictive | typing in the text box at google.com Because you are giving to | something to complete. | Lucasoato wrote: | > I think of the value proposition of GPT-X as "what would you | do with a team of hundreds of people who can solve arbitrary | problems only by googling them?". And honestly, not a lot of | productive applications come to mind. | | Damn, this could replace so many programmers, we're doomed! | throwaway_6142 wrote: | *realizing GPT-3 was probably created by programmers who's | job really is mostly googling for stackoverflow answers* | | #singularity | crdrost wrote: | We were worried that the singularity was going to involve | artificial intelligences that could make themselves | smarter, and we were underwhelmed when it turned out to be | neural networks that started to summarize Stack Overflow | neural network tips, to try to optimize themselves, | instead. | | GPT-[?]: still distinguishable from human advice, but it | contains a quadrillion parameters and nobody knows how | exactly it's able to tune them. | yowlingcat wrote: | Curses. We've been found out. | napier wrote: | >"what would you do with a team of hundreds of people who can | solve arbitrary problems only by googling them?" | | What would you do with a team of hundreds of people who can | instantly access an archive comprising the sum total of | digitized human knowledge and use it to solve problems? 
| hooande wrote:
| We have that now; it's called googling. You could easily hire
| 100 people to do that job, but you'd have to pay them at least
| $15/hr in the US. Say equivalent GPT-3 servers cost a fraction
| of that. How do you make money with that resource?
| eru wrote:
| Well, they can use it to write text. Not to solve problems
| directly.
| orm wrote:
| GPT-3 is trained on text prediction, and there's been a lot of
| commentary about the generation aspect, but some of the
| applications that excite me most are not necessarily about the
| generation of text. Instead, GPT-3 (and other language models)
| create very useful vector representations of natural language as
| a side effect, which can then be used for other tasks with much
| less data, or without much extra data. Using the text-prediction
| task as a way to supervise learning this representation, without
| having to create an expensive labelled dataset, is very helpful,
| and not just for language tasks. See for example the CLIP work
| that came out recently for image classification, which uses
| captions to supervise training. There is other work referred to
| in that blog post that also exploits captions or descriptions in
| natural language to help understand images better. More
| speculatively, being able to use natural language to supervise
| or give feedback to automated systems that have little to do
| with NLP seems very, very useful.
| hooande wrote:
| "More speculatively, being able to use natural language to
| supervise or give feedback to automated systems that have little
| to do with NLP seems very, very useful."
|
| I agree with this, and it isn't entirely speculative. One of the
| most useful applications I have seen that goes beyond googling
| is generating CSS using natural language, e.g., "change the
| background to blue and put a star at the top of the page".
| There are heavily sample-selected demos of this on Twitter right
| now:
| https://twitter.com/sharifshameem/status/1282676454690451457...
|
| This is definitely practical, though I wouldn't design my
| corporate website using this. Could be useful if you need to
| make 10 new sites a day for something with SEO or domains.
| theonlybutlet wrote:
| I work in a niche sector of the insurance industry. Based on
| what it can already do, I can see it doing half my job with
| basically no learning curve for the user. Based on this alone, I
| could see it reducing headcount in the company and sector by 5%.
| This is massive when you consider the low margins in the
| industry and the high cost of "skilled" staff.
| mistermann wrote:
| > I think of the value proposition of GPT-X as "what would you
| do with a team of hundreds of people who can solve arbitrary
| problems only by googling them?". And honestly, not a lot of
| productive applications come to mind.
|
| If I were Xi Jinping, I would use it to generate arbitrary
| suggestions _for consideration by_ my advisory team, as I
| develop my ongoing plan for managing The Matrix.
| darepublic wrote:
| It's not even commercially available AFAIK; it's only been made
| available to _some_ developers who applied for access, through a
| completely opaque process -- this after soliciting application
| ideas from every applicant. So I heartily support finding a way
| to truly open-source it.
| comboy wrote:
| Have you tried conversing with it, after a few lines of setting
| a proper context? Like two scientists talking, or something like
| that. It can provide very interesting outputs that are not
| googlable.
|
| Yes, every time you see something that for a human obviously
| doesn't make sense, it makes you dismiss it. You would look at
| that output differently, though, if you were talking with a
| child. Just like a child can miss some information, making it
| say something ridiculous, it may miss some pattern connections.
|
| But have you ever observed carefully how we connect patterns and
| make sentences? Our highly sophisticated discussions and
| reasoning are just pattern matching. The most prominent
| patterns, ordered in time, are also known as consciousness.
|
| Watch Hacker News comments and notice how, after somebody uses a
| rare adjective or cluster of words, more commenters tend to use
| it without even paying conscious attention to it.
|
| Long story short, give it a try and see examples of what people
| have already done with it, even in its limited form.
|
| To me you are looking at an early computer and saying that it's
| not doing anything that a bunch of people with calculators
| couldn't do.
| tda wrote:
| Did anyone try to ask GPT-3 something about the insurrection? I
| can't find any free/open text prompts for GPT-3; this is what
| GPT-2 has to say:
|
| _After Trump lost the 2020 election his supporters stormed the
| Capitol_
|
| > They smashed windows, tore apart the Republican Party's
| headquarters, destroyed a statue of Ronald Reagan and set fire
| to the Texas statehouse.
|
| > A month later, after the Democrats took control of the Senate
| and the House, Trump's supporters stormed their own offices.
| They attacked the offices of California Democratic Senator
| Kamala Harris and New York Democratic Representative Joe
| Crowley. They ransacked offices in the Minnesota and Texas
| statehouses and burned down the Republican Party headquarters in
| Tennessee. The damage was estimated at
| throwaway_6142 wrote:
| my god it should be called CNN-2
| notahacker wrote:
| You can play with GPT-3 in a custom world at AIDungeon.io. The
| responses are biased towards giving you RPG second-person
| narrative, but the corpus of data, mastery of syntax, and more
| uncertain grasp of events and relationships are all there.
|
| Example with the prompt _You are Donald Trump. The recent
| election results have been a disappointment to you_.
|
| https://pastebin.com/dSYZypCw
|
| Props for turns of phrase like "Your opponent is a typical
| liberal. He hails from the right wing of the Democratic Party,
| but has been trying to appeal to the left to gain more
| support.", but poor marks for apparently not having grasped how
| elections work. (There's a joke in there somewhere.)
|
| If you don't pick a custom world and your own prompt, you get
| something more like this:
|
| > You are Donald Trump, a noble living in the kingdom of Larion.
| You are awakened by a loud noise outside the gate. You look out
| the window and see a large number of orcish troops on the road
| outside the castle.
|
| I'd like 'orcish troops' better if I thought it was inspired by
| media reports of Capitol events rather than a corpus of RPGs.
| visarga wrote:
| GPT-3:
|
| > The Trump supporters were armed with guns and knives. They
| were also carrying torches. They were chanting "Trump 2020" and
| "Build the Wall." The Trump supporters were also chanting "Lock
| her up." But they were referring to Hillary Clinton.
| danmur wrote:
| This is hilarious, more please :)
| dvfjsdhgfv wrote:
| In science, some amazing discoveries are made years or even
| centuries before practical applications for them are found. I
| believe in humanity; sooner or later we'll find some actually
| useful applications for GPT-X.
| Filligree wrote:
| It's a wonderful co-writer of fiction, for one. Maybe the better
| authors wouldn't need it, but as for everyone else -- take a
| look at https://forums.sufficientvelocity.com/threads/generic-
| pawn-t..., and compare the first couple of story posts to the
| last few.
|
| One of the ways in which people get GPT-3 wrong is that they
| give it a badly worded order and get disappointed when the
| output is poor.
|
| It doesn't work with orders. It takes a lot of practice to work
| out what it does well with.
| It always imitates the input, and it's great at matching style
| -- and it knows good writing as well as bad, but it can't ever
| write any better than the person using it. If you want to write
| a good story with it, you need to already be a good writer.
|
| But it's wonderful at busting writer's block, and at writing
| _differently_ than the person using it.
| tigerBL00D wrote:
| I don't necessarily see the "team of automated googlers" as a
| fundamental or damning problem with GPT-like approaches. First,
| I think people may have a lot fewer truly original ideas than
| they are willing to admit. Original thought is sought after and
| celebrated in the arts as a rare commodity. But unlike in the
| arts, where there are almost no constraints, when it comes to
| science or engineering almost every incremental step is of the
| form Y = Fn(X0,..,Xn), where X0..Xn are widely known and proven
| to be true. With sufficient logical reasoning and/or
| experimental data, after numerous peer reviews, we can accept
| Fn(...) to be a valid transform, and Y becomes Xn+1, etc. Before
| the internet or Google, one had to go to a library and read
| books and magazines, or ask other people, to find the inputs
| from which new ideas could be synthesized. I think GPT-like
| stuff is a small step towards automating and speeding up this
| general synthesis process in the post-Google world.
|
| But if we are looking to replace end-to-end intelligence at
| scale, it's not just about synthesis. We need to also automate
| the peer-review process so that its bandwidth is matched to the
| increased rate of synthesis. Most good researchers and engineers
| are able to self-critique their work (and the degree to which
| they can do that well is really what makes one good, IMHO). And
| then we rely on our colleagues and peers to review our work and
| form a consensus on its quality. Currently GPT-like systems can
| easily overwhelm humans with such peer-review requests.
| Even if a model is capable of writing the next great literary
| work, predicting exactly what happened on Jan 6, or formulating
| new laws of physics, the sheer amount of crap it will produce
| alongside makes it very unlikely that anyone will notice.
| Keyframe wrote:
| A "team of automated googlers" where Google is baked in. Google
| results, and the content behind them, change. Meaning GPT would
| have to be updated as well. Could be a cool Google feature, a
| service.
| breck wrote:
| I call it the "Prior Units" theorem. Given that you are able to
| articulate an idea useful to many people, there exist prior
| units of that idea. The only ways, then, to come up with a "new
| idea" are to come up with an idea useful only to yourself
| (plenty of those) or to small groups, or to translate an old
| idea to a new language.
|
| The reason for this is that your adult life consists of just a
| tiny, tiny, tiny fraction of the total time of all adults, so if
| an idea is relevant to more people, the odds decrease
| exponentially that no one thought of it before.
|
| There are always new languages, though, so a great strategy is
| to take old ideas and bring them to new languages. I count new
| high-level non-programming languages as new languages as well.
| PaulHoule wrote:
| Art (music, literature, ...) involves satisfaction of
| constraints. For instance, you need to tune your guitar like the
| rest of the band, write 800 words like the editor told you, tell
| a story with a beginning, middle, and end, and hopefully not use
| the cheap red pigments that were responsible for so many white,
| blue, and gray flags I saw in December 2001.
| anticristi wrote:
| I love the initiative, but I'm starting to get scared of what a
| post-GPT-3 world will look like. We are already struggling to
| distinguish fake news from real news, automated customer-request
| replies from genuine replies, etc. How will I know that I'm
| having a conversation with a real human in the future?
|
| On the other hand, the prospect of having an oracle that answers
| all trivia, fixes spelling and grammar, and allows humans to
| focus on higher-level information processing is interesting.
| visarga wrote:
| > How will I know that I have a conversation with a real human
| in the future?
|
| This problem should be solved with cryptography, not by banning
| large neural nets.
| namelosw wrote:
| It's likely to be bad. Such as:
|
| Massively plagiarized articles, where the search engine probably
| has no way to identify which is the original content. It's like
| rewriting everything on the internet in your own words; this may
| leave the internet filled with this kind of garbage.
|
| Reddit and similar platforms filled with bots spouting bullshit
| all the time that is hard for humans to identify in the first
| place (the current model is pretty good at generating
| metaphysical bullshit, but rarely insightful content). People
| may be surrounded by bot bullshitters and trolls, with very few
| real people among them.
|
| Scams at larger scale. The skill set is essentially customer
| service plus bad intentions. With new models, scammers can do
| their thing at scale and find qualified victims more
| efficiently.
| pixl97 wrote:
| > (the current model is pretty good at generating metaphysical
| bullshit, but rarely insightful content)
|
| Wait, are we talking about bots posting crap, or the average
| political discussion?
| gianlucahmd wrote:
| We already live in a post-GPT-3 world, but one where all its
| power is in the hands of a private company.
|
| The conversation needs to move on to whether making it open and
| democratic is a good idea, but the tech itself is here to stay.
| dkjaudyeqooe wrote:
| I know! One day it's going to get so bad people are going to
| have to deploy critical thinking instead of accepting what they
| read at face value, and suffer the indignity of having to think
| for themselves.
| _underfl0w_ wrote:
| Maybe that'll also be the Year of the Linux desktop I keep
| hearing so much about.
| Lewton wrote:
| Will they also learn to avoid magical thinking like "people en
| masse will suddenly develop abilities out of the blue"?
| Kuinox wrote:
| Sadly, they don't. They start to believe random things and
| reject what they don't like.
| gmueckl wrote:
| Critical thinking won't help you when the majority (or all) of
| your sources are tainted and contradictory. At some point, the
| actual truth just gets swamped.
| inglor_cz wrote:
| This. Robots can spout 1000x more content than humans, if not
| more.
| spion wrote:
| This is already happening, just with humans networked into
| social networks that favor quick resharing over deep review.
| thelastwave wrote:
| "the actual truth"
| hooande wrote:
| 10,000 years of human civilization and this hasn't happened yet,
| huh? Any day now, I'm sure.
| emphatizer2000 wrote:
| Maybe somebody could create an AI that evaluates the factual
| accuracy of articles.
| visarga wrote:
| This is possible, especially with a human in the loop.
| lrossi wrote:
| Or have the AI generate only fact-based, polite and relevant
| comments.
|
| Related xkcd: https://xkcd.com/810/
| realusername wrote:
| I'm going to throw out a wild guess here and say that this
| sudden increase in critical thinking won't happen.
| Erlich_Bachman wrote:
| Photoshop has existed for decades. Is it really that big of a
| problem for photo news?
| technocratius wrote:
| The difference between Photoshop and generative models is not in
| what they can technically achieve, but in the cost of achieving
| the desired result. Fake news photo or text generation is
| possible by humans, but it scales poorly (more humans) compared
| to an algorithmically automated process (some more compute).
| draugadrotten wrote:
| Yes!
|
| https://en.wikipedia.org/wiki/Adnan_Hajj_photographs_controv...
| https://www.bbc.com/news/world-asia-china-55140848
|
| and many more
| Agentlien wrote:
| Wow, that first Adnan Hajj photograph looks absolutely terrible.
| danielscrubs wrote:
| Touch-ups were done before Photoshop, but now they're ALWAYS
| done. The issues this has created in society might have a bigger
| emotional impact than we give them credit for.
|
| Regarding photo news, there have been quite a lot of scandals,
| to the point that I'd guess the touch-ups are more or less
| accepted.
| Pyramus wrote:
| I conducted a workshop in media competency for teenage girls,
| and one of the key learnings was that _every_ image of a female
| subject they encounter in printed media (this was before
| Instagram) has been retouched.
|
| To hammer the point home, I let them retouch a picture by
| themselves to see what is possible even for a completely
| untrained manipulator.
|
| It was eye-opening -- one of the things that should absolutely
| be taught in school but isn't.
| IndySun wrote:
| "one of the things that should absolutely be taught in school
| but isn't."
|
| Namely, critical thinking?
| darkwater wrote:
| I don't think "critical thinking" is the point here, because
| first you need to know that such modifications CAN be done. And
| not everybody knows what can be retouched with PS or similar
| programs. So yeah, if you see some supermodel on a magazine
| cover, and you don't know PS can edit photos easily, it would
| not be that immediate to think "hey, maybe that's not real!".
|
| As an extreme example: would you ever have checked, 20 years
| ago, whether a newspaper text was generated by an AI or by a
| human? Obviously not, because you didn't know of any AI that
| could do that.
| Pyramus wrote:
| Exactly this.
|
| There is a secondary aspect: becoming aware that society has
| agreed on beauty standards (different for different societies)
| and that PS is used as a means to adhere to these standards.
| IndySun wrote:
| I think I made my point badly, because I also agree.
| | I am lamenting that teenagers were, in this day and age, | surprised at what can be done with Photoshop. And that, | let loose on the appropriate software, they were surprised at | what can be altered and how easily. | | My point is that this may be so because people have | not been taught how to think for themselves and accept | things (in this case female images) 'as is', without a | hint of curiosity. It is also a problem, but at the other | end of the stick, with many young people I work with | considering Wikipedia to be 100% full of misinformation | and fake news. | Rastonbury wrote: | The concern is not so much AI-generated news, but malicious | actors misleading, influencing or scamming people online at | scale with realistic conversations. Today we already have | large-scale automated scams via email and robocall. Less | scalable scams like Tinder love-catfish scams or | Russian/Chinese trolls on reddit are now run by real people; | imagine them being automated. If human moderators cannot | distinguish these bots from real humans, that is a scary | thought; imagine not being able to tell if this comment was | written by a human or a robot. | hooande wrote: | why does this matter? the internet is filled with millions | of very low-quality human-generated discussions right now. | There might not be much of a difference between thousands | of humans generating comment spam and thousands of gpt-3 | instances doing the same | mekkkkkk wrote: | It does matter. The nice feeling of being one of many is | a feature of echo chambers. If you can create that | artificially for anything with the push of a button, it's a | powerful tool to edit discourse or radicalize people. | | Have a look at Russia's interference in the previous US | election. This is what they did, but manually. To be able | to scale and automate it is huge. | ccozan wrote: | But careful, the human psyche has some kind of tipping | point. Too much fake news, and it will flip.
Too little, and no | real influence is made. | | The exact balance must be orchestrated by a human. | visarga wrote: | > imagine not being able to tell if this comment was | written by a human or robot | | I think neural nets could help find fake news and | factual mistakes. Then it wouldn't matter who wrote it, if | it is helpful and true. | Yetanfou wrote: | That is like saying "black powder has existed for centuries, | are nuclear weapons really that big a problem?". The | difference between an image editor like Photoshop and an | automated image generation program is the far greater | production capacity, speed, the lower cost and the fact that | anyone with the right equipment can use it, whereas the end | result of an image editor is only as good as the person | using it. | turing_complete wrote: | Don't read news. Go to original sources and scientific papers. | If you really want to understand something, a news website | should only be your starting point to look for keywords. That | is as true today as it will be "post-GPT-3". | wizzwizz4 wrote: | > _Go to original sources and scientific papers._ | | Given how much bunk "science" (and I'm talking things | completely transparent to someone remotely competent in the | field) gets published, especially in psychology, it's | difficult to do even that. | turing_complete wrote: | You are right. You still have to read critically or find | trusted sources, of course. | yread wrote: | Primary sources need to be approached with caution | https://clas.uiowa.edu/history/teaching-and-writing- | center/g... | gmueckl wrote: | And so does every other source. You can play that analysis | game with any source material. The problem is that the | accuracy and detail of the reporting usually fade with | each step towards mass-media content. | savolai wrote: | This scales badly today and will scale even worse in the | future. Those without education or time resources will _at | best_ manage to read the news.
Humanity will need a low-effort | way to relay reliable information to more of its | members. | anticristi wrote: | Besides the time scalability aspect highlighted by someone | else, I am worried that GPT-3 will have the potential to | produce even "fake scientific papers". | | Our trust fabric is already quite fragile in this post-truth era. GPT-3 | might make it even more fragile. | op03 wrote: | I wouldn't worry about the science bit. No one worries | about the university library getting larger and larger or | how it's going to confuse or misguide people, even though | everyone knows there are many books in there, full of | errors, outdated information, badly written, boring etc etc | etc. | | Why? Cause there is always someone on campus who knows | something about a subject to guide you to the right stuff. | cbozeman wrote: | Most people are not smart enough to do this, and even if they | are, they don't have enough time in their day. | hntrader wrote: | That's what people _should_ do, and that's what you and I | will do, but many won't, especially the less educated (no | condescension intended). They'll buy into the increased | proliferation of fake info. It's because of these people that | I think the concerns are valid. | anticristi wrote: | Honestly, I consider myself fairly educated (I have a PhD | in CS), but if the topic at hand is sufficiently far from | my core competence, then reading the scientific article | won't help. I keep reading about p-value hacking, subtle | ways of biasing research, etc., and I realize that, to | validate a scientific article, you have to be a domain | expert and constantly keep up-to-date with today's best | standards. Given the increasing number of domains to be an | expert in, I fail to see how any single human can achieve | that without going insane. :D | | I mean, Pfizer could dump their clinical trial reports on | me, and I would probably be unable to compute their | vaccine's efficacy, let alone find any flaws.
| taneq wrote: | The fake news thing is a real problem (and may become worse | under GPT-3, but certainly exists already). As for the others - | to quote Westworld, "if you can't tell the difference, does it | really matter?" | phreeza wrote: | Genuine question: why is this a problem? Sure, someone may be | able to generate thousands of real-sounding fake news | articles, but it's not like they will also be able to flood | the New York Times with these articles. How do you expect | you will be exposed to these articles? | chimprich wrote: | It's not me I'm worried about - it's the 50% [1] of people | who get their news from social media and "entertainment" | news platforms. These people vote, and can get manipulated | into performing quite extreme acts. | | At the moment a lot of people seem to have trouble engaging | with reality, and that seems to be caused by relatively | small disinformation campaigns and viral rumours. How much | worse could it get when there's a vast number of realistic- | sounding news articles appearing, accompanied by realistic | AI-generated photos and videos? | | And that might not even be the biggest problem. If these | things can be generated automatically and easily, it's | going to be very easy to dismiss real information as fake. | The phenomenon of labelling real news as "fake news" is | going to get bigger. | | It's going to be more work to distinguish what is real from | what is fake. If it's possible to find articles supporting | any position, and to cast suspicion on any contrary news, then | a lot of people are going to find it easier to just believe | what they prefer to believe... even more than they do now. | | [1] made-up number, but doesn't feel far off. | profunctor wrote: | The fact that you made up that number is extremely funny | in this context. | chimprich wrote: | I don't think so - I was aware that it was a made-up | number, and highlighted the fact that it was.
It's the | lack of awareness of what is backed up by data that is | the problem, I think. | | Or am I missing your point? | _underfl0w_ wrote: | Right, it's definitely good that you cited it being fake, | but I think the parent was pointing out the subtle (and | likely unintentional) irony of discussing fake news while | providing _fake_ numbers to support your opinion. | jokethrowaway wrote: | The majority of "fake news" is factual news described | from a partial point of view and with a political spin. | | Even fact checkers are not immune to this, and brand other | news as true or false not based on facts but based on the | political spin they favour. | | Fake news is a vastly overstated problem. Thanks to the | internet, we now have a wider breadth of political news | and opinions, and it's easy to label everything-but-your- | side as fake news. | | There are a few patently false lies on the internet which | are taken as examples of fake news - but they have very | few supporters. | mistermann wrote: | > There are a few patently false lies on the internet | which are taken as examples of fake news - but they have | very few supporters. | | Very true. What's interesting though is how many | supporters there are of the idea that extremely large | quantities of people 100% "hook, line, and sinker" buy | into such fake news stories - based, ironically, on | _fake-news-like_ articles assuring us (with specious | evidence, if any) that this is the true state of reality. | | The world is amazingly paradoxical if you look at it from | the proper abstract angle. | chimprich wrote: | > Even fact checkers are not immune to this and brand | other news as true or false not based on facts but based | on the political spin they favour. | | Could you give an example? | | > There are a few patently false lies on the internet | which are taken as examples of fake news - but they have | very few supporters. | | How many do you consider "few"?
| | I can go to my local news site and read a story about the | novel coronavirus and the _majority_ of comments below | the article are stating objectively false claims. | | "It's just a flu" "Hospitals are empty" "The survival | rate is 99.9%" "Vaccines alter your DNA" | | ...and so on. | | There is the conspiracy theory or cult called QAnon, | which "includes in its belief system that President Trump | is waging a secret war against elite Satan-worshipping | paedophiles in government, business and the media." | | One QAnon Gab group has more than 165,000 users. I don't | think these are small numbers. | perpetualpatzer wrote: | > made-up number, but doesn't feel far off. | | Pew Research says 18% report getting news primarily from | social media (fielded 10/19-6/20)[0]. November 2019 | research said 41% among 18-29 year olds, which was the | peak age group. Older folks largely watch news on TV[1]. | | [0] https://www.journalism.org/2020/07/30/americans-who- | mainly-g... [1] https://www.pewresearch.org/pathways-2020 | /NEWS_MOST/age/us_a... | chimprich wrote: | Thanks for providing data. Evidence is better than making | up numbers. | hnlmorg wrote: | If recent times have told us anything, it's that the | biggest distributor of "news" is social media. And worse | still, people generally have no interest in researching the | items they read. If "fake news" confirms their pre-existing | bias then they will automatically believe it. If real news | disagrees with their biases then it is considered fake. | | So in theory, the rise of deep fakes could lead to more | people getting suckered into conspiracy theories and other | such extreme opinions. We've already seen a small trend | this way with low-resolution images of different people | with vaguely similar physical features being used as | "evidence" of actors in hospitals / shootings / terrorist | scenes / etc. | | That all said, I don't see this as a reason not to pursue | GPT-3.
In that regard, the proverbial genie is already out | of the bottle. What we need to work on is a better | framework for distributing knowledge. | xerxespoy wrote: | Journalists are paid by the word. | qayxc wrote: | > "if you can't tell the difference, does it really matter?" | | It indeed does. The problem is that societies and cultures | are heavily influenced and changed by communication, media, | and art. | | By replacing big portions of these components with artificial | content, generated from previously created content, you run | the risk of creating feedback cycles (e.g. training future | systems on the output of their predecessors) and forming | standards (beauty, aesthetics, morality, etc.) controlled by | the entities that build, train, and filter the output of the | AIs. | | You'll basically run the risk of killing individuality and | diversity in culture and expression; the consequences for society | as a whole and individual behaviour are difficult to predict, | but seeing how much power social media (an unprecedented | phenomenon in human culture) have, there's reason to at the | very least be cautious about this. | visarga wrote: | This problem affects all types of agents - natural or | artificial. An agent acts in the environment; this causes | experience and learning, and thus conditions the future. | The agent has no idea what other opportunities are lost | behind past choices. | notahacker wrote: | Most communications between humans have some physical- | world purpose, and so an algorithm which is trained to | _create the impression_ that a purpose has been fulfilled | whilst not actually having any capabilities beyond text | generation is going to have negative effects except where the | sole purpose of interacting is receiving satisfactory text. | | Reviews that look just like real reviews but are actually a | weighted average of comments on a different product are | negative.
Customer service bots that go beyond FAQs to do a | very convincing impression of a human service rep promising | an investigation into an incident, but can't actually start an | investigation into the incident, are negative. An information | retrieval tool which has no information on a subject, but can | spin a very plausible explanation based on data on a | different subject, is negative. | | Of course, it's entirely possible for humans to bullshit, but | unlike text generation algorithms it isn't our default | response to everything. | skybrian wrote: | If you ask GPT-3 for three Lord of the Rings quotes, it might | give you two real ones and one fake one, because it doesn't | know what truth is and just wants to give you something | plausible. | | There are creative applications for bullshit, but something | that cites its sources (so you can check) and doesn't | hallucinate things would be much more useful. Like a search | engine. | Drakim wrote: | What scares me personally is the idea that I might be | floating in a sea of uncanny-valley content. Content that's | 98% human-like, but then that 2% sticks out like a nail and | snaps me out of it. | | Sure, I might not be able to tell the difference the majority | of the time, but when I can tell the difference it's gonna | bother me a lot. | fakedang wrote: | To me, a lot of content seems to be digital marketing | horseshit tbh. | mistermann wrote: | Do you not already have this feeling on a fairly regular | basis? (Serious question) | normanmatrix wrote: | You will not. Welcome to the scary generative future. | anticristi wrote: | I was hoping for a "yes, we can" attitude here. :D | Agentlien wrote: | Deep fakes still feel quite uncanny-valley to me. Even if they | move beyond that, convincing fake images have existed for a long | while. | | As for support, I don't really see why it matters if I'm | talking to a clever script or an unmotivated human. | falcor84 wrote: | > already struggling to distinguish ...
automated customer | request replies from genuine replies | | I hope it's not only due to a decline in the quality of human | support. If we could have really useful automated support | agents, I for one would applaud that. | anticristi wrote: | I agree. As long as it is transparent that I am speaking to | an automated agent, and I can easily escalate the issue to a | human who can solve my problem when the agent gets stuck. | bencollier49 wrote: | We'll go full circle and you'll be forced to meet people in | person again. | Erlich_Bachman wrote: | It's a shame that it has turned out to be necessary to externally | re-make and re-train a model that has come out of a company called | `OPEN`AI. Wasn't one of the founding principles of it that all of | the research would be available to the public? Isn't that the | premise on which the initial funding was secured? Best of luck to | Eleuther. | mhuffman wrote: | But I was told GPT-3 was too powerful for mere mortal hands | (unless you have an account!) and that it would be used for | hate speech and to bring about Skynet. | | How will this project avoid those terrible outcomes? | visarga wrote: | By putting the cat back in the bag. Oh, it's too late ... | useless to think about it - we can't de-invent an invention | or stop people from replicating it. It's like that time when | the NSA wanted to restrict crypto. | dvfjsdhgfv wrote: | I don't know a single intelligent person who believed this | argument; it simply doesn't hold up. | thelastwave wrote: | Lots of people "believe" that, they just prefer to downvote | anonymously rather than try to defend their position. | ForHackernews wrote: | New research has revealed that intelligence is not a | prerequisite for generating hate speech on social media | platforms. | visarga wrote: | It was probably a bait and switch to hire top researchers and get | initial funding. Now that OpenAI is a household name, they | don't have to pretend anymore.
| b3kart wrote: | I buy the former: researchers might be happier knowing their | work potentially benefits all of humanity, not just a bunch | of investors. But wouldn't it be _more_ difficult to get | funding as a non-profit? | littlestymaar wrote: | It's just never going to be difficult to get funding when | you have Elon Musk and Sam Altman as founders (and even | more so when the founders put one billion of their own money | into it). | b3kart wrote: | Sure, but that's OpenAI's particular set of | circumstances. Generally speaking, I struggle to see | investors preferring a nebulous non-profit over a for- | profit with a clear path to market. | littlestymaar wrote: | Sure, but we're explicitly talking about OpenAI here. | b3kart wrote: | Of course. It's just that the comment I've been | responding to suggested OpenAI going the "open"/non- | profit route was to 1) get top researchers and 2) get | investment. I was arguing that this doesn't seem to | (generally) be a good way to get investment, but I agree | with you that in their case investment just wasn't a | consideration at all. | spiderfarmer wrote: | I don't really care if OpenAI offers commercial licenses as | long as the underlying research is truly open. This way | alternative options will become available eventually. | querez wrote: | Arguably OpenAI is one of the most closed industry AI labs | (among those that are still participating in the research | community), on par only with DeepMind (though DeepMind at | least publishes way more). Funnily enough, FAIR and Google | Brain have a vastly better track record wrt. publishing not | only papers but also code and models. | dave_sullivan wrote: | Really. OpenAI assembled some of the best minds from the deep | learning community. The problem isn't that they are a for- | profit SaaS, the problem is that they lied. | thelastwave wrote: | And ended up making an AI service that's really good at... | lying.
| Sambdala wrote: | Wild-Ass Guess (Ass-Guess) incoming: | | OpenAI was built to influence the eventual _value chain_ of AI | in directions that would give the funding parties more | confidence that their AI bets would pay off. | | This value chain basically being one revolving around AI | substituting predictions and human judgement in a business | process, much like cloud can be (over-simply) modeled as moving | Capex to Opex in IT procurement. | | They saw that, like any primarily B2B sector, the value chain | was necessarily going to be vertically stratified. The output | of the AI value chain is an input to another value chain; | it's not a standalone consumer-facing proposition. | | The point of OpenAI is to invest in/incubate a Microsoft or Intel, | not a Compaq or Sun. | | They wanted to spend a comparatively small amount of money to | get a feel for a likely vision of the long-term AI value chain, | and weaponize selective openness to: 1) establish moats, 2) | encourage commodification of complementary layers which add | value to, or create an ecosystem around, 'their' layer(s), and | 3) get insider insight into who their true substitutes are by | subsidizing companies to use their APIs. | | As AI is a technology that largely provides benefit by | modifying business processes, rather than by improving existing | technology behind the scenes, your blue-ocean strategy will | largely involve replacing substitutes instead of displacing | direct competitors, so points 2 and 3 are most important when | deciding where to funnel the largest slice of the funding pie.
| | _Side Note: Becoming an Apple (end-to-end vertical integration) | is much harder to predict ahead of time, relies on the 'taste' | and curation of key individuals giving them much of the | economic leverage, and is more likely to derail along the way._ | | They went from non-profit to for-profit after they confirmed the | hypothesis that they can create generalizable base models that | others can add business logic and constraints to and generate | "magic" without having to share the underlying model. | | In turn, a future AI SaaS provider can specialize in tuning the | "base+1" model, then selling that value-add service to the | companies who are actually incorporating AI into their business | processes. | | It turned out that a key advantage at the base layer is just brute | force and money, and further outcomes have shown there doesn't | seem to be an inherent ceiling to this; you can just spend more | money to get a model which is unilaterally better than the last | one. | | There is likely much more pricing power here than in cloud. | | In cloud, your substitute (for the category) is buying and | managing commodity hardware. This introduces a large-ish | baseline cost, but can then give you more favorable unit costs | if your compute load is somewhat predictable in the long term. | | More importantly, projects like OpenStack and Kubernetes have | been desperately doing everything to commoditize the base layer | of cloud, largely to minimize switching costs and/or move the | competition over profits up to a higher layer. You also have | category buyers like Facebook, Backblaze, and Netflix investing | heavily into areas aimed at minimizing the economic power of | cloud as a category, so they have leverage to protect their own | margins. | | It's possible the key "layer battle" will be between the | hardware (Nvidia/TPUs) and base model (OpenAI) layers. | | It's very likely hardware will win this for as long as they're | the bottleneck.
If value creation is a direct function of how | much hardware is being utilized for how long, and the value | creation is linear-ish as the amount of total hardware scales, | the hardware layer just needs to let a bidding war happen, and | they'll be capturing much of the economic profit for as long as | that continues to be the case. | | However, the hardware appears (I'm no expert though) to be | something that is easier to design and manufacture; it's mostly | a capacity problem at this point, so over time this likely gets | commoditized (still highly profitable, but with less pricing | power) to a level where the economic leverage goes to the base- | model layer, and then the base layer becomes the oligopsony | buyer, and the high fixed investment the hardware layer made | then becomes a problem. | | The 'Base+1' layer will have a large boom of startups and | incumbent entrants, and much of the attention and excitement in | the press will be equal parts gushing and schadenfreude-mining | about that layer, but they'll be wholly dependent on their | access to base models, which will slowly (and deliberately) look | more and more boring apart from the occasional handwringing | over their monopoly power over our economy and society. | | There will be exceptions to this: companies able to leverage | proprietary data and large enough to build their own | base models in-house based on that data. Those are likely | to be valuable for their internal AI services, preventing an | 'OpenAI' from having as much leverage over them and being much | better matched to their process needs, but they will not be as | generalized as the models coming from the arms race of | companies who see that as their primary competitive advantage. | Facebook and Twitter are two obvious ones in this category, and | they will primarily consume their own models, rather than | expose them as model-as-a-service directly.
| | The biggest question to me is whether there's a feedback loop | here which leads to one clear winning base layer company | (probably the world's most well-funded startup to date due to | the inherent upfront costs and potential long-term income), or | if multiple large, incumbent tech companies see this as an | existential enough question that they more or less keep pace | with each other, and we have a long-term stable oligopoly of | mostly interchangeable base layers, like we do in cloud at the | moment. | | Things get more complex when you look to other large investment | efforts such as in China, but this feels like a plausible | scenario for the SV-focused upcoming AI wars. | visarga wrote: | Apparently you don't need to be a large company to train | GPT-3. EleutherAI is using free GPU from CoreWeave, the | largest North American GPU miner, who agreed to this deal to | get the final model open sourced and have their name on it. | They are also looking at offering it as an API. | Sambdala wrote: | I think it's great they're doing this, but GPT-3 is the | bellwether not the end state. | | Open models will function a lot like Open Source does | today, where there are hobby projects, charitable projects, | and companies making bad strategic decisions (Sun open | sourcing Java), but the bulk of Open AI (open research and | models, not the company) will be funded and released | strategically by large companies trying to maintain market | power. | | I'm thinking of models that will take $100 million to $1 | billion to create, or even more. | | We spend billions on chip fabs because we can project out | long term profitability of a huge upfront investment that | gives you ongoing high-margin capacity. The current | (admittedly early and noisy) data we have about AI models | looks very similar IMO. 
| | The other parallel is that the initial computing revolution | allowed a large-scale shift of business activities from | requiring teams of people doing manual activities, | coordinated by a supervisor, towards having those functions | live inside a spreadsheet, word processor, or email. | | This replaces a team of people with (outdated) | specializations with fewer people accomplishing the same | admin/clerical work by letting the computer do what it's | good at doing. | | I think a similar shift will happen with AI (and other | technologies), where work done by humans in cost centers is | retooled to allow fewer people to do a better job at less | cost. Think compliance, customer support, business | intelligence, HR, etc. | | If that ends up being the case, donating a few million | dollars' worth of GPU time doesn't change the larger trends, | and likely ends up being useful cover as to why we | shouldn't be worried about what the large companies are up | to in AI, because we have access to crowdsourced and donated | models. | jariel wrote: | This is neat, but almost no startups of any kind, even mid- | size corps, have such complicated and intricate plans. | | More likely: OpenAI was a legit premise, they started to run | out of money, MS wanted to license and it wasn't going to | work otherwise, so they just took the temperature with their | initial sponsors and staff and went commercial. | | And that's it. | ccostes wrote: | I think calling this a "wild-ass guess" undersells it a bit | (either that or we have very different definitions of a | WAG). Very well thought-through and compelling case. | | My biggest question is whether composable models are indeed | the general case, which you say they confirmed as evidenced | by the shift away from non-profit. It's certainly true for | some domains, but I wonder if it's universal enough to enable | the ecosystem you describe. | wraptile wrote: | OpenAI is turning out to be a total bait and switch.
Especially | true when your co-founder is actively calling you out on it[1] | | Remember kids: if it's not a non-profit organization it is a | _for_ profit one! It was silly to expect anything else: | | > In 2019, OpenAI transitioned from non-profit to for-profit. | The company distributed equity to its employees and partnered | with Microsoft Corporation, who announced an investment package | of US$1 billion into the company. OpenAI then announced its | intention to commercially license its technologies, with | Microsoft as its preferred partner [2] | | 1 - https://edition.cnn.com/2020/09/27/tech/elon-musk-tesla- | bill... | | 2 - https://en.wikipedia.org/wiki/OpenAI | person_of_color wrote: | So OpenAI employees get Microsoft RSUs? | unixhero wrote: | What is an RSU? | ourcat wrote: | Restricted Stock Units | agravier wrote: | It means restricted stock unit, and it's a kind of | company stock unit that may be distributed to some | "valued" employees. There is usually a vesting schedule, | and you can't do whatever you want with it. | garmaine wrote: | Why would they? It's a separate company. | dvfjsdhgfv wrote: | It will be interesting to see the attitude of Microsoft | towards this project in the light of their "Microsoft loves | open source" propaganda. | eeZah7Ux wrote: | Like many other companies, Microsoft loves unpaid labor. | | Free Software is about giving freedom and security all the | way to the end users - rather than SaaS providers. | | If you remove this goal and only focus on open source as a | development methodology you end up with something very | similar to volunteering for free for some large | corporation. | Closi wrote: | I don't know where people got this idea that Microsoft | can't participate positively in Open Source, and do that | sincerely, without open sourcing absolutely everything. 
| | Of course they can - just because you contribute to open | source, and do that because you also benefit from open | source projects, doesn't mean you have to do absolutely | everything under open source. | | Especially considering OpenAI isn't even Microsoft's IP or | codebase. | taf2 wrote: | How about when Steve Ballmer said something along the | lines of | | "Linux is a cancer that attaches itself in an | intellectual property sense to everything it touches" | | Pretty sure that is hostile towards open source? Linux | being one of the flagship projects of open source. | | [edit] source https://www.zdnet.com/article/ex-windows- | chief-heres-why-mic... | Closi wrote: | It's hostile to the GPL licence, which means anything | licensed under the GPL can't be used in Microsoft's | proprietary products. | | I would personally say Microsoft wasn't necessarily | driven by anti-open-source hate; they were | just very anti-competitor. Microsoft tried to compete | with their biggest competitor? Colour me shocked. | Daho0n wrote: | I don't think this should be seen in the light of "open | source everything", but more that many see Microsoft doing | open source not as part of "being good" but as part of their | age-old "embrace, extend, extinguish" policy. | dvfjsdhgfv wrote: | > I don't know where people got this idea that Microsoft | can't participate positively in Open Source, and do that | sincerely, without open sourcing absolutely everything. | | I'm not claiming that. Of course there is a place for | closed and open elements of their offerings. Let me | clarify. | | In the past, Microsoft was very aggressive towards open | source. When they realized this strategy of FUD brought | little result, they changed their attitude 180 degrees and | decided to embrace it, putting literal hearts everywhere.
What I find strange is that | people fell for it. | Closi wrote: | But why on this thread then, about GPT-3? It's not even | their own company, IP or source to give away. | | But even when Microsoft _can't_ open source it because | it's _not theirs_, we _still_ have people posting in | this thread that this is further evidence that Microsoft | is hypocritical. It sounds a lot like a form of | confirmation bias to me, where any evidence is used as | proof that Microsoft is 'anti-open-source'. | taf2 wrote: | I think it is because each model from OpenAI was public | until Microsoft became an investor. | [deleted] | pessimizer wrote: | I don't know where people got the idea that companies can | be "sincere." Sincerity is faithfully communicating your | mental state to others. A company's mental state can | change on a dime based on the decision-making of people | who rely on the company for the degree of profit it | generates. Any analog to sincerity that you think you see | can probably be eliminated by firing one person after an | exceptionally bad quarter (or an exceptionally good one). | Closi wrote: | Sincere to me just means that you are being truthful, or | not trying to be deceptive. | | And I think companies can be sincere - because companies | are really just groups of people and assets when you get | down to the nuts and bolts of it. | eeZah7Ux wrote: | > companies can be sincere | | "Sincere", "honest", "hypocritical" usually refer to a | long-term pattern. Being able to be sincere from time to | time is beside the point. | | > companies are really just groups of people | | ...with profit as their first priority. | | For-profit companies "can be sincere" only as long as | it's the most profitable strategy. | JacobiX wrote: | It's a recurring theme in OpenAI research: they become more and | more closed. For instance, their latest model, DALL·E, hit | the headlines before the release of the paper.
Needless to say, | the model is not available and no online demo has been | published so far. | cbozeman wrote: | Because it's winner-take-all in this research, not "winner-take-some". | | Andrew Yang talked about this and why breaking up Big Tech | won't work. No one wants to use the second-best search | engine. The second-best search engine is Bing, and I almost | never go there. | | Tech isn't like automobiles, where you might prefer a Honda | over a Toyota, but ultimately they're interchangeable. A | Camry isn't dramatically different and doesn't perform | dramatically better than an Accord. Whoever builds the best | AI "wins" and wins totally. | visarga wrote: | But they still released the CLIP model, which is the | complement of DALL-E and is used in the DALL-E pipeline as a | final filter. There are Colabs with CLIP floating around and | even a web demo. | JacobiX wrote: | Thank you for this info. As you mentioned, CLIP is used for | re-ranking DALL-E outputs; by itself it is just an (image, | text)-pair classification network. | Tenoke wrote: | The research is open to the public. Here's the GPT-3 paper: | https://arxiv.org/abs/2005.14165 | | Also, the GPT-2 models and code at least were publicly released, | as has a lot of their other work. | | And yes, they realized they can achieve more by turning | for-profit and partnering with Microsoft. So true, they are not | fully 'open', but pretending they don't release things to the | public and making the constant 'more like closedai aimirite' | comments is getting old.
| But that doesn't mean that safeguards aren't _absolutely | critical_ if you want it to be a net win for society. | | These negative impacts are not theoretical. They are obvious and | already a problem for anyone who works in the right parts of the | security and disinformation world. | | We've been through all this before... | https://aviv.medium.com/the-path-to-deepfake-harm-da4effb541... | | Of course, some of the same people who ignored recommendations[1] | for harm mitigations in visual deepfake synthesis tools (which | ended up being used for espionage and botnets) seem to be working | on this. | | [1] e.g. | https://www.technologyreview.com/2019/12/12/131605/ethical-d... | mrfusion wrote: | It still baffles me that GPT turned out to be more than a | glorified Markov chain text generator. It seems we've actually | made it create a model of the world to some degree. | | And we kind of just stumbled on the design by throwing massive | data and neural networks together? | nullc wrote: | You're made of _meat_ and yet you manage to be more than a | glorified Markov chain generator. :) | | (I hope) | Filligree wrote: | It turns out that brute force works, and the scaling curve is | _still_ not bending. | | I doubt we'll ever see a GPT-4, because there are known | improvements they could make besides just upsizing it further, | but that's beside the point. If that curve doesn't bend soon, | then a 10x larger network would be human-level in many ways. | | (Well, that is to say: it's actually bending. Upwards.) | hntrader wrote: | What % of all digitized and reasonably easy-to-access text | data did they use to train GPT-3? I'm wondering whether the | current limits on GPT-n are computation or data. | kortex wrote: | > As per the creators, the OpenAI GPT-3 model has been | trained about 45 TB text data from multiple sources which | include Wikipedia and books. | | It's about 400B tokens.
The Library of Congress is about 40M | books; let's say 50K tokens per book, or about 2T tokens. | Not necessarily unique. | | I would say it's plausible that it was a decent percent of | the indexed text available, and even more of the unique | content. GPT-2 was 10B tokens. Do we have 20T tokens | available for GPT-4? Maybe. But the low-hanging fruit are | definitely plucked. | mrfusion wrote: | So fascinating. I'd love to understand why it's working so | well. I guess no one knows. | | Wouldn't GPT-4 just be more data and more parameters? | nemoniac wrote: | Good initiative, but tell us more about the governance. After all, | OpenAI was "open" until it was bought by Microsoft. | wizzwizz4 wrote: | No, it wasn't. And iirc, only GPT-3 was. | joshlk wrote: | How does the outfit intend to fund the project? OpenAI spends | millions on computing resources to train the models. | jne2356 wrote: | The cloud company CoreWeave has agreed to provide the GPU | resources necessary. | stellaathena wrote: | Hey! One of the lead devs here. A cloud computing company | called CoreWeave is giving us the compute for free in exchange | for us releasing it. We're currently at the ~10B scale and are | working on understanding datacenter-scale parallelized training | better, but we expect to train the model on 300-500 V100s for | 4-6 months. | tmalsburg2 wrote: | I imagine recreating the model will be computationally cheaper | because they will not have to sift through the same huge | hyperparameter space as the initial GPT-3 team had to. | thelastwave wrote: | Why is that? | jne2356 wrote: | This is not true. The OpenAI team only trained one full-sized | GPT-3, and conducted their hyperparameter sweep on | significantly smaller models (see: | https://arxiv.org/abs/2001.08361). The compute savings from | not having to do the hyperparameter sweep are negligible and | do not significantly change the feasibility of the project.
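The back-of-envelope token math in the thread above can be checked directly. A minimal sketch, assuming the commenters' figures (40M books in the Library of Congress, ~50K tokens per book, ~400B training tokens for GPT-3, ~10B for GPT-2) rather than official numbers:

```python
# Sanity-check the token estimates from the thread (all inputs are the
# commenters' rough assumptions, not authoritative figures).

loc_books = 40_000_000          # ~40M books in the Library of Congress
tokens_per_book = 50_000        # assumed average tokens per book
loc_tokens = loc_books * tokens_per_book   # -> 2e12, i.e. ~2T tokens

gpt3_tokens = 400e9             # ~400B training tokens (GPT-3)
gpt2_tokens = 10e9              # ~10B training tokens (GPT-2, per the comment)

print(f"LoC estimate:   {loc_tokens / 1e12:.1f}T tokens")      # 2.0T
print(f"GPT-3 used:     ~{gpt3_tokens / loc_tokens:.0%} of that")  # ~20%
print(f"GPT-3 vs GPT-2: {gpt3_tokens / gpt2_tokens:.0f}x more data")  # 40x
```

On these assumptions GPT-3's training set is on the order of a fifth of a Library-of-Congress-worth of text, which supports the comment's point that the low-hanging fruit is largely plucked.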
| 2Gkashmiri wrote: | So how much money would it take to rebuild this FOSS alternative? | And distributed compute like SETI@home? If it can be done, and I | hope it can, what benefit would the original proprietary one | have over this? Licensing? | astrange wrote: | OpenAI will execute the original one for you. If you can get an | account, anyway. | jne2356 wrote: | EleutherAI has already secured the resources necessary. | | They get the seti@home suggestion a lot. There's a section in | their FAQ that explains why it's infeasible. | | https://github.com/EleutherAI/info | pjfin123 wrote: | What does the future of open-source large neural nets look like? | My understanding is GPT-3 takes ~600GB of GPU memory to run | inference. Does an open source model just allow you a choice of a | handful of cloud providers instead of one? | aabhay wrote: | Open source doesn't mean that everyone will be rolling their | own. It means that lots of players will start to offer | endpoints with GPT-X, perhaps bundled with other services. It | is good for the market. | mirekrusin wrote: | I'd gladly contribute (power and) a few of the idle GTX cards I have | to a public peer/volunteer/seti@home-like project if result | snapshot(s) are available publicly/to registered, active | contributors. | Voloskaya wrote: | SETI@home-style distributed computation is not suitable for | training something like GPT-3: unlike SETI, the unit of | work a node can do before needing to share its output with the | next node is really small, so very fast interconnect between | the nodes is needed (InfiniBand and NVLink are used in clusters | to train it). It would probably take a decade to train such a | model over the regular internet. | mirekrusin wrote: | Are there any models/research optimised for working on these | kinds of small, distributed batches that would fit in e.g. ~10GB
| mitjam wrote: | Maybe a case for a community colocation cloud where I a | consumer can buy a system and colocate it in a large data | center with great internal networking. Edit: typo | leogao wrote: | Handling heterogenous (and potentially untrustworthy) | systems also adds overhead, not to mention that buying | hardware in bulk is cheaper, so it makes the most sense | just to raise the money and buy the hardware. | mirekrusin wrote: | The problem is potentially solvable as generating | solutions takes a lot of GPU time and verifying it is | very fast. Aquiring input data may be a problem, but | should be possible with dedicated models for this type of | computation. | dmingod666 wrote: | With Open-AI being corporate controlled and not really 'Open'. Is | Neo a nod at 'The Matrix'? | [deleted] | habitue wrote: | Is it standard to prune these kinds of large language models once | they've been trained to speed them up? | dvfjsdhgfv wrote: | If they succeed, Eleuther should change their name to | ReallyOpenAI. | stellaathena wrote: | Or for extra irony, ClosedAI | techlatest_net wrote: | Is there any real justification behind this fear of close nature | of OpenAI or this is just frustration coming out? We had this | debate of closed Vs open source 20 years back and eventually | opensource won it because of various reasons. Won't those same | reasons apply to this situation of close nature of OpenAI? If so | then why are people worried about this? What is differnt this | time? | pmontra wrote: | The cost. | | Closed source and open source developers use the same | $300-3,000 laptops / desktops. Everybody can afford them. | | Training a large model in a reasonable time costs much more. | According to https://lambdalabs.com/blog/demystifying-gpt-3/ | the cost of training GPT-3 was $4.6 million. Multiply it by the | number of trial and errors. 
| | Of course we can't expect that something that costs tens or | hundreds of millions will be given away for free, or that it can | be rebuilt without some collective training effort that | distributes the cost across at least thousands of volunteers. | qayxc wrote: | This. Plus the increasing amount of opaque results. | Training data is private, so it's impossible to even try to | recreate results, validate methods, or find biases/failure | cases. | jne2356 wrote: | OpenAI only trained the full-sized GPT-3 once. The hyperparameter | sweep was conducted on significantly smaller models (see: | https://arxiv.org/abs/2001.08361) | ttctciyf wrote: | I love the name's play on the Greek _Eleutheria_ (ἐλευθερία) - | freedom, liberty! | Havoc wrote: | Would be good if this could be decentralized BitTorrent/BOINC style | somehow. | | Wouldn't mind contributing some horsepower. | jne2356 wrote: | They get this suggestion a lot. There's a section in their FAQ | that explains why it's infeasible. | | https://github.com/EleutherAI/info | onenightnine wrote: | This is beautiful. Why not? Maybe we can make something | eventually better than the now closed-source version. | Mizza wrote: | Serious question: is there a warez scene for trained models yet? | | (I don't know how the model is accessed - are users of mainline | GPT-3 given a .pb and a stack of NDAs, or do they have to access | it through an access-controlled API?) | | Wherever data is desired by many but held by a few, a pirate crew | inevitably emerges. | jokowueu wrote: | I think this might also be of interest to you: | | https://the-eye.eu/public/AI/pile_preliminary_components/ | MasterScrat wrote: | Those are datasets though, not models. | notretarded wrote: | Not really | Voloskaya wrote: | The checkpoint is not shared with customers; you only get access to | an API endpoint. | vessenes wrote: | GPT-3 users are given an API link which routes to Azure, full | blackbox.
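The ~600GB inference figure mentioned earlier in the thread can be roughly sanity-checked from the parameter count alone. A minimal sketch, where the bytes-per-parameter values are assumptions about numeric precision, not measured figures:

```python
# Rough weight-storage estimate for a 175B-parameter model.
# Bytes per parameter depend on precision (assumed values below).

params = 175e9                       # GPT-3's reported parameter count

fp16_gb = params * 2 / 1e9           # half precision:   2 bytes/param
fp32_gb = params * 4 / 1e9           # single precision: 4 bytes/param

print(f"fp16 weights: ~{fp16_gb:.0f} GB")   # ~350 GB
print(f"fp32 weights: ~{fp32_gb:.0f} GB")   # ~700 GB
```

Weights alone land in the ~350-700 GB range depending on precision; activations and framework overhead add more, which is consistent with both the ~600GB inference estimate and the suggestion that a 1024 GB RAM machine could at least hold the model.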
| exhilaration wrote: | It's via API: https://openai.com/blog/openai-api/ | kordlessagain wrote: | The model is huge and is currently run in the cloud on many | machines. | mortehu wrote: | It's only 175 billion parameters, so presumably it can fit on | a single computer with 1024 GB RAM. | Voloskaya wrote: | On CPU the latency would be absolutely prohibitive, to the | point of being useless. | typon wrote: | For training yes, but not for inference. | leogao wrote: | The inference latency would also be prohibitive. | kordlessagain wrote: | From 2019: https://heartbeat.fritz.ai/deep-learning-has-a-size-problem-... | | > Earlier this year, researchers at NVIDIA announced | MegatronLM, a massive transformer model with 8.3 billion | parameters (24 times larger than BERT) | | > The parameters alone weigh in at just over 33 GB on | disk. Training the final model took 512 V100 GPUs running | continuously for 9.2 days. | | Running this model on a "regular" machine at some useful | rate is probably not possible at this time. | Voloskaya wrote: | Inference on GPU is already very slow on the full-scale | non-distilled model (in the 1-2 sec range iirc); on CPU | it would be an order of magnitude more. | stingraycharles wrote: | Wouldn't you need this model to be in GPU RAM instead of | regular RAM, though? ___________________________________________________________________ (page generated 2021-01-18 23:00 UTC)