[HN Gopher] OpenAI Codex
___________________________________________________________________
OpenAI Codex
Author : e0m
Score  : 237 points
Date   : 2021-08-10 17:33 UTC (5 hours ago)
(HTM) web link (openai.com)
(TXT) w3m dump (openai.com)
| throwaway128327 wrote:
| I don't understand what is going on. Why are people even spending
| time on this? I think this, Copilot, etc. are solving the non-
| problem of "we will remove the boring part of programming" by
| generating a bunch of code, so now it's even more boring to read
| it and check if it actually does what you want.
|
| At the same time, zero of the developers I interviewed know how a
| linked list is laid out in memory, or what the pros/cons of
| contiguous memory layouts are, or even how a CPU actually works.
|
| Maybe those things are not needed anymore, but I see their
| code... I think it would be better if they knew them.
| parksy wrote:
| This is just nascent technology leading toward something like
| this:
|
| "Computer, I want to play a game."
|
| "Okay, what will the game be?"
|
| "I want to be a starship captain, give me a cool space ship I
| can explore the galaxy with"
|
| "Okay... like this?"
|
| "Not quite, make the galaxy more realistic, with real stars and
| planets. Also make it 3D. I want to be the captain inside the
| ship."
|
| "How about now?"
|
| "Cool, and there should be space stations I can visit near
| planets, and I can fly my ship to stars with hyperspace. Make
| it so I have to trade for fuel at the space stations, maybe I
| need to mine asteroids or search derelict space ships for
| treasure. I want to play with my friends too, they can have
| their own ships or walk around my ship."
|
| "Done, was there anything else?"
|
| "Yes, add different alien races to some of the star systems,
| and make some of them have alliances. I want to talk to the
| aliens about their history and culture. Sometimes aliens are
| unfriendly and we'll have space battles if talking doesn't
| work. Make it so I can command a fleet and call for
| reinforcements."
|
| "Processing... Done. Anything else?"
|
| "Actually this is boring, can we start over?"
|
| "Game erased. Please provide new prompt."
| vimy wrote:
| Also known as the holodeck from Star Trek.
| throwaway128327 wrote:
| Oh! this will be so cool! Do you really think it could lead
| in that direction? To me it seems more like a metaphysical
| cargo cult. I think I am too pessimistic; I should shake it
| off. Nothing good comes out of being pessimistic (by
| definition).
|
| Thanks for the inspiration!
| parksy wrote:
| > do you really think it could lead in that direction?
|
| If you asked me 20 years ago, or even 10, I'd have said it
| was total science fiction. I wouldn't have been able to
| imagine how to do it. If you asked me 5 years ago, I'd have
| vaguely said something about AI, half jokingly. At the time
| I thought perhaps the models could be trained so we can do
| test-only development and let AI trained on formal test
| cases generate endless code until all tests pass (something
| like the loop sketched below), but I didn't really imagine
| it would be possible to get a computer to take freeform
| written English (even in a tightly controlled manner) and
| produce functioning code.
|
| Over the past couple of years I have seen increasingly
| fluent demonstrations and tried a few myself, and I have
| fallen off the fence. I think that with the pace at which
| machine learning and AI-assisted programming keep advancing,
| this outcome is all but inevitable, as far-fetched as it
| seems.
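|
| That test-only loop is easy to picture, by the way. A minimal
| sketch, where generate_candidates() is a hypothetical stand-in
| for the model call and `tests` is a list of (input, expected)
| pairs:
|
|       def synthesize(prompt, tests, max_tries=100):
|           # generate_candidates() is a placeholder for the
|           # model call; it yields candidate source strings
|           for source in generate_candidates(prompt, n=max_tries):
|               namespace = {}
|               try:
|                   # define the candidate in a scratch namespace
|                   exec(source, namespace)
|                   solve = namespace["solve"]  # assumed entry point
|                   # keep the first candidate that passes every test
|                   if all(solve(arg) == want for arg, want in tests):
|                       return source
|               except Exception:
|                   continue  # crashing candidates are discarded
|           return None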
|
| I was messing with the OpenAI sandbox over the weekend and
| it helped me generate several game design concepts from
| prompts similar to my post above that I could see myself
| being interested in building and playing. It's not
| difficult to imagine, down the line with a few more
| advancements in this tech, that the generated design could
| then instruct the code generator, fetch the assets, and
| stage the environment for a player or user to enter without
| ever touching a line of code.
|
| I'm not close enough to the research itself to know which of
| those problems are hard and which are easy, so I don't know
| if we'll see the first totally AI-generated "proto-
| holodeck" tech demo in the next 5 years, or the next 20
| years, but I can't see it being more than 50 years away,
| and something tells me with the pace of things it will be
| much sooner than that, assuming we're all still around at
| the time to enjoy it.
| throwaway128327 wrote:
| I wonder what it will make when you ask it to make a good
| bot AI for a game.
|
| "make a game with a formidable opponent that plays well
| enough to win with 51% probability"
|
| and of course the inevitable "make a better version of
| yourself"
| parksy wrote:
| From what I've seen the technology can fuse together a
| remarkable range of outputs, but all of them are
| essentially fused together from within the training set.
| If there were enough examples of AI opponents, it
| conceivably could do it, since most game AIs are some form
| of state machine combined with a degree of statistical
| analysis and pathfinding (for mobile AI actors). It would
| "just" be replicating existing patterns.
|
| As I understand it, it would take a dramatic leap from
| this kind of interpolation to being able to extrapolate
| and "self-improve". So far I haven't seen anything that
| convinces me we're close to this, but again I'm not close
| to the wheel on the research side of things.
| woah wrote:
| You're interviewing programmers for a job in operating systems
| programming?
| throwaway128327 wrote:
| Just full-stack devs, React Native + Go. Is it too wrong to
| think they are the same? Programming is programming; most
| computers work in a similar way, no?
|
| But they also don't know how garbage collection works in
| their language, or how to work with 1 million things in an
| efficient manner. Or why the app pauses for 100 ms because
| someone does a sort while parsing dates within the sort.
|
| For example, I have seen people that can't imagine what the
| cost of a leaked database transaction is, just back-of-the-
| napkin wise. Like, you would think: well, how many changes
| happened in between, how much do we have to unwind when the
| session disconnects, when will it even disconnect because of
| the connection pool... etc. etc. Because the SQL server is
| this magic RDS thing. As if AWS will solve everything with
| its pixie dust.
| reducesuffering wrote:
| Think bigger. Say I'm starting a startup:
|
| 1. "Set up Django, Nginx, and Postgres deployed on a Digital
| Ocean Ubuntu droplet." Done.
|
| 2. "Make a shopping page like $URL." Done.
|
| 3. "Fill it with data from X and connect with Stripe." Done.
|
| 4. ???
|
| 5. Profit
|
| Seems like even a great dev will take 20x the time to do that
| if the model is able to correctly generate this, even with an
| error, customization, or two.
| throwaway128327 wrote:
| but does it really matter, if 20x is 1 week instead of 2
| hours?
|
| are startups really that shallow?
| motoxpro wrote:
| 1/20th of the time?
| That's kind of a big deal.
| qayxc wrote:
| That depends: https://xkcd.com/1205/
|
| A one-time setup is perfectly OK to take a few days,
| especially if afterwards you have a documented process
| that allows you to modify and improve the result.
| throwaway128327 wrote:
| I think the 1/20th-of-the-time mentioned was only at the
| start; I don't think you will gain a lot after that, as the
| spaghetti AI will come to collect.
|
| You have a debt to pay. -- Davy Jones
| dimal wrote:
| If you don't have someone that understands the generated
| code, you'll be kinda screwed. Most of my work isn't writing
| a function to do X. It's reading and understanding all the
| surrounding code and architecture and then knowing that I
| need a function to do X. Writing the actual function isn't
| usually much of a challenge. I get the feeling that this tool
| will just encourage write-only code that ultimately no one
| understands. Will all of the generated code follow a
| consistent style? Will it know to use the framework you
| built, or will it just reinvent everything it needs for each
| problem you give it? I already see tons of code that people
| copy and paste without really understanding it, and a lot of
| the time they're just adding complexity by solving
| non-problems. This just automates that process. I can see it
| being useful in certain narrow cases, but the potential for
| misuse is huge.
| holler wrote:
| at the point where 1/2/3 are possible, what value does the
| startup have when anyone else can ask it to do the same
| thing?
| tux3 wrote:
| Do your competitors have access to this tool that gets you
| started 20x faster? If so, you want the tool.
|
| Your copycat startup may not have incredible value, but
| selling shovels always pays.
| tome wrote:
| Why would you mention "Django", "Nginx", "Postgres", "Digital
| Ocean", "Ubuntu" or "Stripe"? Surely those are implementation
| details that the user wouldn't care about.
| [deleted]
| nradov wrote:
| It seems like they're going in totally the wrong direction. If
| program content is predictable based on patterns (low entropy),
| then that's a sign that our programming languages are too low-
| level. If we want to improve developer productivity then the
| solution is the same as it always has been: create higher-level
| languages which abstract away all the repetitive patterns.
| mxwsn wrote:
| Tools are relatively low-level compared to any single use
| case or field because they should universally support all
| use cases or fields. The more narrow your field or use case
| is, the fewer resources there are to create a higher-level
| language that abstracts away the details that aren't
| important for your area but are important to other areas. In
| this manner, Codex has enormous potential.
| temp8964 wrote:
| Can this read existing code and fix one missing piece? That
| would be cool.
|
| Say I have a question I can't solve by searching through
| Stack Overflow. If the AI can solve a problem like that, it
| will be great.
| priyanmuthu wrote:
| Program synthesis can do some rudimentary fixes. But I would
| love to explore this problem of program correction using AI.
| maxwells-daemon wrote:
| The "language models don't really understand anything" corner is
| getting smaller and smaller. In the last few months we've seen
| pretty definitive evidence that transformers can recombine
| concepts ([1], [2]) and do simple logical inference using
| contextual information ([3], "make the score font color
| visible").
| I see no reason that this technology couldn't smoothly
| scale into human-level intelligence, yet lots of people seem to
| think it'll require a step change or is impossible.
|
| That being said, robust systematic generalization is still a hard
| problem. But "achieve symbol grounding through tons of multimodal
| data" is looking more and more like the answer.
|
| [1] https://openai.com/blog/dall-e/
| [2] https://distill.pub/2021/multimodal-neurons/
| [3] https://openai.com/blog/openai-codex/
| fpgaminer wrote:
| > "language models don't really understand anything"
|
| I have a sneaking suspicion that, if blinded, the crowd of
| people saying variations of that quote would also identify the
| vast majority of human speech as regurgitated ideas as well.
|
| > I see no reason that this technology couldn't smoothly scale
| into human-level intelligence
|
| Yup, the OpenAI scaling paper makes this abundantly clear.
| There is currently no end in sight for the size that we can
| scale GPT to. We can literally just throw compute at the
| problem and GPT will get smarter. That's never been seen before
| in ML. Last time I ran the calculations I estimated that,
| everything else being equal, we'd reach GPT-human in 20 years
| (GPT with a similar parameter scale to a human brain). That's
| everything else being equal. It is more than likely that in the
| next twenty years innovation will make GPT, and the platforms
| we use to train and run models like it, more efficient.
|
| And the truly terrifying thing is that, to me, GPT-3 has about
| the intelligence of a bug. Yet it's a bug whose whole existence
| is human language. It doesn't have to dedicate brain power to
| spatial awareness, navigation, its body, handling sensory
| input, etc. GPT-human will be an intelligence with the size of
| a human brain, but whose sole purpose is understanding human
| language. And it's been to every library to read every book
| ever written. In every language. Whatever failings GPT may have
| at that point, it will be more than capable of compensating for
| in sheer parameter count, and leaning on the ability to combine
| ideas across the _entire_ human corpus.
|
| All available through an API.
| maxwells-daemon wrote:
| As an add-on to this: I'd encourage anyone interested in this
| debate to read Rich Sutton's "The Bitter Lesson"
| (http://www.incompleteideas.net/IncIdeas/BitterLesson.html).
|
| At every point in time, the best systems we can build today
| will be ones leveraging lots of domain-specific information.
| But the systems that will continue to be useful in five years
| will always be the ones that scale freely with increased
| parallel compute and data, which grow much faster than domain-
| specific knowledge. Learning systems with the ability to use
| context to develop domain-specific knowledge "on their own" are
| the only way to ride the wave of this computational bounty.
| pchiusano wrote:
| https://rodneybrooks.com/a-better-lesson/ is an interesting
| retort to the Sutton post.
| Voloskaya wrote:
| The definition of "understanding" behaves just like the
| definition of "intelligence": the threshold to qualify gets
| pushed back by as much as the technology progresses, so that
| nothing we create is ever intelligent and nothing ever
| understands.
| karmasimida wrote:
| > The "language models don't really understand anything"
|
| This is still true. By all accounts, a human doesn't need to
| read 159GB of Python code to write Python; indeed, we simply
| can't.
|
| But it doesn't necessarily indicate language models aren't
| useful.
| hackinthebochs wrote:
| Considering the sum total of data and computation that goes
| into creating an intelligent human mind, including the
| forces of natural selection in creating our innate structure
| and dispositions, it's not obvious that any conclusions can
| be drawn from the fact that so much data and compute go
| into training these models.
| nightski wrote:
| Has this transfer of knowledge from one domain to another
| really been demonstrated by these models/learning
| processes? I know transfer learning is a thing (I have a
| couple of books on my shelf on it). But it seems far from
| what you are describing.
| talor_a wrote:
| they mention in the demo video that the inspiration for
| Codex came from GPT-3 users training it to respond to
| queries with code samples. I saw some pretty impressive
| demos of the original model creating SQL queries from
| plain questions. I'm not sure if that counts as switching
| domains, but it's something?
| visarga wrote:
| DALL-E + CLIP models show a deep understanding of the
| relation between images and text.
| sbierwagen wrote:
| The AlphaZero algorithm swapped between board games
| pretty easily. OpenAI could also have been gesturing at
| this when they named the GPT paper "Language Models are
| Few-Shot Learners".
| maxwells-daemon wrote:
| I would argue humans ingest a lot more than 159GB before they
| can write code. Most of it isn't Python, and humans currently
| transfer knowledge a lot more efficiently than NNs, but I
| suspect that'll change as incorporating more varied data
| sources becomes feasible.
| bufferoverflow wrote:
| It probably can scale, but we're nowhere near the computational
| power we need to even recreate the brain. And don't forget, our
| brain took a billion years to evolve.
|
| A typical brain has 80-90 billion neurons and 125 trillion
| synapses. That's a big freaking network to train.
|
| Hopefully we can figure out how to train parts of it and then
| assemble something very smart.
| jacquesm wrote:
| Takes on average 2.5 decades to train it.
| mattkrause wrote:
| That's just from the most recent checkpoint :-)
|
| If you were to build it "from scratch" you'd also need to
| include the millions of years of (distributed) evolution
| required to get that particular kid to that point.
|
| Tony Zador has some interesting thoughts about that,
| including "A critique of pure learning", here:
| https://www.nature.com/articles/s41467-019-11786-6
| jdonaldson wrote:
| I think intelligence, defined as "mapping inputs into goal
| states", is pretty well handled by models, and the models may
| be able to pick and choose states that are sufficient for
| achieving the goals.
|
| However, the intelligence that's created by language models is
| very schizophrenic, and the human-level reflective intelligence
| that it displays is at best a bit of a Frankenstein's monster
| (an agglomeration of utterances from other people that it uses
| to form sentences that form opinions of itself or its world).
|
| I think that modeling will help us learn more about human
| intelligence, but we're going to have to do a lot better than
| just training models blindly on huge amounts of text.
| visarga wrote:
| Maybe we're also >50% Frankenstein monsters, an agglomeration
| of utterances from other people.
| 6gvONxR4sf7o wrote:
| > The "language models don't really understand anything" corner
| is getting smaller and smaller.
|
| In my mind, understanding a thing means you can justify an
| answer. Like a student showing their work and being able to
| defend it. An answer with a proof understands the answer with
| respect to the proof it provides. E.g. to understand an answer
| with regard to first-order logic, it'll have to be able to
| defend a logical deduction of that answer.
|
| These models still can't justify their answers very well, so
| I'd say they're accurate but only understand with respect to a
| fairly dumb proof system (e.g. they can select relevant
| passages or just appeal to overall accuracy statistics).
| They're still far from being able to justify answers in the
| various ways we do, which I'd say means that, by definition,
| they still don't understand with regard to the "proof systems"
| that we understand things with regard to.
|
| Maybe the next step will require increasingly interesting
| justification systems.
| beering wrote:
| > In my mind, understanding a thing means you can justify an
| answer.
|
| What if the language model can generate a step-by-step
| explanation in the form of text? [0]
|
| There's no guarantee that the reasoning was used to come up
| with the answer in the first place, and no proof that the
| reasoning isn't just the product of "a really fancy Markov
| chain generator", but would you accept it?
|
| We're really walking into Searle's Chinese Room at this
| point.
|
| [0] https://nitter.hu/kleptid/status/1284069270603866113#m
| sbierwagen wrote:
| > In my mind, understanding a thing means you can justify an
| answer.
|
| Sure, but how does that work with superhuman AI? Consider
| some kind of math bot that proves theorems about formal
| systems which are just flat out too large to fit into human
| working memory. Even if it could explain its answers, there
| would just be too many moving parts to keep in your head at
| once.
|
| We already see something like this in quant funds. The stock
| trading robot finds a price signal, and trades on it. You can
| look at it, but it's nonsensical: if rainfall in the Amazon
| basin is above this amount, and the cobalt price is below
| this amount, then buy municipal bonds in Topeka. The price
| signal is durable and causal. If you could hold the entire
| global economy in your head, you could see the chain of
| actions that produces the effect, but your brain isn't that
| big.
|
| Or you just take it on faith. Why do bond prices in Topeka go
| up, but not in Wichita? "It just does." Okay, then what was
| the point of the explanation? A machine can't justify
| something you physically don't have enough neurons to
| comprehend.
| gnramires wrote:
| > Even if it could explain its answers, there would just be
| too many moving parts to keep in your head at once.
|
| While this is possible in practice, consider the
| (universal) Turing machine principle: in principle, you can
| simulate any system given enough memory; we may not have it
| in our brains, but we have pen and paper, or simply a
| digital text scratchpad (both of which we use extensively
| in our lives).
| gnramires wrote:
| Also, you should note that the memory and capabilities
| required to reach a conclusion might be much greater than
| those required to show it's true. Showing a needle may be
| easy; finding it in the haystack very hard. In this sense the
| hope for explainability is expanded.
| But still, I guess the real world is really messy, and "the
| full explanation" may be too large -- like when you explain a
| human intuition, the "full explanation" might have been your
| entire brain, your entire set of experiences up to that point;
| yet we can give partial explanations that should be
| satisfactory.
|
| I have a hypothesis that, inevitably, reasoning needs to
| 'funnel' through explicit, logical representations (like we
| do with mathematics, language, etc.) to occur effectively.
| Or at least (quasi-)formalization is an important element
| of reasoning. This formal subset can be communicated.
| 6gvONxR4sf7o wrote:
| It's not about us being able to interpret the answer or
| justification, but the reasoner's ability to justify. If a
| superhuman AI can justify its answers in terms of first-
| order logic, for example, it could be defined as
| understanding the answers with respect to FOL. Whether we
| as humans are able to check whether this specific bot in
| fact meets that definition is a separate empirical
| question.
|
| If that quant algo you mentioned just says "it'll go up
| tomorrow", that's different than "it'll go up tomorrow" with
| an attached "it's positively correlated with Y, which is up
| today", which is different from a full causal DAG model of
| the world attached, which is again different from those
| same things expressed in English. But again, those are
| definitions, which are separate from our ability to check
| whether they're met.
|
| Luckily, we're not in the realm of bots spitting out
| infeasible-to-check proofs, except for a few niche areas
| like theorem proving (e.g. the four color theorem). For
| language models like in the article, the best I'm aware of
| is finding relevant passages for an answer and classifying
| entailments.
|
| > A machine can't justify something you physically don't
| have enough neurons to comprehend.
|
| We can't always verify its justification, but it either can
| or can't justify an answer with respect to a given
| justification system.
| cscurmudgeon wrote:
| We build another system we fully understand that can
| process the justification and see if it is correct/makes
| sense.
| joshjdr wrote:
| I found it on Stack Overflow!
| visarga wrote:
| > Maybe the next step will require increasingly interesting
| justification systems.
|
| You can just ask it to comment on what it intends to do. It's
| surprising, actually.
| maxwells-daemon wrote:
| Look at the "math test" video.
|
| Given the question: "Jane has 9 balloons. 6 are green and the
| rest are blue. How many balloons are blue?" the model
| outputs: "jane_balloons = 9; green_balloons = 6;
| blue_balloons = jane_balloons - green_balloons;
| print(blue_balloons)"
|
| That seems like a good justification of a (very simple) step-
| by-step reasoning process!
| wizzwizz4 wrote:
| Except I could do that with a few regex substitutions,
| which would not be reasoning. The "intelligence" is in the
| templates provided by the training data. (Extracting that
| is _impressive_, but not _that_ impressive.)
| lstmemery wrote:
| I have to disagree with you here. In the Codex paper [1], they
| have two datasets where Codex is correct only about 3% of the
| time. These are interview and code-competition questions. From
| the paper:
|
| "Indeed, a strong student who completes an introductory
| computer science course is expected to be able to solve a
| larger fraction of problems than Codex-12B."
|
| This suggests to me that Codex really doesn't understand
| anything about the language beyond syntax. I have no doubt that
| future systems will improve on this benchmark, but they will
| likely take advantage of the AST and could use unit tests in an
| RL-like reward function.
|
| [1] https://arxiv.org/abs/2107.03374
| nmca wrote:
| 12B, though. What about 1.2T?
| lstmemery wrote:
| You need to scale the amount of data to take advantage of
| the increase in parameters. I'm not sure where we would
| find another 100 GitHubs' worth of data.
| ruuda wrote:
| > but they will likely take advantage of the AST
|
| In the end, a more general approach with more compute always
| wins over applying domain knowledge like taking advantage of
| the AST. This is called "the bitter lesson".
| http://www.incompleteideas.net/IncIdeas/BitterLesson.html
| lstmemery wrote:
| I don't think the bitter lesson applies to ASTs.
|
| From "The Bitter Lesson":
|
| "Early methods conceived of vision as searching for edges,
| or generalized cylinders, or in terms of SIFT features. But
| today all this is discarded. Modern deep-learning neural
| networks use only the notions of convolution and certain
| kinds of invariances, and perform much better."
|
| Those models are taking advantage of inductive biases.
| Every model has them, including the massive language
| models. They are not the same as engineered features (such
| as SIFTs) or heuristics.
|
| Using the AST is just another way of looking at the code
| already in your dataset. For the model to understand what
| it is writing, it needs to map the text sequences to ASTs
| anyway. It can attempt to learn this, but the 12B model
| still produces illegal Python code, so it clearly hasn't.
| kevinqi wrote:
| "The Bitter Lesson" is a very interesting read, thank you!
| However, I wonder if AST vs. text analysis is fully
| comparable to the examples given in the post. Applying
| human concepts to chess, Go, image processing, etc. lost
| out to statistical methods, but I don't think AST vs. text
| is exactly the same argument. IMO, using an AST is simply a
| more accurate representation of a program and doesn't
| necessarily imply an attempt to bring in human
| intuition/concepts.
| abeppu wrote:
| I'm still surprised by the approach. I mean, great that it works
| this well -- but program synthesis is one of those rare domains
| where you can observe exactly what the outcome is after you
| generate something. You can see execution traces, variable
| values, what the JIT produced, etc. And all of this is
| relatively cheap -- often executing a code snippet should be far
| cheaper than an extra pass through a giant DNN, right? So it's
| fascinating to me that they train entirely from dealing with
| code as text.
|
| Imagine learning to develop recipes, not by ever cooking or
| eating or even seeing food, but only reading a giant library of
| cookbooks. Or learning to compose music but never hearing or
| playing anything -- only seeing scores.
| wantsanagent wrote:
| FWIW execution-guided code synthesis is a thing. Get a few
| possible outputs and ditch those that don't pass a parser, as
| an example. At least in the SQL-generation realm this is well
| worth the time it takes to tack onto a large language model.
| [deleted]
| jmportilla wrote:
| Very cool, will be interesting to see if this is ever added into
| Visual Studio as some sort of "super" auto-complete.
| mensetmanusman wrote:
| If this actually worked, wouldn't that be amazing?
| If you could break down a software idea into a blueprint of
| concepts that need to be accomplished, and then dictate what
| should be done...
|
| I doubt it works, but I wonder how many decades from now we will
| be able to walk through a finite number of simple requests and
| wrap them together as working software. Then people will be able
| to convert their blueprint into action!
| GistNoesis wrote:
| Can I use this to write Solidity contracts?
| mxwsn wrote:
| That has got to be one of the worst possible use cases one
| could imagine. On page 33 of the appendix, the authors note
| that nearly 40% of RSA encryption keys created by Codex are
| clearly insecure.
| GistNoesis wrote:
| Only if tokens have value.
|
| If Codex is able to handle a generic API from reading the
| docs, it maybe could use a Python library for Solidity
| contracts like
| https://web3py.readthedocs.io/en/stable/contracts.html
|
| As a contract user, I'd probably have more trust in a
| contract written by an independent AI from a short natural-
| language specification which can't hide intent, than in a
| contract with a hidden backdoor, or a subtle bug.
|
| Also the AI will probably improve with usage.
|
| You probably can generate multiple versions of your contract,
| and maybe a high-level bug-correction scheme, like taking the
| median action between those versions, can increase bug
| robustness and find those edge cases where the actions differ.
| woah wrote:
| What does that have to do with anything?
| northfoxz wrote:
| A new way to talk to the computer, I guess.
| vincnetas wrote:
| I will really be impressed when one can say: "here is this
| codebase, modify this function so that it would produce [insert
| desired effect]" and the other functionality of the project
| would not come tumbling down...
|
| Because writing code from scratch now is, I think, much rarer
| than improving existing codebases. Aka bugfixing.
| vincnetas wrote:
| Also curious what this AI would produce when provided with
| contradictory requests. Because often there are multiple
| requirements which on their own sound reasonable, but when you
| try to fit all the requirements in one system, things get
| nasty.
| polyanos wrote:
| It is only able to translate small instructions into code. I
| think it will take a while to get to a situation where you
| can just give it a list of requirements and it spits out a
| working program.
|
| Hell, it messed up when they gave it the instruction "make
| every fifth line bold" in the Word-API part of their demo,
| where it made the first line of every paragraph (which is
| only 4 lines long in total) bold instead of every fifth line.
| 3wolf wrote:
| I think integrations like the MS Word example they show off at
| the end of the live demo have the potential to be even more
| impactful than just generating code for programmers.
| polyanos wrote:
| That still needs work though; it messed up the "make every
| fifth line bold" request pretty badly. Still, it showed it
| could adapt to a new API pretty well.
| 3wolf wrote:
| Yeah, definitely. I guess my point was that converting
| natural language to source code can be even more valuable for
| people who don't know how to code but want to perform
| actions more complicated than a simple button press. For
| example, I often find myself doing regex-based find-and-
| replace-alls in text files, and even that feels inefficient
| while also being over the head of the vast majority of users.
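|
| The kind of script such a request has to expand to is tiny. A
| sketch, assuming the request were "rewrite US-style dates as
| ISO" (the file name and pattern are illustrative):
|
|       import re
|
|       def iso_dates(text):
|           # 08/10/2021 -> 2021-08-10 (assumes MM/DD/YYYY input)
|           return re.sub(r"\b(\d{2})/(\d{2})/(\d{4})\b",
|                         r"\3-\1-\2", text)
|
|       # "notes.txt" is an illustrative file name
|       with open("notes.txt") as f:
|           print(iso_dates(f.read()))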
|
| I'd imagine there are a lot of people out there spending many
| hours manually editing documents and spreadsheets.
| amalive wrote:
| Would like to say "Fix that 'something of undefined' error"
| some day.
| dmurray wrote:
| They should have released this first, instead of GitHub Copilot.
| The focus would then have been much more on "look at the cool
| stuff they can do" rather than "Microsoft is releasing a product
| that plagiarizes GPL code".
|
| Once people had digested that and there had been a few other
| proof-of-concept business ideas around turning Codex into a SaaS
| (because some people will always queue to build their product on
| your API), announce the evil version. Not that I really think
| Copilot is evil, but the IP concerns are legitimate.
| mark_l_watson wrote:
| I watched their 30-minute demo on Twitch this morning, really
| good!
|
| I use their OpenAI beta APIs as a paying customer; I am still
| waiting for access to Codex.
| leesec wrote:
| The Writing On The Wall
| z77dj3kl wrote:
| I thought OpenAI was originally supposed to be some kind of for-
| the-good, non-profit institution studying AI and its safe use,
| in particular with an effort to make it more accessible and
| available to all through more open collaboration. This is cool
| research, sure; but what happened to making models available for
| use by others instead of just through some opaque APIs?
|
| Maybe I'm just remembering wrong or conflating OpenAI with some
| other entity? Or maybe I bought too much of the marketing early
| on.
| mark_l_watson wrote:
| They very transparently transitioned to a for-profit company.
| It doesn't seem like they are aggressively profit-oriented
| though: I am a paying customer of the OpenAI beta APIs and the
| cost to use the service is very low. It also solves several
| classes of tough NLP problems. I used to sell my own commercial
| NLP library - glad I gave up on that years ago.
| keewee7 wrote:
| OpenAI was founded in 2015. In 2015 Google was AI and AI was
| Google. There was legitimate concern that one American
| corporation was going to dominate AI. OpenAI was created to
| challenge that dominance and let "AI benefit all of humanity".
|
| In the meantime China and Chinese companies have caught up.
| Turns out the fear of one company and one country dominating
| AI was overblown.
|
| Maybe the OpenAI founders feel that the original goal has been
| fulfilled because AI is no longer dominated by the US and
| Google.
| Buttons840 wrote:
| No, they did some good; they've done a few things to personally
| help me. They created OpenAI Gym, which is a great help when
| doing reinforcement learning research, and defined the standard
| interface for reinforcement learning libraries for a
| generation. But they no longer maintain OpenAI Gym.
|
| They also created Spinning Up [0], one of the best resources
| I've found for learning reinforcement learning. Their teaching
| resources are detailed but relatively brief and are focused on
| implementing the algorithms, even if some of the "proofs" are
| neglected. But they no longer maintain Spinning Up.
|
| So yes, originally they were for-the-good, but lately I've
| noticed them moving away from that in more ways than one. It
| seems they learned one cool trick with language sequence
| modelling, and they have a lot of compute, and this is all they
| do now.
|
| [0]: https://spinningup.openai.com/en/latest/
| blt wrote:
| That was the marketing message. They became for-profit in 2019
| and took investment from Microsoft.
| Many people were skeptical before that because the main
| investors were mostly known for for-profit ventures.
| webmaven wrote:
| You're remembering correctly. OpenAI transitioned from non-
| profit to for-profit in 2019, took about $1 billion from
| Microsoft (there has been speculation that this was mostly in
| the form of Azure credits), and announced that Microsoft would
| be their preferred partner for commercializing OpenAI
| technologies: https://openai.com/blog/microsoft/
| stingraycharles wrote:
| I remember Sam Altman, when asked "How will you make money?",
| replying that they would ask the AI. I thought it was a fairly
| creative answer.
|
| It turns out, however, that the way they plan on earning money
| is much less creative, and more run-of-the-mill SaaS
| monetization. In a way, I like to believe that a real AI would
| also end up with such a mundane strategy, as it's the most
| likely to actually make them profitable and return money to
| investors.
| amrrs wrote:
| I feel that OpenAI Codex could become like Webflow for coding.
| It might sound ironic, but what tools like Webflow do in the
| world of Web programming is give creators the power to build
| something fast that can last (without the specialty of a
| decent web programmer).
|
| If the same thing can happen in the world of programming, I
| guess evaluations like LeetCode and whiteboarding can go away,
| bringing in a new kind of logical-thinking evaluation, which
| could ultimately be a more realistic method for a programmer
| to rise above the chain.
| Vermeulen wrote:
| A warning to devs building on OpenAI APIs: we spent months
| developing a chatbot using GPT-3 for our game and released a
| video showcasing it:
| https://www.youtube.com/watch?v=nnuSQvoroJo&t=264s
|
| Afterwards OpenAI added GPT-3 chatbot guidelines disallowing
| basically anything like this. We were in communication with them
| beforehand, but they decided later that any sort of free-form
| chatbot was dangerous.
|
| What they allow changes on a weekly basis, and is different for
| each customer. I don't understand how they expect companies to
| rely on them.
| nradov wrote:
| The notion of a toy like a chatbot being "dangerous" is just so
| ludicrous. The OpenAI folks take themselves way too seriously.
| Their technology is cool and scientifically interesting, but in
| the end it's nothing more than a clever parlor trick.
| mszcz wrote:
| I think a different kind of dangerous, not the SkyNet stuff.
| The first idea that popped into my mind is below. I know,
| it's dark but...
|
| 8-year-old to AI: "my parents won't let me watch TV, what do
| I do?". AI: "stab them, they'll be too busy to forbid you".
|
| Then again, the same thing can be said by a non-AI. My
| thinking is that you'd be talking to an _actual average_
| person. I'm not so sure that that is such a good thing.
| EamonnMR wrote:
| Definitely dangerous from a legal perspective if AI Dungeon
| is any indication.
| elefanten wrote:
| The general public basically races to test the most
| controversial content. As exhibited by several other high-
| profile chatbot launches.
|
| > Tay responded to a question on "Did the Holocaust
| happen?" with "It was made up"
|
| https://en.m.wikipedia.org/wiki/Tay_(bot)
| aeternum wrote:
| It's pretty easy to get GPT-3 to say things that are
| incredibly sexist and racist. I think OpenAI is more
| concerned about the bad press associated with that than
| about AI safety.
| Siira wrote:
| Which is even less ethically defensible.
| andreyk wrote:
| Oh man, I was looking forward to this a ton! Are you thinking
| to keep working on it with the open-source GPT-J or something
| similar, by any chance?
| Vermeulen wrote:
| I am looking at GPT-J, and also hoping OpenAI comes to their
| senses on how dangerous a video game chatbot can be.
| MasterScrat wrote:
| > Afterwards OpenAI added GPT-3 chatbot guidelines disallowing
| basically anything like this. We were in communication with
| them beforehand, but they decided later that any sort of
| free-form chatbot was dangerous.
|
| Was this announced anywhere? We applied to deploy an
| application in this space, and they refused without providing
| any context, so I'd be really interested if they published
| details about restrictions in this space somewhere.
| Vermeulen wrote:
| https://beta.openai.com/docs/use-case-guidelines/use-case-
| re... "reliably being able to limit the conversational topics
| to strictly X, Y, and Z topics"
| Miraste wrote:
| OpenAI cloaks themselves in false "open" terminology to hide
| how proprietary and incredibly restrictive they've made their
| tech. That's a very cool demo; have you considered trying to
| make it run on GPT-J instead? It's an open-source alternative
| you can run yourself, or pay an independent API provider,
| without supporting OpenAI.
| Vermeulen wrote:
| Haven't been able to find a GPT-J service with good latency,
| though we haven't tried hosting ourselves.
| spullara wrote:
| I have gotten it running on AWS in a container; if you want
| the Dockerfile/scripts I can send them to you. Email is in
| my profile.
| fpgaminer wrote:
| It sucks that OpenAI has no competition right now. They have
| every right to control their technology however they like. But
| it's a shame that they're being so stifling with that right,
| killing really fun stuff like you demonstrated.
|
| But that monopoly won't last, and I think it's more than likely
| that competition will crop up within the next year. There's
| definitely a lot of secret sauce to getting a 175B-parameter
| model trained and working the way OpenAI has. The people
| working there are geniuses. But it can still be reproduced, and
| will be. Once competition arrives I'm hoping we'll see these
| shackles disappear and see the price drop as well. Meanwhile
| the open-source alternatives will get better. We already have
| open-source 6B models. A 60B model shouldn't be far off, and is
| likely to give us 90% of GPT-3.
| option_greek wrote:
| That's a really interesting demo. What makes the responses so
| laggy? Does the model take that long to generate text? You can
| also experiment with things like repeating the user's question
| or adding pauses like "hmm, let's see" to make it less
| noticeable at least some of the time.
|
| Too bad they asked you to pull it. What's the danger they are
| worried about? The annoying thing from their press releases is
| how seriously they take their GPT-3 bots' impact on humans.
| Despite all the hype, it's difficult to see the end of humanity
| from GPT-3 bots any time soon. Honestly, they need to rename
| themselves - can't see what's open about OpenAI.
| maxwells-daemon wrote:
| Autoregressive transformers take a while to generate text,
| since you need to run the whole model once for every word in
| the output.
| Vermeulen wrote:
| It's laggy since it needs to do speech-to-text, the GPT-3
| text response, then text-to-speech. Not sure what adds the
| most latency, actually.
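|
| Timing each stage separately would answer that. A rough
| sketch, where transcribe, complete, and synthesize_speech
| (and audio_clip) are placeholders for whatever STT,
| completion, and TTS calls are actually in use:
|
|       import time
|
|       def timed(label, fn, *args):
|           # run one pipeline stage and report how long it took
|           start = time.perf_counter()
|           result = fn(*args)
|           print(f"{label}: {time.perf_counter() - start:.2f}s")
|           return result
|
|       # the three stage functions and audio_clip are
|       # placeholders for the actual services
|       text  = timed("speech-to-text", transcribe, audio_clip)
|       reply = timed("gpt-3 response", complete, text)
|       audio = timed("text-to-speech", synthesize_speech, reply)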
|
| They only allow GPT-3 chatbots if the chatbot is designed to
| speak only about a specific subject, and literally never says
| anything bad/negative (and we have to keep logs to make sure
| this is the case). Which is insane. Their reasoning to me was
| literally a 'what if' the chatbot "advised on who to vote for
| in the election". As if a chatbot in the context of a video
| game saying who to vote for was somehow dangerous.
|
| I understand the need to keep GPT-3 private. There is a lot of
| possibility for deception using it. But they are so scared of
| their chatbot saying a bad thing, and the PR around that, that
| they've removed the possibility of doing anything useful with
| it. They need to take context more into account - a clearly
| labeled chatbot in a video game is different from a Twitter
| bot.
| dfraser992 wrote:
| But what if it wasn't clearly labeled? I did my MSc thesis
| on fake reviews and discussed the phenomenon known as
| "covert marketing" a bit. E.g. a guy you're talking to in a
| bar at some point steers the conversation to the excellent
| beer he is drinking and heavily recommends it to you. Good-
| enough actors will be very convincing. "Influencers" are a
| somewhat more ethical alternative that takes advantage of
| humans' lemming-like nature.
|
| I mean, quite a lot of people truly believe Hillary Clinton
| is the mastermind behind a DNC-run pedophile ring. Yes, she
| is a problem, but that theory is completely schizophrenic.
| An NPC masquerading as a real person who spouts positive
| talking points about Tucker Carlson's respect for Hungary
| is quite reasonable compared to that, and it will suck some
| people in.
|
| So all it takes is some right-wing developers for a not-
| entirely-just-a-game like Second Life or Minecraft to
| introduce a bug that allows certain instances of NPCs to be
| unlabeled... or a mod to a game that drives an NPC... and an
| equivalent to GPT-3 funded by the Kochs or the Mercers...
|
| Very hypothetical, very hand-wavy. But it is possible. So
| I can see the PR and legal departments flat out stopping
| this idea.
| minimaxir wrote:
| > But they are so scared of their chatbot saying a bad
| thing, and the PR around that, that they've removed the
| possibility of doing anything useful with it.
|
| It's not unreasonable to have checks and balances on AI
| content, and there should be.
|
| However, in my testing of GPT-3's content filter when it
| was released (it could be improved now), it was _very_
| sensitive, to the point that it had tons of false positives.
| Given that passing content-filter checks is required for
| productionizing a GPT-3 app, it makes the API too risky to
| use, and is part of the reason I'm researching more with
| train-your-own GPT models.
| nradov wrote:
| Why should there be checks and balances on AI content?
| What most people label as "AI" today is literally just
| fancy statistics. Should there be checks and balances on
| the use of linear regression analysis and other
| statistical techniques? Where do we draw the line?
| minimaxir wrote:
| > Should there be checks and balances on the use of
| linear regression analysis and other statistical
| techniques?
|
| That rhetorical question actually argues against your
| point: even in academic contexts, statistics can be used
| (intentionally or otherwise) to argue
| incorrect/misleading points, which is why reputable
| institutions have peer reviews/boards as a level of
| validation for papers.
|
| The point I was making was more about general content
| moderation in response to user-generated content, which
| is _required_ for every service that hosts it, for legal
| reasons at minimum, as they're the ones who will get
| blamed if something goes wrong.
| mola wrote:
| Of course statistical techniques need checks and balances:
| hence peer-reviewed academic papers, meta-analyses, etc.
| Statistics is a major tool for science these days, and
| science needs checks and balances, otherwise it's a pretty
| idle effort. Without checks and balances, you could just
| imagine any theory and believe it's the truth because you
| want to.
| ummonk wrote:
| Eh, I could still see a clearly labeled chatbot in a video
| game causing a major PR scandal if it says something
| offensive. Not really worth the risk.
|
| Pretty bad that they took so long to decide on this,
| though, pulling the rug out from under developers' feet.
| qwertox wrote:
| This is stunning. Imagine being able to practice your foreign
| language lessons this way.
| TchoBeer wrote:
| How many languages does GPT-3 support at the moment?
| make3 wrote:
| I work in this domain, and you can make these things say
| anything with a little probing, even stuff like "Hitler was
| right to kill all the Jews, I wish he was still alive today."
|
| They likely don't want to have "OpenAI GPT-3" and such stuff
| associated with one another in such demos; it would be really
| bad for their appearance.
| refulgentis wrote:
| I'm trying to extract some signal from this link... lots of
| upvotes, no comments, 30 min old, top 3 on HN... I'm worried
| this will be read as negative, but it's not, just learning, and
| enough time has passed that I'm itching to jump in and ask:
|
| - Is the significance here exactly what it says on the tin: the
| model behind GitHub's AI code completion will be shared with
| people on an invite basis? Or am I missing something?
|
| - What is the practical import of the quote at the end of this
| comment?
|
| "can now" makes me think it's a new feature over GitHub's
| implementation, which would then indicate the "simple commands"
| could be general UI, or at least IDE UI, navigation.
|
| If "can now" means "it is currently capable of, but will be
| capable of more", then I'd expect it to be the same as the
| current implementation on GitHub.
|
| Quote: "Codex can now interpret simple commands in natural
| language and execute them on the user's behalf--making it
| possible to build a natural language interface to existing
| applications."
| sbierwagen wrote:
| Take a look at the video demo. It takes natural text in a box
| and generates code. Copilot was super-autocomplete, so the
| interface was writing code in an IDE that it filled out for
| you. A natural-language interface will be a little easier for
| non-programmers. (Though, how would you read the code to make
| sure it does what you meant...)
| polyanos wrote:
| > Take a look at the video demo. It takes natural text in a
| box and generates code. Copilot was super-autocomplete, so
| the interface was writing code in an IDE that it filled out
| for you.
|
| No, it wasn't: you can literally describe, in natural text,
| what you want in a comment, and Copilot will do its best to
| generate a complete method based on that comment. It seemed
| like it was so autocomplete-y because the demo focused on
| the "helping the developer" part.
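|
| The pattern is: you write the comment, and everything after
| it is the model's suggestion. Something like this (the
| completion below is illustrative, not an actual Copilot
| transcript):
|
|       # check whether a string is a valid IPv4 address
|       def is_valid_ipv4(s):
|           parts = s.split(".")
|           if len(parts) != 4:
|               return False
|           # every part must be a decimal number in 0..255
|           return all(p.isdigit() and 0 <= int(p) <= 255
|                      for p in parts)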
|
| I'm fairly sure Copilot could have shown something similar
| if they had a demo where you could make something visual
| easily, like HTML + JavaScript/TypeScript/whatever scripting
| language. They're using exactly the same model (Codex) after
| all.
| am17an wrote:
| I really want to just play with this tech - it's frightening but
| also the future - but I'm still waiting to be accepted on the
| GitHub Copilot waitlist. I wonder how long this will take for
| people who don't know someone who knows someone...
| [deleted]
| febrilian wrote:
| Uhh... I'm literally no one but got access for like a week
| or so. I've got 134 repos and 12,060 contributions in the last
| year. Idk if that mattered.
| andyxor wrote:
| that's not the future; these large language models have no
| understanding of language, they repeat the most frequently
| occurring patterns like parrots. They miss this whole thing
| called semantics.
| f0e4c2f7 wrote:
| They just finished a demo on Twitch. Pretty crazy!
|
| https://www.twitch.tv/videos/1114111652
|
| Starts at 15:45.
| j0ej0ej0e wrote:
| aaaand they've blocked audio until 18:17ish, timestamp url:
| https://www.twitch.tv/videos/1114111652?t=00h18m17s
| raidicy wrote:
| lmao; copyright-muted so you can't even hear them speaking.
| [deleted]
| karmasimida wrote:
| It is simultaneously impressive and underwhelming for me.
|
| I mean, yes, this is a super impressive demo, but it didn't go
| beyond my expectations. I really want to see whether this model
| can write a correct binary search method without seeing one
| before.
|
| Or, even if it uses binary search correctly, does it understand
| concepts like index boundaries?
| stavros wrote:
| > I really want to see whether this model can write a correct
| binary search method without seeing one before.
|
| I don't believe the model was trained on Google interview
| answers, sadly.
| polyanos wrote:
| I found the whole UI/sandbox they created the most
| interesting part. Now don't get me wrong, the tech is
| certainly great and all, but I really didn't have the feeling
| I watched/learned more than I already knew from what was
| shown with GitHub Copilot, although I was kinda impressed, if
| it really is as simple as they stated, at how it is able to
| adapt to new APIs.
|
| It's a shame they limited the demo to only relatively simple
| instructions.
___________________________________________________________________
(page generated 2021-08-10 23:00 UTC)