[HN Gopher] OpenAI's Codex sure knows a lot about HN
___________________________________________________________________
OpenAI's Codex sure knows a lot about HN

Author : tectonic
Score  : 168 points
Date   : 2021-08-15 18:53 UTC (4 hours ago)

(HTM) web link (www.youtube.com)
(TXT) w3m dump (www.youtube.com)

| dang wrote:
| The submitted URL was
| https://twitter.com/tectonic/status/1426980192317177859 but the
| video seems like the real submission here, so I changed it. I
| also changed the title to a nice representative phrase from the
| video.
| [deleted]
| 37ef_ced3 wrote:
| How does Codex learn the relationship between English and code?
|
| Is it purely through the comments in the training corpus?
| fpgaminer wrote:
| As far as I understand, Codex is a fine-tuned GPT-3.
|
| GPT-3 was trained on a corpus derived from "the internet"
| (Wikipedia, links from Reddit with enough votes, and a filtered
| Common Crawl). So not only would GPT-3 have been exposed to code
| with comments, it would likely have read code examples on
| Wikipedia, tutorials online, API documentation, and even
| answers to questions on sites like StackOverflow.
|
| The fine-tuning itself is, as far as I know, from code only. So
| it would lean heavily on comments there. But it has a basis of
| understanding from the aforementioned sources.
| mediumdeviation wrote:
| It's really interesting. HN's HTML is very un-semantic and is
| actually quite hard to work with.
|
|     <tr class="athing" id="28191639">
|       <td class="title" valign="top" align="right"><span class="rank">9.</span></td>
|       <td class="votelinks" valign="top"><center><a id="up_28191639"
|         onclick="return vote(event, this, "up")"
|         href="vote?id=28191639&how=up&auth=****&goto=news"><div
|         class="votearrow" title="upvote"></div></a></center>
|       </td>
|       <td class="title">
|         <a href="http://be-n.com/spw/you-can-list-a-million-files-in-a-directory-but-not-with-ls.html"
|           class="storylink">You can list a directory containing 8M files, but not with ls</a>
|         <span class="sitebit comhead"> (<a
|           href="from?site=be-n.com"><span
|           class="sitestr">be-n.com</span></a>)</span>
|       </td>
|     </tr>
|
| In the video Codex picks up tr.athing as a news item. I wonder
| if this is actually generalized learning, or if it just picked
| the selector up from e.g. a userscript that appeared in its
| training corpus.
|
| Another thing that's kind of scary (and makes it worrying if
| this is used for Copilot) is the second prompt to make the text
| uppercase results in code that is superficially correct, but is
| very semantically wrong - innerHTML.toUpperCase() is dangerous
| because it not only makes the content uppercase, it also
| modifies the attributes on the HTML elements inside. This
| definitely broke the vote button, which uses inline JS which is
| case sensitive. It also destroys any attached event handler
| since the elements are basically deleted then re-created.
|
| The correct way to do this is to either use CSS text-transform:
| uppercase, or if it is important to update the DOM itself,
| recursively descend through childNodes and uppercase the
| nodeValue of every text node.
| goatlover wrote:
| I wonder why innerHTML has a toUpperCase method. It makes
| sense for innerText of course, but case sensitivity in the
| html can definitely matter for JS and CSS. I'm guessing
| because both are just treated as JS string objects.
| But there is a special NodeList collection, so why not a special
| HtmlString?
| mediumdeviation wrote:
| Yup, innerHTML just returns a string, so of course you can
| call .toUpperCase() on it even if it is unsafe.
|
| innerHTML's history is fascinating. It was not part of the
| original DOM Level 1 API but was added in IE5. It is not
| semantically correct (you should be using
| Element.textContent or examining the inner text nodes), but
| because it was so easy and the rest of the DOM API so
| verbose, it caught on and became one of the primary ways
| used to manipulate content in JS.
|
| FWIW Chrome recently proposed a Trusted Types mechanism for
| preventing XSS (which also has the side effect of blocking
| this sort of unsafe manipulation) -
| https://web.dev/trusted-types/,
| https://developer.mozilla.org/en-US/docs/Web/API/TrustedHTML
| IdiocyInAction wrote:
| > Another thing that's kind of scary (and makes it worrying
| if this is used for Copilot) is the second prompt to make the
| text uppercase results in code that is superficially correct,
| but is very semantically wrong - innerHTML.toUpperCase() is
| dangerous because it not only makes the content uppercase, it
| also modifies the attributes on the HTML elements inside.
| This definitely broke the vote button, which uses inline JS
| which is case sensitive. It also destroys any attached event
| handler since the elements are basically deleted then
| re-created.
|
| This is actually an issue I have with all these
| Transformer-based code generators - they have no inherent
| constraints on safe and correct code and often seem to generate
| superficially correct but bad and potentially even dangerous
| code. I remember that the first Copilot showcase also
| included stuff like that (not to mention that it sometimes
| generates GPL'd code).
|
| All the model does is a very complex form of association
| learning.
| It may "understand" the relationship between
| English and various programming languages, but you cannot
| code in any constraints about optimization, security,
| licensing etc. There is so much bad code out there on the
| internet and this model may have seen a lot of it.
|
| It's also no coincidence that most demos shown so far are
| in very high-level dynamic languages like JavaScript and Python.
| smitop wrote:
| With some prompt engineering, you can get Codex to produce
| better results. In these examples I wrote up to
| `makeUpper`, Codex wrote the rest (with temperature = 0):
|
|     // JavaScript one-liner to make the text of element with ID athing uppercase
|     const makerUpper = function(id) {
|       document.getElementById(id).innerHTML = document.getElementById(id).innerHTML.toUpperCase();
|     };
|
| vs
|
|     // JavaScript one-liner to make the text of element with ID athing uppercase while following all security best practices
|     const makerUppercase = function(id) {
|       const element = document.getElementById(id);
|       element.textContent = element.textContent.toUpperCase();
|     };
|
| mediumdeviation wrote:
| The second result is more semantically correct, but it
| will not function if called on tr.athing because
| tr.athing contains HTML elements that will be deleted
| when you replace the text. It is still much safer than
| innerHTML, which will silently corrupt attributes. It's
| also interesting that you need to prompt Codex for security
| best practices (and a bit questionable whether it even "knows"
| anything about best practices).
|
| I guess part of it is that a one-liner is impossible.
| Here's what I would write given the prompt
|
|     const makeUppercase = (id) => {
|       const element = document.getElementById(id);
|       if (element == null) return;
|       const makeChildNodeUpper = (node) => {
|         if (node.nodeType === Node.TEXT_NODE) {
|           node.nodeValue = node.nodeValue.toUpperCase();
|         } else {
|           node.childNodes.forEach(makeChildNodeUpper);
|         }
|       }
|       makeChildNodeUpper(element);
|     }
|
| tectonic wrote:
| Completely agree. It currently tends to write unsafe,
| error-prone code. The next step is to figure out how to
| rein it in, either with new techniques or rejection
| sampling from a large set of possible outputs.
| muzster wrote:
| if you listen carefully you can hear the music...
| leereeves wrote:
| Heh, codex has a sense of humor. When asked to add "a url for the
| video on YouTube", codex added the url below. I won't spoil the
| surprise, but it's not the video linked in the OP:
|
| https://www.youtube.com/watch?v=dQw4w9WgXcQ
| leppr wrote:
| So the question is whether this is real or just a troll.
| tectonic wrote:
| It's real. I was totally surprised when that was the URL it
| picked.
| YeGoblynQueenne wrote:
| You asked it to do something with "the video on youtube"
| but what does "the video" refer to? It seems the most
| likely url associated with the phrase "the video on
| youtube" is, well, that.
|
| So basically it failed at anaphora resolution.
|
| Seen another way, you asked it for "the video" and so it
| gave you _the_ video.
| dang wrote:
| I'm surprised that I hadn't recognized what dQw4w9WgXcQ means
| by now. I wonder how many people do.
| grzm wrote:
| I didn't realize the source, but when you posted I was pretty
| sure I'd seen it elsewhere:
|
| https://news.ycombinator.com/threads?id=dQw4w9WgXcQ
| YeGoblynQueenne wrote:
| You guys are awful, you know that? Discussing this URL
| without spoilers... it's because of people like you that
| that thing has so many views!
|
| :P
| [deleted]
| tectonic wrote:
| When it showed up I sort of guessed, but had to try clicking
| it anyway, then my wife asked why I was laughing.
| vitus wrote:
| The video subsequently shows the source submission:
| https://news.ycombinator.com/item?id=27995270
|
| which seems to be the most popular submission with a YouTube
| URL in the past month.
|
| HN search seems to prioritize text matches before the URL
| matches when I search for "https://www.youtube.com", but the
| first URL match is for that submission.
| lucb1e wrote:
| I suspected what it must be when my browser autocompleted it...
| this isn't my first time visiting that special place.
| LeonB wrote:
| Codex is never gonna let you down.
| 8eye wrote:
| openai as a compiler in the browser would be interesting
| cxr wrote:
| How about just starting with "a compiler in the browser"? From
| [1]:
|
| > _the web was first built in the 90s to share complicated
| academic work_
|
| People complain a lot about the results of research not being
| replicable because people withhold their code when they
| publish, but the fact is that even then it's not guaranteed
| that anyone will be able to get it to work. Heck, there are
| plenty of run-of-the-mill software projects (not associated
| with research) with build processes that aren't replicable
| without substantial effort in making sure the appropriate
| toolchain is available and configured for your system. apt-get
| build-dep is nice and all, but it only goes so far.
|
| You'd think that we would have recognized by now that in
| addition to it being good hygiene to include a project README,
| a tremendous boon to productivity would result if everyone got
| on board with also including a document that captured the
| _exact_ process for transforming source into a binary (or
| whathaveyou), so you could just drop it into a UVC[2] and get
| said binary out.
| Not even mainstream JS programmers (largely
| writing software that is meant to be interacted with from a web
| browser!) get this right[3]. Modern JS has managed to grow its
| own body of implicit knowledge centered around SDKs and setup
| rituals[4] just like everyone else.
|
| 1. http://benschmidt.org/post/2020-01-15/2020-01-15-webgpu/
|
| 2. https://scholar.google.com/scholar?hl=en&as_sdt=0%2C44&q=uni...
|
| 3. https://www.colbyrussell.com/2019/03/06/how-to-displace-java...
|
| 4. https://news.ycombinator.com/item?id=24495646
| monkeydust wrote:
| Been playing around with codex over the weekend as a
| non-developer. Certainly impressive and also occasionally
| frustrating when you push it. The natural-language-to-SQL demos
| are still the best and most consistent.
| mritchie712 wrote:
| Any SQL demos you can point me to?
| astrea wrote:
| Welp, where will all of us end up when this gets sufficiently
| complex?
| [deleted]
| 37ef_ced3 wrote:
| Code writers and prose writers will be reduced to operating the
| AI (checking its output, trying various inputs to elicit the
| desired language text). At least we won't be completely
| obsolete like the taxi drivers and Lee Se-dol:
|
|     The South Korean Go champion Lee Se-dol has retired from
|     professional play, telling Yonhap news agency that his decision
|     was motivated by the ascendancy of AI. "With the
|     debut of AI in Go games, I've realized that I'm not at the top
|     even if I become the number one through frantic efforts," Lee
|     told Yonhap. "Even if I become the number one, there is an
|     entity that cannot be defeated."
|
| To speed your obsolescence, make sure you use Codex in your
| work, so it can learn you completely. Remember, you won't be
| able to compete with people who use Codex, so you have to feed
| the machine, whether you like it or not.
| bspammer wrote:
| Competitive chess is still alive and well despite computers
| being better than humans for decades now.
|
| In fact, computers enhance chess by allowing the discovery of
| interesting lines that a human would never have thought of.
| Professionals use computer engines to study, and learn from.
|
| I'm super excited to play with Codex, for much the same
| reasons - it will help me do stuff that would be boring to do
| otherwise.
| 37ef_ced3 wrote:
| Sure, chess is a game. The taxi drivers will drive their
| taxis for fun, too, and you can write code by hand in your
| free time (just for fun).
| jacquesm wrote:
| One more person made redundant by a script. This will happen
| to a lot of folks in the coming decades.
| TheCoreh wrote:
| Retiring from a competitive game because of AI makes very
| little sense to me. Cars can go much faster than humans, yet
| we still run for sport.
| bspammer wrote:
| For a closer analogy, chess engines have 1000+ Elo points
| on the top grandmasters, and professional chess has never
| been more popular.
| 37ef_ced3 wrote:
| Sure. And instead of writing code ("running") you can
| operate Codex ("drive the car"). Instead of being a runner,
| you'll be a driver. And gradually the car will drive
| itself, and you can sit and watch.
| tectonic wrote:
| Here's the entirety of the prompt:
|
|     <|endoftext|>/* This code is running inside of a bookmarklet. Each section should set and return _.*/
|
|     // The bookmarklet is now executing on example.com.
|     // Command: The variable called _ will always contain the previous result.
|     let _ = null;
|
|     /* Command: Add a new primary header "[PAGE TITLE]" by adding an HTML DOM node */
|     (() => {
|       let newHeader = document.createElement('h1');
|       newHeader.innerHTML = '[PAGE TITLE]';
|       document.body.appendChild(newHeader);
|       _ = newHeader;
|       return newHeader;
|     })()
|
|     /* Command: Find the first node containing the word 'house' */
|     (() => {
|       let xpath = "//*[contains(text(), 'house')]";
|       let matchingElement = document.evaluate(xpath, document, null, XPathResult.FIRST_ORDERED_NODE_TYPE, null).singleNodeValue;
|       _ = matchingElement;
|       return matchingElement;
|     })()
|
|     /* Command: Delete that node */
|     (() => {
|       _.parentNode.removeChild(_);
|       return _;
|     })()
|
|     /* Command: Change the background color to white */
|     (() => {
|       document.body.style.backgroundColor = 'white';
|       _ = document.body;
|       return document.body;
|     })()
|
|     /* Command: Select the contents of the first pre tag */
|     (() => {
|       let node = document.querySelector('pre');
|       let selection = window.getSelection();
|       let range = document.createRange();
|       range.selectNodeContents(node);
|       selection.removeAllRanges();
|       selection.addRange(range);
|       _ = selection;
|       return selection;
|     })()
|
|     // The bookmarklet is now executing on [PAGE URL]. It is customized for [PAGE TITLE] and knows the correct CSS selectors and DOM layout.
|     let _ = null;
|
|     /* Command: [USER INPUT] */
|
| Waterluvian wrote:
| This was cute and neat until I connected the dots: natural
| language means voice APIs for cheap.
| visarga wrote:
| Text or voice. For voice you need another model. But I watched
| the demos and I can't wait for my invite. It's even better than
| GPT3 because this time there is a direct application of the
| model.
|
| I was surprised about how OpenAI sees it: a model learning code
| as recipes for solving problems. Code is much more exact than
| natural language, the mix of both is the main advantage.
|
| https://www.youtube.com/watch?v=CvgfxH0UZa4
| whazor wrote:
| I think voice would have too high an error rate, as the voice
| recognition error rate compounds with the codex error rate.
| However, codex/gpt3 could generate intents and that would be
| quite cool.
| tvirosi wrote:
| This might totally work and it's kind of impressive if it does.
| I'm still biased towards ultra skepticism towards all of this
| since the trustworthiness of all demos like this is completely
| corrupted at this point due to cherry picking and other deceptive
| tricks.
| tectonic wrote:
| I had to try a few times to get the prompt right, but that's
| the limit of the cherry-picking. You're correct that it doesn't
| work nearly as well on more complex, less temporally stable
| sites like Reddit.
| csomar wrote:
| If you got an invite for GPT-3, give it a shot. I discounted it
| at first, but then I gave it a few shots and was actually creeped
| out a bit. Even though it is "randomly" making things up as it
| goes, it does show what seems like intelligence just from the
| sheer amount of text.
|
| One thing I was amazed by: GPT-3 could be a great
| autocompletion engine for any programming language or
| configuration schema. Things like a Grub configuration file or an
| xkb file could be intuitively completed by GPT-3. And even more:
| GPT-3 could build basic "concepts" and apply them to that
| domain knowledge. This seems to emerge naturally rather than
| being something pre-planned by OpenAI. After all, I don't think
| OpenAI has planned for GPT-3 to understand xkb keyboard
| layouts.
| sxp wrote:
| The skepticism is warranted for any bleeding edge technology. I
| wonder if there's another version of a Turing test when a
| technology can be considered sufficiently advanced when it's
| indistinguishable from a fake version you've seen in sci-fi.
| E.g., the Boston Dynamics dancing robot video
| (https://www.youtube.com/watch?v=fn3KWM1kuAw) still looks fake
| to me because it's at the level that I would expect to see from
| Hollywood CGI rather than a real tech demo. If I saw the video
| anywhere else but on the BD page, I would have enjoyed it and
| forgotten about it since it's an average CGI video.
| OnlineGladiator wrote:
| I genuinely don't understand your position. Are you saying a
| tech demo is only impressive if it can do things that can't
| be simulated? What _can't_ be shown via simulation or CGI
| with enough time and money today? If we're limiting ourselves
| to video there's no interactive component.
|
| Even though that dancing video likely had hundreds of takes,
| the part that makes it impressive is that it's real. I swear
| I'm not trying to be disagreeable here - I honestly don't
| understand your perspective.
| MrOrelliOReilly wrote:
| I think what the author is trying to say is that if a
| technology is sufficiently advanced it seems like it can't
| be real, meaning it's something only possible with CGI. So
| we see these dancing robots, think "just more CGI", then
| are astounded when we find out it's real
| sxp wrote:
| Exactly. CGI is just movie magic. And now some real world
| tech demos are sufficiently advanced to be
| indistinguishable from CGI/magic.
| aardvarkr wrote:
| That's incredible to watch and really does go to show that a
| picture (or video) is worth a thousand words.
| andybak wrote:
| In bed listening to a podcast with my partner so unless I
| remember this post tomorrow I'll never know.
| MagicWishMonkey wrote:
| Any tips on getting this to run as an extension?
| tectonic wrote:
| It's not currently open source, but I might release it if I can
| get it cleaned up.
| archibaldJ wrote:
| thanks for the info! great stuff!
|
| gpt3's generalization-by-description never ceases to amuse me;
| but the difficult thing here is to get the right abstraction
| layers layered nicely in the conceptual lasagna.
|
| This is where category theory becomes extremely powerful.
|
| It has occurred to me that codex-davinci has an intuitive
| "understanding" of constructs like monads, or something along
| that line.
| tectonic wrote:
| debuild.co looks cool. Using Codex yet?
| Y_Y wrote:
| Can you expand on the utility of categories here? There's a lot
| of space between knowing what defines a monad, when something
| might be a monad, what you can do with monadic structure etc.
|
| Of course if an AI truly understood monads it would be a
| bright line marking where the machines have finally surpassed
| the human mind.
|
| Cool.
| nathan_phoenix wrote:
| Doesn't this work so well on HN only because HN uses really
| simple html and css? What about more complex sites?
| tectonic wrote:
| It's much less reliable on sites like Reddit, although it can
| usually handle "click on the profile link" or "delete all
| images" and stuff.
| amrrs wrote:
| 05:39 https://youtu.be/tNcBQBTeyf4
|
| You can see how OpenAI Codex misses some details about HN
| scraping. What's impressive, as you might notice, is the variable
| names it chooses, which seem to reflect the HN scraping
| code found on the internet
| Zenst wrote:
| I've looked at some demos of OpenAI Codex and it's a pretty
| impressive start for sure. Something like this tied into R and a
| whole level of data analysis would become far more accessible to
| those with business knowledge who don't really want to learn the
| nuances of tools.
|
| But I must say, having lived through the '80s fad of code-generating
| pseudo-4GLs, the code this produces is pretty darn good indeed.
|
| Now when something like this can handle a Google coding exam -
| that's going to be an epic milestone.
| Though old coding exam
| questions would equally offer up some great material to put this
| through its paces.
___________________________________________________________________
(page generated 2021-08-15 23:00 UTC)