[HN Gopher] OpenAI's Codex sure knows a lot about HN
       OpenAI's Codex sure knows a lot about HN
       Author : tectonic
       Score  : 168 points
       Date   : 2021-08-15 18:53 UTC (4 hours ago)
 (HTM) web link (www.youtube.com)
 (TXT) w3m dump (www.youtube.com)
       | dang wrote:
       | The submitted URL was
       | https://twitter.com/tectonic/status/1426980192317177859 but the
       | video seems like the real submission here, so I changed it. I
       | also changed the title to a nice representative phrase from the
       | video.
       | [deleted]
       | 37ef_ced3 wrote:
       | How does Codex learn the relationship between English and code?
       | Is it purely through the comments in the training corpus?
         | fpgaminer wrote:
         | As far as I understand Codex is a fine-tuned GPT-3.
         | GPT-3 was trained on a corpus derived from "the internet"
         | (Wikipedia, links from Reddit with enough votes, and a filtered
         | Common Crawl). So not only would GPT-3 have been exposed to code
         | with comments, it would likely have read code examples on
         | Wikipedia, tutorials online, API documentation, and even
         | answers to questions on sites like StackOverflow.
         | The fine tuning itself is, as far as I know, from code only. So
         | it would lean heavily on comments there. But it has a basis of
         | understanding from the aforementioned sources.
         | mediumdeviation wrote:
         | It's really interesting. HN's HTML is very un-semantic and is
         | actually quite hard to work with.
         |     <tr class="athing" id="28191639">
         |       <td class="title" valign="top" align="right"><span class="rank">9.</span></td>
         |       <td class="votelinks" valign="top"><center><a id="up_28191639"
         |         onclick="return vote(event, this, &quot;up&quot;)"
         |         href="vote?id=28191639&amp;how=up&amp;auth=****&amp;goto=news"><div
         |         class="votearrow" title="upvote"></div></a></center>
         |       </td>
         |       <td class="title">
         |         <a href="http://be-n.com/spw/you-can-list-a-million-files-in-a-directory-but-not-with-ls.html"
         |           class="storylink">You can list a directory containing 8M files, but not with ls</a>
         |         <span class="sitebit comhead"> (<a href="from?site=be-n.com"><span
         |           class="sitestr">be-n.com</span></a>)</span>
         |       </td>
         |     </tr>
         | In the video Codex picks up tr.athing as a news item. I wonder
         | if this is actually generalized learning, or if it just picked
         | the selector up from eg. a userscript that appeared in its
         | training corpus.
         | Another thing that's kind of scary (and makes it worrying if
         | this is used for Copilot) is the second prompt to make the text
         | uppercase results in code that is superficially correct, but is
         | very semantically wrong - innerHTML.toUpperCase() is dangerous
         | because it not only makes the content uppercase, it also
         | modifies the attributes on the HTML elements inside. This
         | definitely broke the vote button, which uses inline JS which is
         | case sensitive. It also destroys any attached event handler
         | since the elements are basically deleted then re-created.
         | The correct way to do this is to either use CSS text-transform:
         | uppercase or, if it is important to update the DOM itself,
         | recursively descend through childNodes and uppercase the
         | nodeValue of every text node (nodeType === Node.TEXT_NODE).
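The attribute corruption described in this comment can be reproduced without a browser, because innerHTML is read and written as a plain string. The sketch below is only an illustration, not the code from the video: the markup is a shortened, hypothetical version of the HN vote link, and `uppercaseMarkup` is an illustrative name.

```javascript
// Hypothetical, shortened version of the HN vote link markup.
const voteLink =
  '<a id="up_1" onclick="return vote(event, this)" href="vote?id=1&amp;how=up">upvote</a>';

// Mirrors what `el.innerHTML = el.innerHTML.toUpperCase()` does to the
// serialized markup: every character is uppercased, including attribute
// values and inline JavaScript, not just the visible text.
function uppercaseMarkup(html) {
  return html.toUpperCase();
}

const corrupted = uppercaseMarkup(voteLink);

// The inline handler now reads RETURN VOTE(EVENT, THIS). JavaScript is case
// sensitive, so it no longer refers to the page's vote() function, and the
// query string in href has changed as well.
console.log(corrupted);
```

Running this under Node prints the uppercased markup, which makes it easy to see that the handler and the href were rewritten along with the link text.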
           | goatlover wrote:
           | I wonder why innerHTML has a toUpperCase method. It makes
           | sense for innerText of course, but case sensitivity in the
           | html can definitely matter for JS and CSS. I'm guessing
           | because both are just treated as JS string objects. But there
           | is a special NodeList collection, so why not a special
           | HtmlString?
             | mediumdeviation wrote:
             | Yup, innerHTML just returns a string, so of course you can
             | call .toUpperCase() on it even if it is unsafe.
             | innerHTML's history is fascinating. It was not part of the
             | original DOM Level 1 API but was added in IE5. It is not
             | semantically correct (you should be using
             | Element.textContent or examining the inner text nodes), but
             | because it was so easy and the rest of the DOM API so
             | verbose, it caught on and became one of the primary ways
             | used to manipulate content in JS.
             | FWIW Chrome recently proposed a Trusted Type mechanism for
             | preventing XSS (which also has the side effect of blocking
             | this sort of unsafe manipulation) -
             | https://web.dev/trusted-types/,
             | https://developer.mozilla.org/en-US/docs/Web/API/TrustedHTML
           | IdiocyInAction wrote:
           | > Another thing that's kind of scary (and makes it worrying
           | if this is used for Copilot) is the second prompt to make the
           | text uppercase results in code that is superficially correct,
           | but is very semantically wrong - innerHTML.toUpperCase() is
           | dangerous because it not only makes the content uppercase, it
           | also modifies the attributes on the HTML elements inside.
           | This definitely broke the vote button, which uses inline JS
           | which is case sensitive. It also destroys any attached event
           | handler since the elements are basically deleted then re-
           | created.
           | This is actually an issue I have with all these Transformer-
           | based code generators - they have no inherent constraints on
           | safe and correct code and often seem to generate
           | superficially correct but bad and potentially even dangerous
           | code. I remember that the first Copilot showcase also
           | included stuff like that (not to mention that it sometimes
           | generates GPL'd code).
           | All the model does is a very complex form of association
           | learning. It may "understand" the relationship between
           | English and various programming languages, but you cannot
           | code in any constraints about optimization, security,
           | licensing etc. There is so much bad code out there on the
           | internet and this model may have seen a lot of it.
           | It's also no coincidence that most demos shown so far are
           | very high level dynamic languages like Javascript and Python.
             | smitop wrote:
             | With some prompt engineering, you can get Codex to produce
             | better results. In these examples I wrote up to
             | `makeUpper`, Codex wrote the rest (with temperature = 0):
             |     // JavaScript one-liner to make the text of element with ID athing uppercase
             |     const makerUpper = function(id) {
             |       document.getElementById(id).innerHTML =
             |         document.getElementById(id).innerHTML.toUpperCase();
             |     };
             | vs
             |     // JavaScript one-liner to make the text of element with ID athing uppercase while following all security best practices
             |     const makerUppercase = function(id) {
             |       const element = document.getElementById(id);
             |       element.textContent = element.textContent.toUpperCase();
             |     };
               | mediumdeviation wrote:
               | The second result is more semantically correct, but it
               | will not function if called on tr.athing because
               | tr.athing contains HTML elements that will be deleted
               | when you replace the text. It is still much safer than
               | innerHTML which will silently corrupt attributes. It's
               | also interesting you need to prompt Codex for security
               | best practices (and a bit questionable if it even "knows"
               | anything about best practices)
               | I guess part of it is that a one-liner is impossible.
               | Here's what I would write given the prompt
               |     const makeUppercase = (id) => {
               |       const element = document.getElementById(id);
               |       if (element == null) return;
               |       const makeChildNodeUpper = (node) => {
               |         if (node.nodeType === Node.TEXT_NODE) {
               |           node.nodeValue = node.nodeValue.toUpperCase();
               |         } else {
               |           node.childNodes.forEach(makeChildNodeUpper);
               |         }
               |       }
               |       makeChildNodeUpper(element);
               |     }
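The text-node traversal above can be exercised outside a browser by running the same recursion over plain objects. This is a sketch under stated assumptions: `upperTextNodes` and `mockCell` are hypothetical names, the mock objects only imitate the `nodeType`, `nodeValue`, and `childNodes` properties of real DOM nodes, and `TEXT_NODE` is hard-coded to 3, the numeric value of `Node.TEXT_NODE`.

```javascript
const TEXT_NODE = 3; // numeric value of Node.TEXT_NODE in the DOM

// Same recursion as the comment above: uppercase text nodes, descend into
// everything else, and never touch attributes.
function upperTextNodes(node) {
  if (node.nodeType === TEXT_NODE) {
    node.nodeValue = node.nodeValue.toUpperCase();
  } else {
    node.childNodes.forEach((child) => upperTextNodes(child));
  }
}

// Plain-object stand-in for: <td><a href="vote?id=1">upvote</a> story</td>
const mockCell = {
  nodeType: 1,
  attributes: {},
  childNodes: [
    {
      nodeType: 1,
      attributes: { href: 'vote?id=1' },
      childNodes: [{ nodeType: TEXT_NODE, nodeValue: 'upvote', childNodes: [] }],
    },
    { nodeType: TEXT_NODE, nodeValue: ' story', childNodes: [] },
  ],
};

upperTextNodes(mockCell);
```

After the call, both text values are uppercase while the mock `href` attribute is unchanged, which is the property the innerHTML version fails to preserve.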
             | tectonic wrote:
             | Completely agree. It currently tends to write unsafe,
             | error-prone code. The next step is to figure out how to
             | rein it in, either with new techniques or rejection
             | sampling from a large set of possible outputs.
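Rejection sampling in the sense mentioned here can be sketched as: generate several candidate outputs, run each through one or more checks, and keep the first that passes. Everything below is an assumption for illustration. `rejectionSample` and `avoidsInnerHtmlAssignment` are hypothetical names, the candidates are hard-coded strings rather than real model outputs, and a realistic checker would more likely be a linter or a test suite than a single regular expression.

```javascript
// Hypothetical check: reject candidates that assign to innerHTML.
function avoidsInnerHtmlAssignment(code) {
  return !/\.innerHTML\s*=[^=]/.test(code);
}

// Return the first candidate that passes every check, or null if none do.
function rejectionSample(candidates, checks) {
  for (const candidate of candidates) {
    if (checks.every((check) => check(candidate))) {
      return candidate;
    }
  }
  return null;
}

// Stand-ins for several sampled completions of the same prompt.
const candidates = [
  'el.innerHTML = el.innerHTML.toUpperCase();',
  'el.style.textTransform = "uppercase";',
];

const chosen = rejectionSample(candidates, [avoidsInnerHtmlAssignment]);
console.log(chosen);
```

The quality of the result depends entirely on the checks: a candidate that passes a weak filter can still be wrong in ways the filter does not look for.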
       | muzster wrote:
       | if you listen carefully you can hear the music...
       | leereeves wrote:
       | Heh, codex has a sense of humor. When asked to add "a url for the
       | video on YouTube", codex added the url below. I won't spoil the
       | surprise, but it's not the video linked in the OP:
       | https://www.youtube.com/watch?v=dQw4w9WgXcQ
         | leppr wrote:
         | So the question is whether this is real or just a troll.
           | tectonic wrote:
           | It's real. I was totally surprised when that was the URL it
           | picked.
             | YeGoblynQueenne wrote:
             | You asked it to do something with "the video on youtube"
             | but what does "the video" refer to? It seems the most
             | likely url associated with the phrase "the video on
             | youtube" is, well, that.
             | So basically it failed at anaphora resolution.
             | Seen another way, you asked it for "the video" and so it
             | gave you _the_ video.
         | dang wrote:
         | I'm surprised that I hadn't recognized what dQw4w9WgXcQ means
         | by now. I wonder how many people do.
           | grzm wrote:
           | I didn't realize the source, but when you posted I was pretty
           | sure I'd seen it elsewhere:
           | https://news.ycombinator.com/threads?id=dQw4w9WgXcQ
             | YeGoblynQueenne wrote:
             | You guys are awful, you know that? Discussing this URL
             | without spoilers... it's because of people like you that
             | that thing has so many views!
             | :P
           | [deleted]
           | tectonic wrote:
           | When it showed up I sort of guessed, but had to try clicking
           | it anyway, then my wife asked why I was laughing.
         | vitus wrote:
         | The video subsequently shows the source submission:
         | https://news.ycombinator.com/item?id=27995270
         | which seems to be the most popular submission with a YouTube
         | URL in the past month.
         | HN search seems to prioritize text matches before the URL
         | matches when I search for "https://www.youtube.com", but the
         | first URL match is for that submission.
         | lucb1e wrote:
         | I suspected what it must be when my browser autocompleted it...
         | this isn't my first time visiting that special place.
         | LeonB wrote:
         | Codex is never gonna let you down.
       | 8eye wrote:
       | openai as a compiler in the browser would be interesting
         | cxr wrote:
         | How about just starting with "a compiler in the browser"? From
         | [1]:
         | > _the web was first built in the 90s to share complicated
         | academic work_
         | People complain a lot about the results of research not being
         | replicable because people withhold their code when they
         | publish, but the fact is that even then it's not guaranteed
         | that anyone will be able to get it to work. Heck, there are
         | plenty of run-of-the-mill software projects (not associated
         | with research) with build processes that aren't replicable
         | without substantial effort in making sure the appropriate
         | toolchain is available and configured for your system. apt-get
         | build-dep is nice and all, but it only goes so far.
         | You'd think that we would have recognized by now that in
         | addition to it being good hygiene to include a project README,
         | a tremendous boon to productivity would result if everyone got
         | on board with also including a document that captured the
         | _exact_ process for transforming source into a binary (or
         | whathaveyou), so you could just drop it into a UVC[2] and get
         | said binary out. Not even mainstream JS programmers (largely
         | writing software that is meant to be interacted with from a web
         | browser!) get this right[3]. Modern JS has managed to grow its
         | own body of implicit knowledge centered around SDKs and setup
         | rituals[4] just like everyone else.
         | 1. http://benschmidt.org/post/2020-01-15/2020-01-15-webgpu/
         | 2.
         | https://scholar.google.com/scholar?hl=en&as_sdt=0%2C44&q=uni...
         | 3. https://www.colbyrussell.com/2019/03/06/how-to-displace-
         | java...
         | 4. https://news.ycombinator.com/item?id=24495646
       | monkeydust wrote:
       | Been playing around with Codex over the weekend as a
       | non-developer. Certainly impressive and also occasionally
       | frustrating when you push it. The natural-language-to-SQL demos
       | are still the best and most consistent.
         | mritchie712 wrote:
         | Any SQL demos you can point me to?
       | astrea wrote:
       | Welp, where will all of us end up when this gets sufficiently
       | complex?
         | [deleted]
         | 37ef_ced3 wrote:
         | Code writers and prose writers will be reduced to operating the
         | AI (checking its output, trying various inputs to elicit the
         | desired language text). At least we won't be completely
         | obsolete like the taxi drivers and Lee Se-dol:
         | The South Korean Go champion Lee Se-dol has retired from
         | professional play, telling Yonhap news agency that his decision
         | was motivated by the ascendancy of AI.            "With the
         | debut of AI in Go games, I've realized that I'm not at the top
         | even if I become the number one through frantic efforts," Lee
         | told Yonhap. "Even if I become the number one, there is an
         | entity that cannot be defeated."
         | To speed your obsolescence, make sure you use Codex in your
         | work, so it can learn you completely. Remember, you won't be
         | able to compete with people who use Codex, so you have to feed
         | the machine, whether you like it or not.
           | bspammer wrote:
           | Competitive chess is still alive and well despite computers
           | being better than humans for decades now.
           | In fact, computers enhance chess by allowing the discovery of
           | interesting lines that a human would never have thought of.
           | Professionals use computer engines to study, and learn from.
           | I'm super excited to play with Codex, for much the same
           | reasons - it will help me do stuff that would be boring to do
           | otherwise.
             | 37ef_ced3 wrote:
             | Sure, chess is a game. The taxi drivers will drive their
             | taxis for fun, too, and you can write code by hand in your
             | free time (just for fun).
           | jacquesm wrote:
           | One more person made redundant by a script. This will happen
           | to a lot of folks in the coming decades.
           | TheCoreh wrote:
           | Retiring from a competitive game because of AI makes very
           | little sense to me. Cars can go much faster than humans, yet
           | we still run for sport.
             | bspammer wrote:
             | For a closer analogy, chess engines have 1000+ elo points
             | on the top grandmasters, and professional chess has never
             | been more popular.
             | 37ef_ced3 wrote:
             | Sure. And instead of writing code ("running") you can
             | operate Codex ("drive the car"). Instead of being a runner,
             | you'll be a driver. And gradually the car will drive
             | itself, and you can sit and watch.
       | tectonic wrote:
       | Here's the entirety of the prompt:
       |     <|endoftext|>/* This code is running inside of a bookmarklet. Each section should set and return _.*/
       |     // The bookmarklet is now executing on example.com.
       |
       |     // Command: The variable called _ will always contain the previous result.
       |     let _ = null;
       |
       |     /* Command: Add a new primary header "[PAGE TITLE]" by adding an HTML DOM node */
       |     (() => {
       |       let newHeader = document.createElement('h1');
       |       newHeader.innerHTML = '[PAGE TITLE]';
       |       document.body.appendChild(newHeader);
       |       _ = newHeader;
       |       return newHeader;
       |     })()
       |     /* Command: Find the first node containing the word 'house' */
       |     (() => {
       |       let xpath = "//*[contains(text(), 'house')]";
       |       let matchingElement = document.evaluate(xpath, document, null, XPathResult.FIRST_ORDERED_NODE_TYPE, null).singleNodeValue;
       |       _ = matchingElement;
       |       return matchingElement;
       |     })()
       |     /* Command: Delete that node */
       |     (() => {
       |       _.parentNode.removeChild(_);
       |       return _;
       |     })()
       |     /* Command: Change the background color to white */
       |     (() => {
       |       document.body.style.backgroundColor = 'white';
       |       _ = document.body;
       |       return document.body;
       |     })()
       |     /* Command: Select the contents of the first pre tag */
       |     (() => {
       |       let node = document.querySelector('pre');
       |       let selection = window.getSelection();
       |       let range = document.createRange();
       |       range.selectNodeContents(node);
       |       selection.removeAllRanges();
       |       selection.addRange(range);
       |       _ = selection;
       |       return selection;
       |     })()
       |
       |     // The bookmarklet is now executing on [PAGE URL]. It is customized for [PAGE TITLE] and knows the correct CSS selectors and DOM layout.
       |     let _ = null;
       |
       |     /* Command: [USER INPUT] */
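The prompt above contains three placeholders: [PAGE URL], [PAGE TITLE], and [USER INPUT]. How the actual tool fills them in is not shown in the thread, so the following is only a guess at the simplest approach. `fillPrompt` and `templateTail` are hypothetical names, the template is abbreviated to its last few lines, and the sketch replaces every occurrence of each placeholder, which may or may not match what happens to the earlier "[PAGE TITLE]" example in the real prompt.

```javascript
// Hypothetical helper: substitute the three placeholders used in the prompt.
function fillPrompt(template, { pageUrl, pageTitle, userInput }) {
  const replacements = {
    '[PAGE URL]': pageUrl,
    '[PAGE TITLE]': pageTitle,
    '[USER INPUT]': userInput,
  };
  let prompt = template;
  for (const [placeholder, value] of Object.entries(replacements)) {
    // split/join replaces all occurrences without needing regex escaping.
    prompt = prompt.split(placeholder).join(value);
  }
  return prompt;
}

// Abbreviated tail of the template; the few-shot examples are omitted here.
const templateTail = [
  '// The bookmarklet is now executing on [PAGE URL]. It is customized for [PAGE TITLE].',
  'let _ = null;',
  '',
  '/* Command: [USER INPUT] */',
].join('\n');

const prompt = fillPrompt(templateTail, {
  pageUrl: 'news.ycombinator.com',
  pageTitle: 'Hacker News',
  userInput: 'make the text of each news item uppercase',
});
console.log(prompt);
```

Because the user's command ends up inside a block comment, a command containing `*/` would close that comment early, so a real implementation would probably need to sanitize the input first.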
       | Waterluvian wrote:
       | This was cute and neat until I connected the dots: natural
       | language means voice APIs for cheap.
         | visarga wrote:
         | Text or voice. For voice you need another model. But I watched
         | the demos and I can't wait for my invite. It's even better than
         | GPT3 because this time there is a direct application of the
         | model.
         | I was surprised about how OpenAI sees it: a model learning code
         | as recipes for solving problems. Code is much more exact than
         | natural language, the mix of both is the main advantage.
         | https://www.youtube.com/watch?v=CvgfxH0UZa4
           | whazor wrote:
           | I think voice would have too high an error rate, as you are
           | compounding the voice recognition error rate with the Codex
           | error rate. However, Codex/GPT-3 could generate intents and
           | that would be quite cool.
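The compounding can be made concrete with a little arithmetic. If the two stages fail independently (an assumption, as are the example numbers), it is the success rates that multiply, so the combined error rate is 1 - (1 - e_voice)(1 - e_codex). `combinedErrorRate` is an illustrative name.

```javascript
// Probability that at least one stage fails, assuming the stages fail
// independently of each other.
function combinedErrorRate(...stageErrorRates) {
  const successRate = stageErrorRates.reduce(
    (product, errorRate) => product * (1 - errorRate),
    1
  );
  return 1 - successRate;
}

// Illustrative numbers only: 10% speech recognition error, 20% Codex error.
const pipelineError = combinedErrorRate(0.1, 0.2);
console.log(pipelineError.toFixed(2)); // "0.28"
```

With these made-up rates the pipeline fails about 28% of the time, worse than either stage alone, which is the concern raised in the comment.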
       | tvirosi wrote:
       | This might totally work and it's kind of impressive if it does.
       | I'm still biased towards ultra skepticism towards all of this
       | since the trustworthiness of all demos like this is completely
       | corrupted at this point due to cherry picking and other deceptive
       | tricks.
         | tectonic wrote:
         | I had to try a few times to get the prompt right, but that's
         | the limit of the cherrypicking. You're correct that it doesn't
         | work nearly as well on more complex, less temporally stable
         | sites like Reddit.
         | csomar wrote:
         | If you got an invite for GPT-3, give it a shot. I discounted it
         | at first, but then I gave it a few shots and was actually
         | creeped out a bit. Even though it is "randomly" making things up
         | as it goes, it does show what seems like intelligence just from
         | the sheer amount of text.
         | One thing I was amazed by: GPT-3 could be a great
         | autocompletion engine for any programming language or
         | configuration schema. Things like Grub configuration files or
         | xkb files could be intuitively completed by GPT-3. And even
         | more: GPT-3 could build basic "concepts" and apply them to that
         | domain knowledge. This seems to emerge naturally rather than
         | being something pre-planned by OpenAI. After all, I don't think
         | OpenAI has planned for GPT-3 to understand xkb keyboard
         | layouts.
         | sxp wrote:
         | The skepticism is warranted for any bleeding edge technology. I
         | wonder if there's another version of the Turing test in which a
         | technology is considered sufficiently advanced when it's
         | indistinguishable from a fake version you've seen in sci-fi.
         | E.g., the Boston Dynamics dancing robot video
         | (https://www.youtube.com/watch?v=fn3KWM1kuAw) still looks fake
         | to me because it's at the level that I would expect to see from
         | Hollywood CGI rather than a real tech demo. If I saw the video
         | anywhere else but on the BD page, I would have enjoyed it and
         | forgotten about it since it's an average CGI video.
           | OnlineGladiator wrote:
           | I genuinely don't understand your position. Are you saying a
           | tech demo is only impressive if it can do things that can't
           | be simulated? What _can't_ be shown via simulation or CGI
           | with enough time and money today? If we're limiting ourselves
           | to video there's no interactive component.
           | Even though that dancing video likely had hundreds of takes,
           | the part that makes it impressive is that it's real. I swear
           | I'm not trying to be disagreeable here - I honestly don't
           | understand your perspective.
             | MrOrelliOReilly wrote:
             | I think what the author is trying to say is that if a
             | technology is sufficiently advanced it seems like it can't
             | be real, meaning it's something only possible with CGI. So
             | we see these dancing robots, think "just more CGI", then
             | are astounded when we find out it's real
               | sxp wrote:
               | Exactly. CGI is just movie magic. And now some real world
               | tech demos are sufficiently advanced to be
               | indistinguishable from CGI/magic.
       | aardvarkr wrote:
       | That's incredible to watch and really does go to show that a
       | picture (or video) is worth a thousand words.
         | andybak wrote:
         | In bed listening to a podcast with my partner so unless I
         | remember this post tomorrow I'll never know.
       | MagicWishMonkey wrote:
       | Any tips on getting this to run as an extension?
         | tectonic wrote:
         | It's not currently open source, but I might release it if I can
         | get it cleaned up.
       | archibaldJ wrote:
       | thanks for the info! great stuff!
       | gpt3's generalization-by-description never ceases to amuse me;
       | but the difficult thing here is to get the right abstraction
       | layers layered nicely in the conceptual lasagna.
       | This is where category theory becomes extremely powerful.
       | It has occurred to me that codex-davinci has an intuitive
       | "understanding" of constructs like monads, or something along
       | that line.
         | tectonic wrote:
         | debuild.co looks cool. Using Codex yet?
         | Y_Y wrote:
         | Can you expand on the utility of categories here? There's a lot
         | of space between knowing what defines a monad, when something
         | might be a monad, what you can do with monadic structure etc.
         | Of course if an AI truly understood monads it would be a
         | bright line marking where the machines have finally surpassed
         | the human mind.
         | Cool.
       | nathan_phoenix wrote:
       | Doesn't this work so well on HN only because HN uses really
       | simple HTML and CSS? What about more complex sites?
         | tectonic wrote:
         | It's much less reliable on sites like Reddit, although it can
         | usually handle "click on the profile link" or "delete all
         | images" and stuff.
       | amrrs wrote:
       | 05:39 https://youtu.be/tNcBQBTeyf4
       | You can see how OpenAI Codex misses some details about HN
       | scraping. What's impressive, as you might notice, is the variable
       | names it chooses, which seem to reflect the HN scraping code
       | found on the internet.
       | Zenst wrote:
       | I've looked at some demos of OpenAI Codex and it's a pretty
       | impressive start for sure. With something like this tied into R,
       | a whole level of data analysis would become far more accessible
       | to those with business knowledge who don't really want to learn
       | the nuances of tools.
       | But I must say, having lived through the '80s fad of
       | code-generating pseudo-4GLs, the code this produces is pretty
       | darn good indeed.
       | Now when something like this can handle a Google coding exam -
       | that's going to be an epic milestone. Though old coding exam
       | questions would equally offer up some great material to put this
       | through its paces.
       (page generated 2021-08-15 23:00 UTC)