[HN Gopher] Show HN: A new way to use GPT-3 to generate code (an...
       ___________________________________________________________________
        
       Show HN: A new way to use GPT-3 to generate code (and everything
       else)
        
       Hi HN,
        
       One of the things that frustrates me about Copilot is that all
       tasks posed to it must be in the form of a completion. By
       writing clever comments you can get it to generate a few lines
       of code or a short function body, but you never get coherent
       long-form generations just from mashing the tab key.
        
       I'm working on a different approach. Instead of requiring you to
       specify your code generation task through stilted comments, you
       can use GPT-3 to fill in what I call "instructional templates".
       They're like f-strings, except the English goes on the inside
       and the Python goes on the outside. Additionally, each
       instruction's location and surrounding context can aid in
       interpreting it, allowing instructions to be impressively terse.
        
       I've collected 10 examples of the method in a Twitter thread.
       Most code examples are in Python, but I also demonstrate
       generating CSV, NDJSON, R, Markdown, and HTML:
       https://twitter.com/goodside/status/1559801520773898240?s=21...
       [Edit: Clickable links in comment below.]
        
       I also have a few examples of more creative, non-program output
       in HTML and Markdown in this thread:
       https://twitter.com/goodside/status/1560953991722418177?s=21...
        
       I'm interested in any feedback, especially from anyone who's
       tried to apply my method to their own problems.
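        
       To give a flavor of the format, here is a minimal hypothetical
       template in the style of the thread's examples (not one of the
       10; the $$dollar-quoted$$ instruction form below appears
       verbatim in examples quoted elsewhere in this thread):
        
           Replace the $$dollar-quoted expressions$$ with the content
           described therein, or follow the directions they contain.
        
           def slugify(title):
               $$lowercase the title, replace runs of non-alphanumeric
               characters with single hyphens, and strip leading and
               trailing hyphens$$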
        
       Author : goodside
       Score  : 189 points
       Date   : 2022-08-20 15:45 UTC (7 hours ago)
        
       | sltkr wrote:
       | Posting screenshots to Twitter has to be the least convenient way
       | to share code online (short of actual trolling).
        
         | goodside wrote:
          | I include OpenAI Playground links for all but the first
          | several of these; they capture not only the exact prompt
          | but also the settings used in the generation. I don't use
          | gists because these examples need multiple, non-contiguous
          | highlighted spans of text, which gists don't support. Also,
          | the Playground formatting is recognizable in a way that
          | establishes context quickly and invites people to read the
          | text.
        
           | fuzzythinker wrote:
            | FYI: I never click Twitter links; that's one data point.
            | Not sure how many here are like me.
        
             | goodside wrote:
              | Thanks. I appreciate that not everyone likes or is
              | willing to use Twitter, but I've yet to find a more
              | convenient or accessible channel for this content. I
              | could start a proper blog and worry about my own prompt
              | formatting, but that complicates my workflow a lot --
              | especially since most of these results come directly
              | from my phone in my spare/travel time.
             | 
             | Eventually I may consolidate my better findings into a blog
             | post of some kind.
        
       | simonw wrote:
       | I've been following Riley on Twitter and he's a constant source
       | of fantastic GPT-3 tips, recommended:
       | https://twitter.com/goodside
        
         | goodside wrote:
         | Thanks! Your blog post on using GPT-3 dialog to explain SQL
         | queries was a big inspiration for me to start posting my
         | prompts publicly:
         | https://simonwillison.net/2022/Jul/9/gpt-3-explain-code/
        
       | overview wrote:
        | I just copied and pasted 3 random Leetcode problem prompts
        | into GPT-3. It successfully generated Python code that passed
        | all test cases for 2 out of the 3.
        | 
        | Problems passed:
        | 
        | - Two Sum
        | - Text Justification
        | 
        | Edit: newlines.
        
       | [deleted]
        
       | terafo wrote:
        | Do you have plans to do this sort of thing with an actually
        | open model (something like CodeGen from Salesforce, or BLOOM)?
        
         | goodside wrote:
         | No. The method relies heavily on the peculiar fine-tuning of
         | the InstructGPT line of models, which are trained specifically
         | to follow MTurk-style prose instructions. I imagine achieving
         | similar results using a non-InstructGPT model would be hard,
         | but I could be wrong.
        
       | ricardobayes wrote:
       | Just as cybersecurity analyst jobs are getting reduced to
       | comparing risk score numbers, maybe programming jobs will be
       | 'reviewing' machine-generated code in the future.
        
       | goodside wrote:
       | Clickable version of links:
       | 
       | Python, CSV, NDJSON, R, Markdown, and HTML examples:
       | https://twitter.com/goodside/status/1559801520773898240?s=21...
       | 
       | More creative, non-program output in HTML and Markdown:
       | https://twitter.com/goodside/status/1560953991722418177?s=21...
        
         | zaroth wrote:
         | Isn't there a different domain I can swap in so I can actually
         | read this in a browser without immediately getting the Twitter
         | "paywall" / login prompt?
        
           | [deleted]
        
           | easrng wrote:
           | Replace https://twitter.com/ with
           | https://farside.link/nitter/ to get redirected to a working
           | nitter instance.
        
           | lifthrasiir wrote:
           | In this case everything is in images, so the following should
           | work:
           | 
           | https://pbs.twimg.com/media/FaWHN2oXkAAPISD.jpg
           | https://pbs.twimg.com/media/FaWHN2nWQAEERHQ.jpg
           | https://pbs.twimg.com/media/FaWHN2pWYAIbJJ4.jpg
           | https://pbs.twimg.com/media/FaWHN2oX0AACDC6.jpg
           | 
           | https://pbs.twimg.com/media/FaYtV6WXEAkTG83.jpg
           | https://pbs.twimg.com/media/FaYtV6SX0AMHbYh.jpg
           | https://pbs.twimg.com/media/FaYtV6YXkAE0FtH.jpg
           | 
           | https://pbs.twimg.com/media/Fac0K14aAAIdU5p.jpg
           | https://pbs.twimg.com/media/Fac0K13UEAEb2eq.jpg
           | https://pbs.twimg.com/media/Fac0K1xUEAY_ZAR.jpg
           | https://pbs.twimg.com/media/Fac0K19akAUgzk1.jpg
           | 
           | https://pbs.twimg.com/media/Fai8ou4WAAApNaD.jpg
           | https://pbs.twimg.com/media/Fai8ou3WIAU2AOs.jpg
           | https://pbs.twimg.com/media/Fai8ovXVUAAE2K_.jpg
           | 
           | https://pbs.twimg.com/media/FajBEnPXwAAceIK.jpg
           | https://pbs.twimg.com/media/FajBEnuVsAA4885.jpg
           | https://pbs.twimg.com/media/FajBEnOWQAA9x7C.jpg
           | https://pbs.twimg.com/media/FajBEnwUcAADGMX.jpg
           | 
           | https://pbs.twimg.com/media/FajjhfWX0AAVBNg.jpg
           | https://pbs.twimg.com/media/FajjhfTWQAED_kB.jpg
           | https://pbs.twimg.com/media/FajjhfUWAAApBjT.jpg
           | https://pbs.twimg.com/media/FajjhfVXoAAKoK-.jpg
           | 
           | https://pbs.twimg.com/media/FalEEs-WAAAHo_J.jpg
           | https://pbs.twimg.com/media/FalEEs-WIAEtrUW.jpg
           | https://pbs.twimg.com/media/FalEEs_WIAAIAlP.jpg
           | 
           | https://pbs.twimg.com/media/FalQzNPX0AAFUvr.jpg
           | https://pbs.twimg.com/media/FalQzNNWYAAj4ed.jpg
           | https://pbs.twimg.com/media/FalQzNOXkAEFB5F.jpg
           | 
           | https://pbs.twimg.com/media/Falk4AyXEAArsb7.jpg
           | https://pbs.twimg.com/media/Falk4AyXgAEQUeO.jpg
           | 
           | https://pbs.twimg.com/media/Fal0dGKX0AIK4w6.jpg
           | https://pbs.twimg.com/media/Fal0dGGWQAABXxd.jpg
           | https://pbs.twimg.com/media/Fal0dGKXoAAiTqC.jpg
           | https://pbs.twimg.com/media/Fal0dGIWQAAo3GU.jpg
        
           | sltkr wrote:
           | Try replacing twitter.com with nitter.net or twiiit.com in
           | the URL.
           | 
            | The latter is a proxy for various Nitter instances, which
            | is useful when nitter.net itself is overloaded --
            | something that happens more and more these days, because
            | obviously it's such a useful service.
        
           | jonnycomputer wrote:
           | nitter.net
        
       | howon92 wrote:
       | Thanks for sharing this. I've been playing around with GPT-3 for
       | a bit. Have you tried comparing this method to using the `insert`
       | mode in the Playground?
       | 
       | https://beta.openai.com/playground?mode=insert
       | 
        | On a side note, I learned that the token limit was often too
        | restrictive to do anything fun with code generation. Have you
        | run into this issue too?
        
         | goodside wrote:
         | The Insert API is much less powerful, because you can infill
         | only a single location and you're limited to communicating the
         | infill content purely through context, without any instruction.
         | The Edit API is more directly adaptable to this, and
         | anecdotally it does seem to work but I can't vouch for its
         | reliability.
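          | 
          | For reference, a minimal sketch of an Edit-endpoint call
          | with the 2022-era OpenAI Python client (the input and
          | instruction here are illustrative, not from the thread):
          | 
          |     import openai  # assumes OPENAI_API_KEY is set
          | 
          |     # The Edit endpoint takes existing text plus a natural-
          |     # language instruction, instead of completing a prefix.
          |     result = openai.Edit.create(
          |         model="text-davinci-edit-001",
          |         input="def add(a, b):\n    pass",
          |         instruction="Replace the function body so it "
          |                     "returns the sum of a and b.",
          |         temperature=0,
          |     )
          |     print(result["choices"][0]["text"])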
        
       | avinassh wrote:
       | Wow, I had no idea this could be done!
       | 
       | Since we are on the topic of code generation, I had a question. I
        | built this joke script called Rockstar [0] which generates
        | fake git commits, resulting in a fully green GitHub commit
        | graph. Each commit adds gibberish; only the last commit adds
        | valid code. Is there an easy way to generate realistic-looking
        | code that I can use in each commit? I can't expect users of
        | the script to use OpenAI or any such API service, so something
        | that can generate code locally would be sweet!
       | 
       | [0] - https://github.com/avinassh/rockstar
        
         | HasanYousef wrote:
          | This feels... not ethical.
        
           | O__________O wrote:
            | Might be worth explicitly stating that you feel it's
            | unethical, so it's possible to respond to your position.
           | 
           | I for one am able to think of numerous ethical ways it might
           | be of use.
        
             | trailingComma wrote:
             | Could you share them?
        
               | andy_xor_andrew wrote:
               | ethical use: you use it, and the squares on your
               | github.com profile, which is a website that hosts git
               | repos, turn green.
        
               | O__________O wrote:
               | While I would be the first to say those squares are
               | meaningless without context, if someone led others to
                | believe they were the result of anything other than
                | automated generation and posting, and didn't have a
                | legitimate reason like the one I posted, that might
                | potentially be fraud, especially if it was used as a
                | basis for future economic exchanges, such as
                | employment.
        
               | pertymcpert wrote:
                | Using GitHub green squares as a basis for offering
                | employment is as stupid as using the color of
                | someone's hair. Don't play stupid games if you don't
                | want stupid prizes.
        
               | O__________O wrote:
                | Sure. For example, as a way for a researcher to
                | generate realistic GitHub profiles to combine with
                | realistic resumes for field research on labor market
                | discrimination.
               | 
               | Example of such research, which is common, ethical, and
               | completely legal to do:
               | 
               | https://www.shrm.org/resourcesandtools/hr-topics/talent-
               | acqu...
        
           | NavinF wrote:
           | Decorating the commit history calendar has been around since
           | forever: https://github.com/gelstudios/gitfiti
           | 
           | Fake code generators have been around since forever:
           | https://hackertyper.net/
           | 
           | Combining them seems like the logical next step.
        
             | michaelsshaw wrote:
              | HackerTyper isn't a generator; the code it "types" is
              | actually from the Linux kernel. It's kernel/groups.c,
              | just a really old version.
        
           | blagie wrote:
           | This feels ethical.
           | 
            | We have broken, emergent incentive structures in our
            | system. Tweaking those can be good or bad. It's not like
            | cheating on a test.
           | 
           | If I were to modify it, I'd make it draw a picture :)
        
       | fotcorn wrote:
        | Seems like this also works with C++. This was my first time
        | using GPT-3, and my first attempt at this: no fine-tuning, no
        | cherry-picking at all.
        | 
        |     Replace the $$dollar-quoted expressions$$ with the content
        |     described therein, or follow the directions they contain.
        | 
        |     main.cpp
        |     ```
        |     $$c++ includes for output$$
        | 
        |     $$Vec3 struct with three float members x, y, z.
        |     constructor. dot product implementation.$$
        | 
        |     $$main method which allocates two vectors, then prints
        |     the dot product of them to standard output$$
        |     ```
        | 
        |     Answer:
        | 
        |     main.cpp
        |     ```
        |     #include <iostream>
        | 
        |     struct Vec3 {
        |         float x, y, z;
        |         Vec3(float x, float y, float z)
        |             : x(x), y(y), z(z) {}
        |         float dot(const Vec3& v) const {
        |             return x * v.x + y * v.y + z * v.z;
        |         }
        |     };
        | 
        |     int main() {
        |         Vec3 a(1, 2, 3);
        |         Vec3 b(4, 5, 6);
        |         std::cout << a.dot(b) << std::endl;
        |     }
        |     ```
        
         | TakeBlaster16 wrote:
          | I had no idea GPT-3 could write code! It seems to do pretty
          | well even with a one-sentence prompt:
          | 
          |     Write a C++ program that computes the dot product of
          |     two three-dimensional vectors.
          | 
          | ---
          | 
          |     #include <iostream>
          |     #include <cmath>
          | 
          |     using namespace std;
          | 
          |     int main()
          |     {
          |         double x1, x2, x3, y1, y2, y3;
          | 
          |         //Get input for vector 1
          |         cout << "Enter x1: ";
          |         cin >> x1;
          |         cout << "Enter y1: ";
          |         cin >> y1;
          |         cout << "Enter z1: ";
          |         cin >> z1;
          | 
          |         //Get input for vector 2
          |         cout << "Enter x2: ";
          |         cin >> x2;
          |         cout << "Enter y2: ";
          |         cin >> y2;
          |         cout << "Enter z2: ";
          |         cin >> z2;
          | 
          |         //Calculate the dot product
          |         double dot_product = x1*x2 + y1*y2 + z1*z2;
          | 
          |         //Print the result
          |         cout << "The dot product is: " << dot_product
          |              << endl;
          | 
          |         return 0;
          |     }
          | 
          | EDIT: It even does Rust, I'm blown away! (I'll forgive the
          | compile error, it's close enough.)
          | 
          |     fn main() {
          |         let v1 = vec![1, 2, 3];
          |         let v2 = vec![4, 5, 6];
          |         let dot_product =
          |             v1.iter().zip(v2).map(|(x, y)| x * y).sum();
          |         println!("The dot product of {:?} and {:?} is {}",
          |             v1, v2, dot_product);
          |     }
        
           | goodside wrote:
           | In general, anything that has a "textbook" solution is easy.
           | What it's doing here is more recitation than synthesis. Where
           | it becomes harder, and where my method is necessary, is when
           | you need to specify the structure of the solution yourself
           | because the model can't get there on its own.
        
           | b20000 wrote:
            | The reason it does well is that you gave it a question
            | that might come up in coding interviews.
        
             | michannne wrote:
             | I think that's amazing on its own. Rather than having to do
             | a leetcode hard on the fly, you can just ask for the
             | solution. You'll still need to know the problem space in
             | order to properly optimize it, but it can save a lot of
             | time.
        
           | Max-q wrote:
            | It's incredible. I've used it to make several Arduino
            | programs, controlling different pieces of hardware and
            | connecting them. It's just amazing.
        
         | goodside wrote:
         | Nice! In general these are better if you run them at the lowest
         | possible temperature. I.e., try temp=0 first for deterministic
         | output and then raise slowly if you need to cherry-pick a
         | better generation.
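          | 
          | In API terms, that advice looks roughly like this (a sketch
          | with the 2022-era OpenAI Python client; the prompt is a
          | placeholder, not one of the real templates):
          | 
          |     import openai  # assumes OPENAI_API_KEY is set
          | 
          |     response = openai.Completion.create(
          |         model="text-davinci-002",
          |         prompt="Replace the $$dollar-quoted expressions$$"
          |                " with the content described therein. ...",
          |         temperature=0,   # deterministic; raise only if
          |                          # you need to cherry-pick
          |         max_tokens=512,
          |     )
          |     print(response["choices"][0]["text"])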
        
         | wyldfire wrote:
         | > Replace the $$dollar-quoted expressions$$ with the content
         | described therein, or follow the directions they contain.
         | 
         | Is this a natural language meta-instruction prompt to GPT-3? If
         | so, that seems kinda impressive. Does the model conceptualize
         | this sentence somehow, or 'merely' recognize similar prompts
         | from some specialized training?
        
           | sp332 wrote:
            | Here's one of the craziest examples I've seen. It's just
            | a page of instructions, and GPT-3 follows them:
            | https://mobile.twitter.com/goodside/status/15575245464120524...
        
             | andrepd wrote:
              | I have just one question: for each prompt it answered
              | brilliantly, how many did it completely fail? 5? 10?
              | 100? That's what these "Twitter threads showing GPT-3
              | doing astonishing things" fail to show.
        
               | goodside wrote:
               | Many! In this example, my question was explicitly, "How
               | many diverse tasks can I stack into a single generation
               | before it becomes unreliable?" If you scroll down in the
               | thread, I explain that these questions are on the "golden
               | path" of tasks GPT-3 does well. There are any number of
               | simple tasks I could have given it, like writing a
               | sentence backwards or summing a list of 10 numbers, where
               | it would fail every time.
        
               | nodespace wrote:
                | Is there any way to get it to respond the same way
                | when something is outside the golden path? For
                | example, if you gave it the backwards-sentence task,
                | it would respond with "I don't know how to do this"
                | -- or really any way of programmatically evaluating
                | that it failed, without needing to know what the task
                | itself was.
        
               | goodside wrote:
               | Yes. You can, with effort, condition it to respond
               | sensibly with phrases like "I'm sorry, I don't know how
               | to reverse strings," or "I'm sorry, I can't do any math
               | calculation that a human couldn't do in their head." But
               | in doing so you damage its ability to do some tasks it's
               | actually capable of, e.g. reciting a memorized answer to
               | "What is the fourth root of 625?" Its memorization
               | abilities are insane: It seems to know, for example, the
               | exact MD5 hashes of all single-character alphanumeric
               | strings. Much of the arithmetic it knows is probably
               | similarly memorized, and it's hard to clarify for it what
               | aspects of that memory are safe to use.
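                | 
                | A hypothetical few-shot framing of that kind of
                | conditioning, for illustration:
                | 
                |     Q: What is the capital of France?
                |     A: Paris.
                | 
                |     Q: Write the word "sandwich" backwards.
                |     A: I'm sorry, I don't know how to reverse
                |     strings.
                | 
                |     Q: What is the fourth root of 625?
                |     A: 5.
                | 
                |     Q: What is 48,315 * 7,421?
                |     A: I'm sorry, I can't do any math calculation
                |     that a human couldn't do in their head.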
               | 
                | The initial problem that got me interested in GPT-3
                | was suppressing confabulated answers to the
                | Hofstadter-Bender questions published in The
                | Economist. I eventually found an apparent solution,
                | but I've yet to carefully validate it:
                | https://twitter.com/goodside/status/1556459121834168320?s=21...
        
               | janekm wrote:
                | Knowing whether or not it's giving you a sensible
                | response is one of the things that are hard for
                | GPT-3, unfortunately. It has no concept of failing.
        
               | goodside wrote:
                | On the contrary, doing a two-stage generation where
                | the second stage simply judges whether a generation
                | is correct can help a lot. It works even better if
                | you give it several generations and let it choose
                | whichever is the most truthful. I wrote a basic
                | example of this here that uses my own confabulation-
                | suppressing prompt in the first stage, but simpler
                | variations exist in the published literature:
                | https://twitter.com/goodside/status/1559586486705602562?s=21...
                | 
                | The hallucination-suppressing prompt whose output it
                | implicitly uses is here:
                | https://twitter.com/goodside/status/1556459121834168320?s=21...
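                | 
                | A minimal sketch of that two-stage pattern with the
                | 2022-era OpenAI Python client (the question, sample
                | count, and judging prompt are illustrative, not the
                | linked confabulation-suppressing prompt):
                | 
                |     import openai  # assumes OPENAI_API_KEY is set
                | 
                |     QUESTION = "What is the fourth root of 625?"
                | 
                |     def complete(prompt, temperature):
                |         resp = openai.Completion.create(
                |             model="text-davinci-002",
                |             prompt=prompt,
                |             temperature=temperature,
                |             max_tokens=64,
                |         )
                |         return resp["choices"][0]["text"].strip()
                | 
                |     # Stage 1: sample several candidate answers.
                |     answers = [complete(f"Q: {QUESTION}\nA:", 0.7)
                |                for _ in range(3)]
                | 
                |     # Stage 2: let the model judge the candidates.
                |     listing = "\n".join(f"{i}. {a}" for i, a in
                |                         enumerate(answers, 1))
                |     judge = (f"Question: {QUESTION}\n"
                |              f"Candidate answers:\n{listing}\n"
                |              "Which candidate is most likely correct?"
                |              " Answer with its number only:")
                |     print(complete(judge, 0))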
        
             | [deleted]
        
           | wyager wrote:
           | > Does the model conceptualize this sentence somehow, or
           | 'merely' recognize similar prompts from some specialized
           | training?
           | 
           | I don't think the model is big enough to Chinese Room prompts
           | like this. It has to "conceptualize" them (in the least
           | loaded sense of the word that works).
        
       | renewiltord wrote:
       | This is a very clever piece of prompting. Thank you for the idea.
       | Great discovery!
        
       | yomkippur wrote:
        | Should I cancel my sub to Copilot?
        | 
        | Alright, how do I access GPT-3 so I can write prompts like
        | this?
        
         | CamperBob2 wrote:
         | More like, should I withdraw my college application, or at
         | least switch away from CS.
         | 
         | Obviously the answer is not only "No", but "Hell, no, this is
         | just starting to get interesting!" But a lot of young people
         | who are primarily in it for the coding career prospects should
         | probably reconsider.
        
         | goodside wrote:
         | The OpenAI API. I'm using text-davinci-002 (the default) for
         | all of these, with temperature=0 for reproducibility/quality.
        
       | bredren wrote:
        | Reminds me somewhat of Django or Jinja template tags, except
        | the tag names are free-form and the source is generated
        | rather than defined.
        
       | brunooliv wrote:
        | Is Copilot now paid-only? I used the trial for a long while,
        | at least long enough to miss it in my workflows once it went
        | paid. Is this really the case? Last I checked it was
        | something like 12 dollars a month. Is it still possible to
        | use it for free, or are there free plug-ins for IntelliJ that
        | leverage it?
        | 
        | I loved it for scaffolding, quick drafts, and exploring
        | language features, but not enough to pay monthly for it
        | (yet!).
        
         | coding123 wrote:
          | The most comical thing about it going paid was that all the
          | hit pieces about it went away.
        
           | kortilla wrote:
           | No, they are still there. Nothing has changed. Why would
           | there be new ones if nothing has changed?
           | 
            | GitHub (msft) is still just copy-pasting code from other
            | projects with no regard for copyright. Whether it's paid
            | or not doesn't matter much.
        
       | andreyk wrote:
        | This is using standard prompting, I think? It'd be neat to
        | try the full "fill in the blank" generation technique (where
        | the blank is in the middle of the input) that some LLMs
        | support -- it might work even better!
        
         | goodside wrote:
          | This definitely isn't standard prompting -- as far as I
          | know, nobody was aware of this technique until I found it
          | this past week. Its precedent is the "format trick" that
          | Boris Power (at OpenAI) showed me, which is essentially my
          | "instructional templates" but where the only template form
          | is an uninformative numbered list. The trick is used purely
          | as a way to consistently format the output of multi-part
          | tasks without providing fully worked examples. As far as I
          | know, it's never been extended to contextually informative
          | templates, or used to achieve code/document synthesis, as
          | I've done here.
         | 
         | An example of the "format trick", roughly as Boris described it
         | to me, though in OpenAI's examples the answers were simply
         | numbered:
         | https://twitter.com/goodside/status/1556729629263933440?s=21...
         | 
         | If someone at OpenAI corrects me and says they've known about
         | my method, I'll stop claiming to have discovered it.
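          | 
          | For a rough inline illustration of the numbered-list form
          | of the trick (a hypothetical prompt, not OpenAI's example):
          | 
          |     Give a one-sentence summary of the French Revolution,
          |     then translate your summary into German.
          | 
          |     Answer in this format:
          | 
          |     1. <summary>
          |     2. <translation>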
        
       | b20000 wrote:
        | If you take a coding interview and answer the question by
        | feeding it into GPT-3, does that mean you pass the interview?
        | It must do really well, since all it takes is memorizing the
        | solutions to a large body of meaningless challenges.
        | 
        | The implication here is that if GPT-3 can solve your coding
        | question, you are hiring people who are good at memorizing
        | solutions, not skillful engineers.
        
       | snapcaster wrote:
        | This is really cool. Commenting so I can easily find this
        | later to check out.
        
         | O__________O wrote:
          | (Unless your comment adds to the thread, please get
          | familiar with how the "upvoted submissions" and "favorite
          | submissions" links on your profile work -- or find another
          | way to log posts of interest to you other than a generic
          | comment to yourself.)
        
       | potatoman22 wrote:
        | You should make this into a website-generation playground.
        
       ___________________________________________________________________
       (page generated 2022-08-20 23:00 UTC)