[HN Gopher] Copilot Internals
       ___________________________________________________________________
        
       Copilot Internals
        
       Author : jjwiseman
       Score  : 168 points
       Date   : 2022-12-17 22:18 UTC (1 day ago)
        
 (HTM) web link (thakkarparth007.github.io)
 (TXT) w3m dump (thakkarparth007.github.io)
        
       | albertzeyer wrote:
       | This is about the VSCode extension, which is obfuscated (maybe
       | compiled) JS.
       | 
        | Is the plugin for IntelliJ (PyCharm etc.) written in Java?
        | Decompiling it might give some additional insights.
        
         | fortenforge wrote:
          | Their JetBrains plugin is written in Kotlin / Java, but it
          | spins up an agent server written in Node.js which handles
          | the business logic (building the prompt, caching, making
          | completion requests to their API). I assume most of the code
          | is shared between the VSCode extension and this JavaScript
          | agent.
        
       | gavinray wrote:
        | I stumbled upon this repo by accident a few days ago when its
        | source code appeared to contain the only usage of the term
        | "GH_COPILOT_TOKEN" in any repo on GitHub:
       | 
       | https://github.com/search?q=GH_COPILOT_TOKEN&type=code
       | 
       | (My Copilot was broken and this was in the error output I was
       | seeing, see:
       | https://github.com/community/community/discussions/41878)
       | 
       | What I found there was some truly impressive reverse-engineering
       | work by a single individual. I really like the "JOURNAL" daily-
       | diary they kept of progress and random thoughts so you could see
       | the progression day-by-day.
       | 
       | --------
       | 
        | One thing I found interesting: the author says that it queries
        | only the 20 most recently opened files of the same language.
       | 
        | But in an AMA, I asked how much "context" Copilot has
        | available, and one of the devs said it can, for example, read
        | header files that pair with C/C++ files that are open in
        | separate tabs:
       | 
       | https://github.com/orgs/community/discussions/29932#discussi...
        | > "I assume Copilot uses the contents of the current project
        | (IE, all files) as contextual information to offer
        | suggestions. Is this the case?"
        | 
        | > "Yes, copilot looks at what we call "related tabs", for
        | example .c files are often paired with .h files, so copilot
        | considers the contents of other tabs open in the editor to try
        | to feed a more complete prompt to the model."
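The "related tabs" heuristic described above can be sketched roughly as follows. This is a minimal illustration, assuming files are ranked by recency and filtered by language; the function names, the language-to-extension table, and the limit's exact semantics are assumptions, not Copilot's actual code:

```python
from pathlib import Path

# Hypothetical mapping from a language to the file extensions that
# count as "related" (e.g. .c files pair with .h headers).
LANGUAGE_EXTS = {
    "cpp": {".c", ".cc", ".cpp", ".h", ".hpp"},
    "python": {".py"},
}

def related_tabs(open_files, current_language, limit=20):
    """open_files: list of (path, last_access_time) tuples."""
    exts = LANGUAGE_EXTS[current_language]
    same_language = [
        (path, t) for path, t in open_files
        if Path(path).suffix in exts
    ]
    # Most recently opened first, truncated to the reported cap of 20.
    same_language.sort(key=lambda item: item[1], reverse=True)
    return [path for path, _ in same_language[:limit]]

tabs = [("util.h", 3), ("main.c", 5), ("notes.md", 4), ("old.c", 1)]
print(related_tabs(tabs, "cpp"))  # ['main.c', 'util.h', 'old.c']
```

Snippets from the selected files would then be concatenated into the prompt, which is consistent with the dev's description of feeding "a more complete prompt to the model".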
        
       | vbezhenar wrote:
       | My biggest issue with Copilot is that it only wants to add code.
       | 
        | That's useful, but I edit code a lot. If I have 10 similar
        | lines and made one edit, it'd be very convenient for Copilot
        | to suggest editing the following line, or even several lines.
        
         | ftufek wrote:
          | You can look into the Copilot Labs extension in VSCode; it
          | does the editing and a bunch of other stuff too (like
          | explaining what the highlighted code does, etc.). It's not
          | as smooth, but it's getting there.
        
         | TillE wrote:
         | That's interesting, even IntelliCode (generally less capable
         | than Copilot afaik) will do exactly that. I've had it trigger a
         | few times in C# recently, where I make one or two similar
         | edits, and it prompts me to make more.
        
       | melony wrote:
       | Why does "cushman-ml" suggest a 12B model instead of the 175B
       | model?
        
         | varunkmohan wrote:
          | Most likely latency and cost reasons. A model that's 10x as
          | big requires roughly 10x the hardware to serve at the same
          | latency. Since most generations are not too long, a smaller
          | fine-tuned model should work well enough.
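The cost argument can be made concrete with a back-of-envelope calculation. A decoder forward pass costs roughly 2 x parameter-count FLOPs per generated token, so per-token compute scales about linearly with model size; the numbers below are illustrative assumptions, not measurements:

```python
# Rough per-token inference cost for a dense decoder-only model:
# ~2 FLOPs per parameter per generated token.
def flops_per_token(params):
    return 2 * params

small = flops_per_token(12e9)    # a 12B "cushman"-class model
large = flops_per_token(175e9)   # a 175B "davinci"-class model

# To keep the same per-token latency you need roughly proportionally
# more hardware (ignoring memory bandwidth and parallelism overheads).
print(large / small)  # ~14.6x
```

The same ratio applies to serving cost, which is why a smaller fine-tuned model is attractive for an autocomplete workload where latency matters more than peak quality.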
        
         | peaslock wrote:
          | The model with the most similar name in this list is
          | code-cushman-001, which is described as a "Codex model that
          | is a stronger, multilingual version of the Codex (12B)
          | model in the paper".
         | 
         | https://crfm-models.stanford.edu/static/help.html
         | 
          | The next stronger Codex model is called code-davinci-001,
          | which appears to be a fine-tuned version of the GPT-3
          | Davinci model, which is known to have 175B parameters. The
          | model names are alphabetical in order of model size:
         | 
         | https://blog.eleuther.ai/gpt3-model-sizes/
         | 
         | See also A.2 here: https://arxiv.org/pdf/2204.00498.pdf#page=6
        
           | alextheparrot wrote:
           | Code is the base model in more recent iterations [0]
           | 
           | [0] https://beta.openai.com/docs/model-index-for-researchers
        
       | dj_mc_merlin wrote:
       | Huh, was this post revitalized? I remember seeing it (and
       | upvoting it) in /new yesterday, but it didn't reach critical mass
       | for the front page. Seems to be gone now.
        
       | bluelightning2k wrote:
       | Just wanted to say great job on the analysis and explanation!
        
       | modeless wrote:
       | Free idea for GitHub: a huge bit of missing context for the model
       | right now is the last few edits made by the user. If you move
       | your cursor to a different part of a long file, Copilot
       | immediately forgets about that part of the file. If it knew the
       | last few edits you did then it would be able to make much more
        | intelligent suggestions based on the task you're working on,
        | rather than just the current cursor position.
        
         | letitgo12345 wrote:
          | Not sure how easy it would be to make that work. Code edit
          | data is not that prevalent. The best I can think of is
          | looking at GitHub commit changes. That's one place where
          | Repl.it has a big advantage, as it has live editing data
          | from its users.
        
           | modeless wrote:
           | They could start by simply including the code around previous
           | cursor positions as additional context the same way they do
           | with code from other files. Nothing specific to the edits
           | themselves. That alone would help a lot I think. Maybe they
           | already do but I don't think so based on the behavior I see,
           | and this article doesn't mention anything like that.
           | 
           | But Copilot is getting tons of live editing data from its
           | users too, and soon should be able to construct a nice
           | dataset of edits. There's no way they aren't already doing
           | that.
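The suggestion above, keeping the code around previous cursor positions as extra prompt context, can be sketched like this. Everything here is a hypothetical design (class name, region limit, prompt layout), not Copilot's implementation:

```python
from collections import deque

class RecentEditContext:
    """Remember a few recently visited regions of a file and prepend
    them to the completion prompt, the same way snippets from other
    open files are included."""

    def __init__(self, max_regions=3):
        # Oldest region is dropped automatically once full.
        self.regions = deque(maxlen=max_regions)

    def record(self, snippet):
        """Call on each significant cursor move or edit."""
        self.regions.append(snippet)

    def build_prompt(self, current_prefix):
        history = "\n".join(
            f"# recently edited:\n{r}" for r in self.regions
        )
        return f"{history}\n{current_prefix}" if history else current_prefix

ctx = RecentEditContext(max_regions=2)
ctx.record("def parse(line): ...")
ctx.record("def write(row): ...")
print(ctx.build_prompt("def main():"))
```

Because the regions are plain text, this needs no edit-specific training data at all, which is the poster's point: it reuses the existing "code from other files" prompt machinery.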
        
       | peaslock wrote:
        | Amazing if this is only a 12B model. If this already increases
        | coding productivity by up to 50% (depending on the kind of
        | work), imagine what a 1T model will be capable of! I do wonder
        | if some programmers at FAANG already have access to way more
        | powerful coding assistants, and whether they code much at all
        | at this point, or only write high-level code specifications
        | and then fix up the automatically generated code.
        
         | varunkmohan wrote:
          | A 1T model would be capable of much more than the current
          | version of Copilot in terms of autocompletion and even code
          | correction. However, at that point, even with a lot of model
          | parallelism to speed up inference, it's likely to be at
          | least 10x slower on the generation side. From my experience
          | working on Codeium, a Copilot alternative, this would be too
          | frustrating for users. It could be useful as a tool that
          | runs asynchronously and modifies your code at scale.
        
         | tjoff wrote:
         | > _If this already increases coding productivity by up to 50%
         | (depending on kind of work)_
         | 
         | Does anyone believe that?
         | 
         | edit: I'm surprised to see that (so far) 3 replies actually
         | agree with the statement. Is there a video that you'd recommend
         | that shows realistic usage and gain from copilot? Maybe a
         | livestream or something.
        
           | insanitybit wrote:
           | Sure. I'm way more productive with Copilot. I haven't been
           | coding much lately but I could imagine it would double my
           | productivity with regards to the actual "get an
           | implementation of a thing done" bit of the work.
           | 
           | In terms of design, I had a long conversation with ChatGPT
           | the other day about designing a database, including
           | optimizations that could be made given certain requirements
           | and constraints, etc. It was a big productivity boost, like
           | rubber ducking on steroids.
        
             | BonoboIO wrote:
             | Can you give us an example how it helped to design the
             | database?
             | 
              | I could not think of how it would have helped me, but
              | maybe I'm limited in my imagination or don't know how
              | to ask.
        
               | insanitybit wrote:
               | I told it I was designing a database. I told it that my
               | database could tolerate failure levels where more than a
               | quorum of nodes failed at a given time. I then asked it
               | about different algorithms for consensus; RAFT, Paxos,
               | swarm based, etc. It described algorithms for me. I told
               | it that in my database I could guarantee certain things,
               | like that every operation commutes, and I asked how that
               | would let me optimize things - it explained that I could
                | parallelize certain parts of those algorithms.
               | 
               | At one point I told it to name the algorithm we had been
               | discussing something like "OptSwim" and we just kept
               | iterating on the idea.
        
           | Kiro wrote:
            | Absolutely. 50% feels conservative. The thing is that
            | Copilot becomes so ingrained in your workflow that you
            | don't notice it until the internet goes down and you feel
            | completely handicapped. Only then do you realize how much
            | you rely on it.
        
           | BiteCode_dev wrote:
            | On menial tasks, it's way more than 50%. For quick
            | scripting, dirty parsing, PoCs and plumbing, it's about
            | 300% for me.
           | 
           | However, for anything that requires me to think, it's 5% at
           | best.
           | 
            | Don't take the 50% figure too seriously; I think it's
            | just a way of saying "it is such a meaningful boost in
            | productivity".
           | 
           | Which it is, for a lot of tasks, because the vast majority of
           | programming jobs are boring stuff outside of the HN bubble.
           | 
           | It's amazing how much of the world economy runs on csv
           | uploaded to ftp servers.
        
             | karmasimida wrote:
             | It is indeed a cheap script boy for me as well
             | 
             | It does mundane work exceptionally well
        
         | gear54rus wrote:
         | 'fix up generated code' but do you agree that finding a mistake
         | (without even knowing if it's there) might be even harder than
         | writing from scratch?
        
           | jrockway wrote:
            | It's likely that programmers have this skill somewhere.
            | We all make mistakes when typing in code, and many of
            | them do get found. Some of them don't; that's what we
            | call a bug. So AI isn't exactly breaking any ground here.
           | 
           | I played with ChatGPT and asked it interview questions, and I
           | thought it was a pretty interesting exercise to find its
           | mistakes and get it to fix them. Good tool for training
           | interviewers, perhaps.
        
           | PetahNZ wrote:
           | We are doing this all the time anyway during code reviews.
        
         | whazor wrote:
          | In my eyes, the limitation of these models is that they
          | only fit a limited amount of context: not the complete API
          | of your code base, nor the latest versions of the libraries
          | you are using. I also don't believe a bigger model would
          | resolve these limitations.
         | 
         | However, I do believe there could be a meta model that can
         | query code and libraries.
        
           | IshKebab wrote:
           | Presumably if you had access to them you could fine tune them
           | on your codebase.
        
             | peaslock wrote:
             | Yeah, continuous online learning by fine-tuning seems like
             | an obvious way of making these models recall information
             | from outside the perceptible context. One could also prompt
             | the model to (recursively) summarize code and prepend this
             | summary to each prompt, and/or enable the model to
             | interactively query function definitions or code summaries
             | before outputting a final answer (trained by RLHF). But any
             | such tricks might also quickly be outcompeted by an even
             | more general model, e.g. one that directly controls the GUI
             | and can communicate with coworkers...
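The "recursively summarize and prepend" idea above can be sketched as a toy pipeline: summarize each file, compress the per-file summaries into one overview, and put that overview in front of every prompt so the model sees the whole codebase at low resolution. `summarize()` here is a stand-in for a real model call; all names and limits are assumptions:

```python
def summarize(text, max_len=40):
    # Stand-in for an LLM summarization call: just truncate.
    return text[:max_len]

def codebase_summary(files):
    """files: dict mapping filename to file contents."""
    per_file = [f"{name}: {summarize(body)}" for name, body in files.items()]
    # Recurse one level: compress the per-file summaries into a single
    # top-level overview that fits in the prompt budget.
    return summarize("\n".join(per_file), max_len=200)

def build_prompt(files, local_context):
    return (f"# codebase overview:\n{codebase_summary(files)}"
            f"\n\n{local_context}")

files = {"db.py": "class Store: ...", "api.py": "def handler(req): ..."}
print(build_prompt(files, "def new_endpoint():"))
```

A production version would cache summaries per file and re-summarize only what changed; the interactive "query a function definition before answering" variant the poster mentions would replace the static overview with tool calls.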
        
         | karmasimida wrote:
         | Microsoft is FAANG level and beyond.
        
       ___________________________________________________________________
       (page generated 2022-12-18 23:00 UTC)