[HN Gopher] Replit's in-browser coding AI
       ___________________________________________________________________
        
       Replit's in-browser coding AI
        
       Author : bauerpl
       Score  : 121 points
       Date   : 2022-10-31 15:29 UTC (7 hours ago)
        
 (HTM) web link (replit.com)
 (TXT) w3m dump (replit.com)
        
       | aryamaan wrote:
       | I was on the beta list and used this for golang.
       | 
       | It blew my mind half of the time. It was like it knew what I was
       | going to do.
       | 
       | The other times it was dumber than the standard auto complete. It
       | doesn't have any awareness of already defined variables and
       | doesn't use them to complete half-written variables. Hope this
       | gets better soon.
        
         | _whiteCaps_ wrote:
         | Have you tried Copilot? That seemed to work very well for me.
        
           | aryamaan wrote:
           | Will give it a try. I like Replit's ready-to-use key-value db
           | and also one-click deployment (running is deployment).
           | 
           | Overall, I like my JetBrains IDE, but Replit is coming out with
           | appealing features, especially for side projects. Easy-to-use
           | auth, db, analytics, deployment...
           | 
           | I wish either Replit levels up its IDE game or JetBrains (or
           | the community) builds plugins to match the state-of-the-art,
           | joyful experience of programming.
        
       | make3 wrote:
       | how is it better than copilot?
        
       | luxurytent wrote:
       | Does this differ from GitHub's CoPilot in any user-noticeable
       | way? (outside of the platform it's available on)
        
         | Liquid_Fire wrote:
         | One thing I noticed is it seems to have an "Explain code"
         | feature, giving you a textual explanation of a block of code
         | you select, which I'm not aware of GitHub having.
        
           | pynappo wrote:
           | GitHub Copilot Labs (a VSCode extension meant to be installed
           | with GitHub Copilot) seems to have an explain code feature.
           | 
           | ref: https://github.blog/2022-09-14-8-things-you-didnt-know-
           | you-c...
        
           | poulpy123 wrote:
           | TBH the example wasn't convincing at all
        
           | WithinReason wrote:
           | You can just use the prompt "The above code does the
           | following explained in English" in a comment to prompt
           | CoPilot to explain its code. You could probably also engineer
           | a prompt to translate code between languages.
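           | 
           | A minimal sketch of that trick in Python, assuming a
           | comment-driven completion tool is active in the editor. The
           | dedupe function is a made-up example, not from this thread;
           | only the prompt wording in the trailing comment comes from
           | the suggestion above, and what the model writes next will vary.

```python
def dedupe(items):
    """Return items with duplicates removed, keeping first-seen order."""
    seen = set()
    result = []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


# The above code does the following explained in English:
# (leave the cursor on this line and let the completion continue the comment)
```

           | The prompt is just a comment, so the file still runs as normal
           | Python whether or not a completion is accepted.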
        
       | maep wrote:
       | If this tool was trained on open source code, what license does
       | the generated code have? At least with Codepilot people were able
       | to generate verbatim GPL code with typos and everything. More
       | importantly, I wonder if companies behind these types of tools
       | offer legal or financial protections in case GPL code sneaks in
       | and leads to expensive lawsuits.
        
         | deworms wrote:
         | No they weren't able to generate the same existing code, both
         | because that code is not included anywhere in the model, and
         | because Copilot (not "Codepilot") has safeguards against this
         | kind of situation, should it arise in the highly unlikely
         | situation that a snippet is repeated thousands of times across
         | thousands of repositories.
         | 
         | I've gotta let you know that people copy code snippets from all
         | sorts of codebases with little regard for licenses anyway,
         | because they're toothless in 99% of cases, AI or not. It's a
         | nice illusion that anyone respects licenses, but it's just not
         | true.
        
           | NicoleJO wrote:
           | That's incorrect. CoPilot steals verbatim. Examples:
           | https://justoutsourcing.blogspot.com/2022/03/gpts-
           | plagiarism...
        
             | hnusersarelame wrote:
        
           | maep wrote:
           | I've spent hours looking over code before delivering to
           | FAANG. Our company had put a clause into the contract that
           | our code was free of any GPL'd code. It happened before and
           | it was discovered. The whole thing was a very expensive
           | exercise. I'm aware that many small startups, 90% of which
           | go bust anyway, just ignore licenses, but that doesn't work
           | when you play with the big boys.
        
         | m00x wrote:
         | If you look at licensed code, then write new code, do you also
         | bring in those licenses?
         | 
         | It's been proved in court that AI does not infringe on
         | copyright or licenses since it generates things from an
         | understanding of the whole, instead of directly stealing, just
         | like the human brain does.
        
           | mr_toad wrote:
           | > If you look at licensed code, then write new code, do you
           | also bring in those licenses?
           | 
           | If the "new" code is close enough to be considered a derived
           | work then you will need a license.
        
             | MuffinFlavored wrote:
             | > If the "new" code is close enough to be considered a
             | derived work then you will need a license.
             | 
             | And how is that determined... in court at trial? By an
             | unbiased 3rd party competent enough to understand both
             | codebases?
        
           | mtlynch wrote:
           | > _It's been proved in court that AI does not infringe on
           | copyright or licenses since it generates things from an
           | understanding of the whole, instead of directly stealing,
           | just like the human brain does._
           | 
           | Do you have a source for that?
           | 
           | This SF Conservancy article[0] says that's not true:
           | 
           | > _Consider GitHub's claim that "training ML systems on
           | public data is fair use". We have not found any case of note
           | -- at least in the USA -- that truly contemplates that
           | question._
           | 
           | The first major court case I know about is the class-action
           | case Matthew Butterick is trying to build.[1]
           | 
           | [0] https://sfconservancy.org/blog/2022/feb/03/github-
           | copilot-co...
           | 
           | [1] https://githubcopilotinvestigation.com/
        
           | krono wrote:
           | US Court rulings do not automatically apply worldwide, and
           | not everything it would apply to exists within its
           | jurisdiction.
        
           | Barrin92 wrote:
           | astonishingly enough every sentence in this post is untrue.
           | There's been no court case on any of the models in question
           | here. They don't work like human brains, nor understand
           | anything they output. Even if they did of course that output
           | would still be subject to licenses, given that human code is
           | subject to them, which is why those licenses exist in the
           | first place.
           | 
           | If you ever plan to steal someone's code and justify it with
           | "my brain is able to learn, therefore copyright doesn't
           | exist" I warn you right now this will not fly.
        
         | risyachka wrote:
         | I mean people are also trained on GPL code and I bet you can
         | find a ton of functions copied from GPL projects in a million
         | other projects.
         | 
         | But as long as these are tiny parts of a codebase (which will
         | most probably be the case), I doubt anything can be done with
         | that. No one will go to court because of a few generic
         | functions.
        
       | Heleana wrote:
       | Is there any way that users will be able to tell the difference
       | between this and GitHub's CoPilot?
        
         | bardia95 wrote:
         | Ghostwriter also includes 2 features that Copilot does not
         | have: Transform Code (translate between langs/refactor code),
         | Generate Code (prompt Ghostwriter to write full programs in one
         | shot)
         | 
         | Plus, Ghostwriter is integrated with the Replit platform,
         | meaning you get all the benefits Replit offers as a portable
         | development environment you can take with you anywhere you go
         | and host instantly.
        
           | TOMDM wrote:
           | One could argue Copilot can translate/refactor code.
           | 
           | I can't count the number of times I've copy pasted a chunk,
           | commented it out, then put a comment describing what I'm
           | after. Copilot will get it exactly right ~40% of the time,
           | ~40% of the time it gives me a good starting place, and the
           | other 20% I just scrap.
        
       | VWWHFSfQ wrote:
       | Does anybody know the status of the legal action against that kid
       | that supposedly stole all their good ideas and made a hobby
       | project out of it? I would like to know how that resolved before
       | I consider anything from this company again.
        
         | Jtsummers wrote:
         | https://intuitiveexplanations.com/tech/replit/#how-did-repli...
         | 
         | At the bottom of the original blog post. He put his project
         | back online and it is still up: https://riju.codes/
        
           | bdn_ wrote:
           | Even if this was settled, I consider this quote when thinking
           | about Amjad Masad (Replit CEO).
           | 
           | > When someone shows you who they are, believe them the first
           | time.
           | 
           | - Maya Angelou
        
       | NicoleJO wrote:
       | A couple of questions here asked about the intellectual property
       | rights. Has an answer been provided?
        
       | machinekob wrote:
       | "Any interns on that team who are going to get bullied by
       | Replit's lawyers and CEO in the near future" for developing this?
        
       | sdevonoes wrote:
       | Who's the real target audience of this kind of tool?
       | 
       | - Developers who work at a company (e.g., as an employee) and need
       | to spit out features every sprint? Velocity is important, so I
       | imagine this kind of developer needs to squeeze every minute
       | they are in front of the screen in order to produce working code?
       | 
       | - Developers who think of written code as one way to solve (tech)
       | problems, so they don't really care much about the process of
       | creating code, but mainly about the output (i.e., does the
       | running program solve the issue at hand?)
       | 
       | - Senior developers who don't like to write boilerplate code?
       | 
       | I don't see myself as the target audience of Copilot or
       | Ghostwriter. I do work as an employee, but I'm not a "feature
       | machine". Usually the hardest part about my job is solving
       | problems while communicating with other people. I don't need to
       | write code "fast", and by the time I hit the keyboard to start
       | coding, I don't really need that much help (granted, I'm not
       | working on code that goes into space rockets... just normal
       | e-commerce stuff)
       | 
       | I like to work on side projects and learn new technologies. When
       | I was starting with programming, as part of the learning I liked
       | to write boilerplate code (actually, that's how I learnt
       | programming. I remember writing C boilerplate code by reading
       | "The C Programming Language". Skipping the "boring" parts
       | wouldn't have helped me in my learning).
       | 
       | If anything, Copilot and similar tools take away all the joy of
       | actually writing code (because, when I work on side projects, 50%
       | of the satisfaction comes from actually writing code for the sake
       | of writing code. The other 50% comes from the ability to solve a
       | problem). So, yeah, maybe for people like me who do enjoy the
       | act of writing code for the sake of writing code (you know,
       | like painting or taking photographs), Copilot seems like an
       | unneeded tool?
        
         | minraws wrote:
         | I would like to give some context. I'm still not on board with
         | these tools, but we have a lot of chore-like work: adding very
         | similar things with some changes, complicated or otherwise.
         | 
         | So we have been considering using Codex or something similar to
         | generate the code in a more streamlined way. The key benefit is
         | that we are a small team with each person owning more than one
         | large repository. It's gotten very annoying and our pace is far
         | slower than we would like, so something like this makes quite a
         | lot of sense here.
         | 
         | Though the problem with such specific tools is that they can't
         | generate any customized code for our codebase. We can fine-tune
         | other codegen models, and that's what we plan to do down the
         | line, but this specific tool is just not really useful if it
         | can't be specialized for our codebase.
        
           | sdevonoes wrote:
           | So, does your team then spend a considerable amount of time
           | writing boilerplate/chore code? Isn't that a sign of: "Hey,
           | we actually need to improve our code base guys!". I don't
           | know, if your solution to "I don't want to write chore code"
           | is "let's use Copilot to do the boring stuff"... well, I have
           | bad news for you: "chore code" needs to be maintained and/or
           | fixed, and I don't think Copilot maintains code (for now...
           | :D)
        
             | minraws wrote:
             | yeah we work on linters, tooling and the like: highly
             | specific, single-page, highly similar code.
             | 
             | you can't get around it. We have pretty low boilerplate in
             | all the codebases I happen to manage but the sad part is
             | there is no getting around porting of specific rules,
             | setting up better metric analysis and reporting systems and
             | such.
             | 
             | If you have been involved in programming professionally for
             | a while, you would know you just can't get around the
             | chore-like work sometimes. Ofc it's not a long term goal to keep
             | going this way but we needed a solution to simplify our
             | challenges as we move on.
        
         | isoprophlex wrote:
         | I love writing code, but I don't love searching the docs for
         | the sixtieth time to find the correct combination of brackets,
         | .groupby's and .agg calls that gets the baroque horrorshow that
         | is Python's pandas lib to wrangle some data for me.
         | 
         | See it as a better autocomplete for people who don't want to or
         | can't learn by diligently doing the boring parts.
        
         | dr_kiszonka wrote:
         | For me, not a senior dev, Copilot is useful for: discovering
         | API, generating parts of docstrings, and generating bits of
         | code that don't require too much thinking but go beyond simple
         | copy and paste. It is really quite useful and helps to keep my
         | RSI in check.
         | 
         | My primary UX issue with copilot is that it is trying too hard
         | to be helpful, often suggesting code that I don't need. You
         | also can't trust it with more complex cases but that's actually
         | pretty reassuring : - )
        
       | eliseumds wrote:
       | Is a VSCode extension on the roadmap? Refactoring existing code
       | using AI looks extremely useful. Using GitHub Copilot I have to
       | trigger synthesize multiple times.
        
       | giansegato wrote:
       | replit employee here. the team who built this is _very_ small
       | (less than a dozen, including non-eng roles for the go to
       | market), and went from idea to general availability in 8 weeks
        
         | [deleted]
        
         | eachro wrote:
         | That's very impressive. Hats off to them! I don't think this is
         | too out of the ordinary either though. I'd guess they started
         | off with an LLM from Hugging Face, then set up some pipeline to
         | ingest code from Replit repos to fine-tune the LLM. The ML
         | aspect of this is not terribly hard given that they probably
         | don't need to train an LLM from scratch. Figuring out how to
         | store and serve from Replit repos (or publicly available code
         | bases) is not too difficult. From there it's a matter of
         | productionizing: how to serve the model in real time and
         | figuring out what they want the product to look/feel like, and
         | I suppose this part of it might take a while. I'd estimate you'd
         | need 1-2 ML engineers, 2 data engineers, 2-3 SWEs, 1 PM for the
         | team for a minimum viable product.
        
           | nerdponx wrote:
           | 8 weeks is impressive for something like that, and it goes to
           | show just how powerful our off-the-shelf tools have become.
           | 
           | I think it's also a bit scary, because 8 weeks is very little
           | time for testing, tuning, and validation of something as
           | opaque as a machine learning model. If it worked right the
           | first time, that's great. But there is still a lot of
           | inherent uncertainty in ML projects. Decision makers need to
           | take that uncertainty into account when planning.
           | 
           | That, or, the 8 weeks only covers the final training runs and
           | the implementation/deployment, and doesn't include time spent
           | developing and tuning proof-of-concept prototype models.
        
             | mradek wrote:
             | In 2022 you test live in production lol
        
           | giansegato wrote:
           | yep, true! however, the devil is in the details. from what
           | i've been told, the big challenge was latency: they worked a
           | lot to bring the latency down to acceptable levels -
           | essentially to be usable in a cloud IDE
           | 
           | iirc the team managed to bring it to a level an order of
           | magnitude lower than off-the-shelf models
        
         | caprock wrote:
         | That's really neat to hear. Can you comment on how replit has
         | managed to foster a culture of fast delivery? Are there any
         | interesting trade offs?
        
         | tephra wrote:
         | Curious, how large are teams in Replit usually?
         | 
         | To me (programmer in Sweden) the largest single team I've been
         | on was 14 people and that was _very_ large (indeed the largest
         | in the tech department). We actually broke ourselves up into
         | two more informal groups since we thought that was a more
         | manageable team size.
        
           | dbish wrote:
           | Neat feature but yeah very small doesn't seem like < 12 to me
           | either (worked at big tech for a while). A two pizza team
           | (standard amazon size) is 8-10, 12 starts to be on the larger
           | side for a single team, but not abnormal. Very small to me
           | would be if a team of 2-4 shipped it. Replit must be much
           | larger than I expected for a startup.
        
         | sithlord wrote:
         | any interns on that team who are going to get bullied by
         | Replit's lawyers in the near future?
        
           | cercatrova wrote:
           | Context: https://news.ycombinator.com/item?id=27424195
        
           | googlryas wrote:
           | Interesting, sithlord is an anagram for shitlord. While the
           | behavior of the CEO wasn't cool, the issue seems to have been
           | resolved between all involved parties and everyone has moved
           | on - we don't need to bring it up every time repl.it is
           | mentioned.
        
             | notwhereyouare wrote:
             | I'll admit, every time I hear of repl.it mentioned, I think
             | of the time the CEO threatened the intern. The CEO did a
             | huge disservice to himself and the company that day in my
             | mind
        
               | googlryas wrote:
               | You're allowed to think that. My point is really about
               | littering unrelated posts regarding repl.it with snipes
               | about it.
        
               | still_grokking wrote:
               | Oh, internet drama. I love internet drama! So I looked it
               | up as I never heard this story before.
               | 
               | https://intuitiveexplanations.com/tech/replit/
               | 
               | Looks like this CEO isn't of good character after all. He
               | looks almost like a jerk when looking at the end of the
               | story. Even in his last email he tried to get his
               | (obviously wrong) point across. He never apologized for the
               | things that mattered most, only tried to extinguish the
               | social media fire all in all.
               | 
               | Also he doesn't look very smart, imho:
               | 
               | https://amasad.me/meta
               | 
               | Big LOL here! The abstract things are the simplest, yeah!
               | That's why progress in something like math or theoretical
               | physics is made by the dumbest people, in contrast to
               | something like sociology where you need genius level of
               | intelligence to come up with some new ideas. Sure, sure.
               | 
               | But that's of course not everything this dude got
               | completely backwards.
               | 
               | Would explain why replit is the most useless of all the
               | online IDEs: It has no direction, no true value
               | proposition. It's not a good cloud coding environment. It
               | never was a good code snippet playground (actually one of
               | the worst). Now they even require accounts, so the quick
               | code snippet aspect is also gone. Also they're badly
               | positioned in the education space...
               | 
               | Of course I wish them luck!
               | 
               | But I guess they have no chance against something like
               | Gitpod, Github, or OpenShift codespaces, which are light-
               | years ahead.
               | 
               | OK, maybe the exit-strategy is "just" to be visible
               | enough that at some point they get bought by one of the
               | above. (Which doesn't look like the most ethical thing to
               | do ;-)).
        
             | selykg wrote:
             | This is the type of thing where goodwill is burned and it
             | takes time to earn it back. I don't think we just brush it
             | under a rug either. In my opinion, you don't just get to
             | "resolve it" and then everyone forgets about it. For me,
             | future decisions and importantly, actions, will help me
             | personally move past this and "move on" as you say.
        
               | googlryas wrote:
               | Ok, sounds good about it taking time - assuming perfect
               | behavior, how long will it be before you stop referencing
               | the affair whenever an unrelated repl.it story comes up?
        
               | selykg wrote:
               | Can I ask why you're so defensive about it?
               | 
               | I feel like, if there's ANYTHING we have learned in the
               | past decade or two it's that people who defend a company
               | tend to be doing so for the wrong reasons. See Sony or
               | Microsoft, or Apple or Android, etc. Defending a company
               | is just weird.
               | 
               | I look at replit as a tool, run by people. The tool might
               | be cool, but the CEO made a bad decision and now I judge
               | the product on that CEO's actions. There's no definitive
               | time frame or action that just magically makes it better.
               | 
               | But in general, I'll stop thinking about the stupid
               | actions of the CEO when my brain stops reminding me "Oh,
               | no matter how cool this is, the actions of the CEO were
               | incredibly poor." When will that be? No idea, but maybe
               | sometime down the road he does enough good things that I
               | will suddenly stop and think "cool, looking back, he's
               | done enough good that I can probably forget about the
               | poor decision he made and start looking at this again,
               | because he's proven he isn't that one stupid action."
               | 
               | Goodwill is earned, it's not simply given. It's often
               | hard won, but incredibly easy to lose.
        
               | ChrisKnott wrote:
               | It doesn't seem healthy to care more about this than the
               | people actually involved
        
               | selykg wrote:
               | The CEO's actions are a reflection of the company. I'm
               | not sure I "care more about it" than I am simply aware of
               | their past actions when making decisions on whether to
               | use their product or not.
        
             | mr_cyborg wrote:
             | To be fair, since then I've heard of them ghosting people
             | after final round interviews and meeting with the CEO. It's
             | a pattern at this point.
        
         | swyx wrote:
         | very cool :)
         | 
         | can i get a clarification - when it says "in-browser" i hear
         | "on-device" as in it doesn't call back to replit to get the
         | predictions. i assume that's inaccurate?
         | 
         | for cost/compute purposes i'm wondering how small models have
         | to get in order to run "truly in browser"
        
       | easrng wrote:
       | Is this in-browser as in running the model in the browser or is
       | the model running on the server? (I assume it's on the server for
       | size and people-not-ripping-off-your-product reasons, but
       | actually running in the browser would be cool and it doesn't look
       | like it's specified.)
        
         | ricopags wrote:
         | I am not certain, but my surmise is if in-browser GPT-3
         | inference was possible it would've made the news. Seems likely
         | to be an API call.
        
       | krono wrote:
       | > trained on publicly available code
       | 
       | Fully respecting the licenses this code was published under, one
       | would hope?
        
         | IshKebab wrote:
         | I don't think anyone seriously thinks that is required. The
         | real issue is that these models can reproduce code they've been
         | trained with and then you _do_ need to be aware of the license.
         | That would be fine except as far as I know none of the existing
         | solutions warn you that the code they've produced is the same
         | or very similar to copyrighted code they learnt it from.
         | 
         | That's the main difference from a human learning from
         | copyrighted code (which is totally legal). If they have a good
         | memory they might be able to reproduce copyrighted snippets,
         | but they would usually (probably not always!) know they are
         | doing that.
        
         | googlryas wrote:
         | What do source code licenses say about using source code as
         | training data? IANAL, I would imagine it's only relevant if the
         | model spits out already existing licensed code, and that using
         | the code as training data is largely irrelevant.
         | 
         | For a simpler example than code-generating ML, if I write a
         | program to recognize a directory of source code vs non-source
         | code, and my logic is if (unbalanced parenthesis count in all
         | files > X) { return "not source code"; } else { return "source
         | code"}.
         | 
         | And then I compute X by scanning over the linux kernel source
         | and counting the number of unbalanced parens, have I just
         | committed a GPL violation if I don't GPL my source code
         | recognizer?
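         | 
         | A minimal sketch of that recognizer in Python, assuming plain
         | text files and a threshold X supplied by the caller. The
         | function names are hypothetical, and "unbalanced" is read here
         | as any parenthesis without a partner inside the same file.

```python
from pathlib import Path


def unbalanced_parens(text: str) -> int:
    """Count parentheses in `text` that have no matching partner."""
    open_count = 0  # '(' characters still waiting for a ')'
    unmatched = 0   # ')' characters seen while nothing was open
    for ch in text:
        if ch == "(":
            open_count += 1
        elif ch == ")":
            if open_count:
                open_count -= 1
            else:
                unmatched += 1
    return unmatched + open_count


def directory_score(root: str) -> int:
    """Sum the per-file unbalanced counts over every file under `root`."""
    total = 0
    for path in Path(root).rglob("*"):
        if path.is_file():
            # errors="ignore" skips undecodable bytes instead of raising
            total += unbalanced_parens(path.read_text(errors="ignore"))
    return total


def classify(root: str, threshold: int) -> str:
    """More than `threshold` unbalanced parens means "not source code"."""
    if directory_score(root) > threshold:
        return "not source code"
    return "source code"
```

         | Here X would be whatever directory_score returns for the kernel
         | tree, so the finished program carries one integer derived from
         | the GPL'd source and none of its text.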
        
           | serf wrote:
           | I'm not sure if the training data would constitute an
           | aggregation -- given the usually non-reversible nature of it
           | -- but I found this.
           | 
           | "Where's the line between two separate programs, and one
           | program with two parts? This is a legal question, which
           | ultimately judges will decide. "[0]
           | 
           | [0]: https://www.gnu.org/licenses/gpl-
           | faq.en.html#MereAggregation
        
           | krono wrote:
           | We're talking large scale commercial repurposing of source
           | code with worldwide redistribution here. Not some project you
           | whipped up in 5 minutes to learn from, or automate some minor
           | annoying task.
           | 
           | Unlicensed source code - the default - is still protected by
           | copyright law. That matters if it's hosted and served from a
           | different jurisdiction where no exception exists for training
           | data models.
           | 
           | Then there are also licenses that explicitly prohibit
           | commercial usage to consider.
           | 
           | What it comes down to, as it always does, is that a small
           | group of (practically) untouchable people are making money by
           | abusing and thereby irreparably damaging the trust and good
           | will of the collective.
           | 
           | It's a complex topic eh
        
         | swyx wrote:
         | _crickets_
        
         | [deleted]
        
       | iandanforth wrote:
       | Things I'd like to know about this tool:
       | 
       | - What areas/languages/tasks is it good at and what is it bad at?
       | 
       | - How often is it generating code with bugs?
       | 
       | - How often is the code that gets generated used _as is_ vs
       | immediately edited?
       | 
       | - What are the _new_ frustrations that this causes that existing
       | IDE code completion doesn't?
       | 
       | I work with trained models daily and I know that their failure
       | cases are unintuitive, unexpected, and exasperating. I'd like to
       | know as much about the failure cases of _this_ model as possible
       | before diving in.
        
         | bardia95 wrote:
         | Replit team member here:
         | 
         | - Ghostwriter is especially good at Python and JS, but supports
         | up to 16 languages (to varying degrees of effectiveness). You
         | can read more here:
         | https://docs.replit.com/ghostwriter/faq#which-programming-la...
         | 
         | - As for tasks, it's great for reducing the amount of
         | boilerplate code you need to write (Complete Code), writing
         | React components (Complete + Generate Code), explaining code in
         | plain English (Explain Code), translating code between
         | languages (Transform Code), writing exhaustive tests (Complete
         | Code)
         | 
         | - No stats on how often it's generating code with bugs or how
         | often the generated code gets used as is vs immediately edited;
         | we're interested in getting both to help improve Ghostwriter
         | 
         | - Like any LLM, it can get stuck in a long-tail of repetitive
         | loops; we're working pretty hard to improve and mitigate these
         | issues, but, especially for new users, the repetition and
         | hallucination type problems can be distracting.
        
           | [deleted]
        
       | three_seagrass wrote:
       | Love using Replit but is there any way to try this before buying
       | without having to wait weeks on a waitlist?
        
         | bardia95 wrote:
         | If you sign up for the waitlist today, you should be able to
         | get access within a few days.
        
       ___________________________________________________________________
       (page generated 2022-10-31 23:00 UTC)