[HN Gopher] Replit's in-browser coding AI
___________________________________________________________________
Author : bauerpl
Score  : 121 points
Date   : 2022-10-31 15:29 UTC (7 hours ago)

(HTM) web link (replit.com)
(TXT) w3m dump (replit.com)

| aryamaan wrote:
| I was on the beta list and used this for golang.
|
| It blew my mind half of the time. It was like it knew what I was going to do.
|
| The other times it was dumber than the standard autocomplete. It doesn't have any awareness of already defined variables and doesn't use them to complete half-written variables. Hope this gets better soon.
| _whiteCaps_ wrote:
| Have you tried Copilot? That seemed to work very well for me.
| aryamaan wrote:
| Will give it a try. I like Replit's ready-to-use key-value db and also one-click deployment (running is deployment).
|
| Overall, I like my JetBrains IDE, but Replit is coming out with appealing features, especially for side projects. Easy-to-use auth, db, analytics, deployment...
|
| I wish either Replit would level up its IDE game or JetBrains (or the community) would build plugins to match the state-of-the-art/joyful experience of programming.
| make3 wrote:
| how is it better than copilot?
| luxurytent wrote:
| Does this differ from GitHub's Copilot in any user-noticeable way? (outside of the platform it's available on)
| Liquid_Fire wrote:
| One thing I noticed is it seems to have an "Explain code" feature, giving you a textual explanation of a block of code you select, which I'm not aware of GitHub having.
| pynappo wrote:
| GitHub Copilot Labs (a VSCode extension meant to be installed with GitHub Copilot) seems to have an explain code feature.
|
| ref: https://github.blog/2022-09-14-8-things-you-didnt-know-you-c...
| poulpy123 wrote:
| TBH the example wasn't convincing at all
| WithinReason wrote:
| You can just use the prompt "The above code does the following explained in English" in a comment to prompt Copilot to explain its code.
| You could probably also engineer a prompt to translate code between languages.
| maep wrote:
| If this tool was trained on open source code, what license does the generated code have? At least with Codepilot people were able to generate verbatim GPL code with typos and everything. More importantly, I wonder if companies behind these types of tools offer legal or financial protections in case GPL code sneaks in and leads to expensive lawsuits.
| deworms wrote:
| No, they weren't able to generate the same existing code, both because that code is not included anywhere in the model, and because Copilot (not "Codepilot") has safeguards against this kind of situation, should it arise in the highly unlikely situation that a snippet is repeated thousands of times across thousands of repositories.
|
| I've gotta let you know that people copy code snippets from all sorts of codebases with little regard for licenses anyway, because they're toothless in 99% of cases, AI or not. It's a nice illusion that anyone respects licenses, but it's just not true.
| NicoleJO wrote:
| That's incorrect. Copilot steals verbatim. Examples: https://justoutsourcing.blogspot.com/2022/03/gpts-plagiarism...
| hnusersarelame wrote:
| maep wrote:
| I've spent hours looking over code before delivering to FAANG. Our company had put a clause into the contract that our code was free of any GPL'd code. It happened before and it was discovered. The whole thing was a very expensive exercise. I'm aware that many small startups, 90% of which go bust anyways, just ignore licenses but that doesn't work when you play with the big boys.
| m00x wrote:
| If you look at licensed code, then write new code, do you also bring in those licenses?
|
| It's been proved in court that AI does not infringe on copyright or licenses since it generates things from an understanding of the whole, instead of directly stealing, just like the human brain does.
| mr_toad wrote:
| > If you look at licensed code, then write new code, do you also bring in those licenses?
|
| If the "new" code is close enough to be considered a derived work then you will need a license.
| MuffinFlavored wrote:
| > If the "new" code is close enough to be considered a derived work then you will need a license.
|
| And how is that determined... in court at trial? By an unbiased 3rd party competent enough to understand both codebases?
| mtlynch wrote:
| > _It's been proved in court that AI does not infringe on copyright or licenses since it generates things from an understanding of the whole, instead of directly stealing, just like the human brain does._
|
| Do you have a source for that?
|
| This SF Conservancy article[0] says that's not true:
|
| > _Consider GitHub's claim that "training ML systems on public data is fair use". We have not found any case of note -- at least in the USA -- that truly contemplates that question._
|
| The first major court case I know about is the class-action case Matthew Butterick is trying to build.[1]
|
| [0] https://sfconservancy.org/blog/2022/feb/03/github-copilot-co...
|
| [1] https://githubcopilotinvestigation.com/
| krono wrote:
| US Court rulings do not automatically apply worldwide, and not everything it would apply to exists within its jurisdiction.
| Barrin92 wrote:
| astonishingly enough every sentence in this post is untrue. There's been no court case on any of the models in question here. They don't work like human brains, nor understand anything they output. Even if they did of course that output would still be subject to licenses, given that human code is subject to them, which is why those licenses exist in the first place.
|
| If you ever plan to steal someone's code and justify it with "my brain is able to learn, therefore copyright doesn't exist" I warn you right now this will not fly.
| risyachka wrote:
| I mean people are also trained on GPL code and I bet you can find a ton of functions copied from GPL projects in a million other projects.
|
| But as long as these are tiny parts of the codebase (which will most probably be the case), I doubt anything can be done about that. No one will go to court because of a few generic functions.
| Heleana wrote:
| Is there any way that users will be able to tell the difference between this and GitHub's Copilot?
| bardia95 wrote:
| Ghostwriter also includes 2 features that Copilot does not have: Transform Code (translate between langs/refactor code), Generate Code (prompt Ghostwriter to write full programs in one shot)
|
| Plus, Ghostwriter is integrated with the Replit platform, meaning you get all the benefits Replit offers as a portable development environment you can take with you anywhere you go and host instantly.
| TOMDM wrote:
| One could argue Copilot can translate/refactor code.
|
| I can't count the number of times I've copy pasted a chunk, commented it out, then put a comment describing what I'm after. Copilot will get it exactly right ~40% of the time, ~40% of the time it gives me a good starting place, and the other 20% I just scrap.
| VWWHFSfQ wrote:
| Does anybody know the status of the legal action against that kid that supposedly stole all their good ideas and made a hobby project out of it? I would like to know how that resolved before I consider anything from this company again.
| Jtsummers wrote:
| https://intuitiveexplanations.com/tech/replit/#how-did-repli...
|
| At the bottom of the original blog post. He put his project back online and it is still up: https://riju.codes/
| bdn_ wrote:
| Even if this was settled, I consider this quote when thinking about Amjad Masad (Replit CEO).
|
| > When someone shows you who they are, believe them the first time.
|
| - Maya Angelou
| NicoleJO wrote:
| A couple of questions here asked about the intellectual property rights. Has an answer been provided?
| machinekob wrote:
| "Any interns on that team who are going to get bullied by Replit's lawyers and CEO in the near future" for developing this?
| sdevonoes wrote:
| Who's the real target audience of these kinds of tools?
|
| - Developers who work at a company (e.g., as an employee) and need to spit out features every sprint? Velocity is important, so I imagine these kinds of developers need to squeeze every minute they are in front of the screen in order to produce working code?
|
| - Developers who think of written code as one way to solve (tech) problems, so they don't really care much about the process of creating code, but mainly about the output (i.e., does the running program solve the issue at hand?)
|
| - Senior developers who don't like to write boilerplate code?
|
| I don't see myself as the target audience of Copilot or Ghostwriter. I do work as an employee, but I'm not a "feature machine". Usually the hardest part about my job is solving problems while communicating with other people. I don't need to write code "fast", and by the time I hit the keyboard to start coding, I don't really need that much help (granted, I'm not working on code that goes into space rockets... just normal e-commerce stuff)
|
| I like to work on side projects and learn new technologies. When I was starting with programming, as part of the learning I liked to write boilerplate code (actually, that's how I learnt programming. I remember writing C boilerplate code by reading "The C Programming Language". Skipping the "boring" parts wouldn't have helped me in my learning).
|
| If anything, Copilot and similar tools take away all the joy of actually writing code (because, when I work on side projects, 50% of the satisfaction comes from actually writing code for the sake of writing code.
| The other 50% comes from the ability to solve a problem). So, yeah, maybe for people like me who do enjoy the act of writing code for the sake of writing code (you know, like painting or taking photographs), Copilot seems like an unneeded tool?
| minraws wrote:
| I would like to give some context. I'm still not on board with these tools, but we have a lot of chore-like work of adding some very similar things but with some changes, complicated or otherwise.
|
| So we have been considering using Codex or something for generating the code in a more streamlined way. The key reason it would be a benefit is that we are a small team, with each person owning more than one large repository. It's gotten very annoying and our pace is far slower than what we would like, so here something like this makes quite a lot of sense.
|
| Though the problem with such specific tools is they can't generate any customized code for our codebase. We can finetune other codegen models and that's what we plan to do down the line, but this specific tool is just not really useful if it can't be specialized for our codebase.
| sdevonoes wrote:
| So, does your team then spend a considerable amount of time writing boilerplate/chore code? Isn't that a sign of: "Hey, we actually need to improve our code base guys!". I don't know, if your solution to "I don't want to write chore code" is "let's use Copilot to do the boring stuff"... well, I have bad news for you: "chore code" needs to be maintained and/or fixed, and I don't think Copilot maintains code (for now... :D)
| minraws wrote:
| yeah we work on linters, tooling and the like: highly specific, single page, highly similar code.
|
| you can't get around it. We have pretty low boilerplate in all the codebases I happen to manage but the sad part is there is no getting around porting of specific rules, setting up better metric analysis and reporting systems and such.
|
| If you have been involved in programming professionally for a while, you would know you just can't get around the chore-like work sometimes. Ofc it's not a long term goal to keep going this way but we needed a solution to simplify our challenges as we move on.
| isoprophlex wrote:
| I love writing code, but I don't love searching the docs for the sixtieth time to find the correct combination of brackets, .groupBy's and .agg calls that gets the baroque horrorshow that is Python's pandas lib to wrangle some data for me.
|
| See it as a better autocomplete for people who don't want to or can't learn by diligently doing the boring parts.
| dr_kiszonka wrote:
| For me, not a senior dev, Copilot is useful for: discovering APIs, generating parts of docstrings, and generating bits of code that don't require too much thinking but go beyond simple copy and paste. It is really quite useful and helps to keep my RSI in check.
|
| My primary UX issue with Copilot is that it is trying too hard to be helpful, often suggesting code that I don't need. You also can't trust it with more complex cases but that's actually pretty reassuring :-)
| eliseumds wrote:
| Is a VSCode extension on the roadmap? Refactoring existing code using AI looks extremely useful. Using GitHub Copilot I have to trigger synthesize multiple times.
| giansegato wrote:
| replit employee here. the team who built this is _very_ small (less than a dozen, including non-eng roles for the go to market), and went from idea to general availability in 8 weeks
| [deleted]
| eachro wrote:
| That's very impressive. Hats off to them! I don't think this is too out of the ordinary either though. I'd guess they started off with an LLM from Hugging Face, set up some pipeline to ingest code from replit repos to finetune the LLM. The ML aspect of this is not terribly hard given that they probably don't need to train an LLM from scratch.
| Figuring out how to store and serve from replit repos (or publicly available code bases) is not too difficult. From there it's a matter of productionizing: how to serve the model in real time, figuring out what they want the product to look/feel like, and I suppose this part of it might take a while. I'd estimate you'd need 1-2 ML engineers, 2 data engineers, 2-3 SWEs, 1 PM for the team for a minimum viable product.
| nerdponx wrote:
| 8 weeks is impressive for something like that, and it goes to show just how powerful our off-the-shelf tools have become.
|
| I think it's also a bit scary, because 8 weeks is very little time for testing, tuning, and validation of something as opaque as a machine learning model. If it worked right the first time, that's great. But there is still a lot of inherent uncertainty in ML projects. Decision makers need to take that uncertainty into account when planning.
|
| That, or, the 8 weeks only covers the final training runs and the implementation/deployment, and doesn't include time spent developing and tuning proof-of-concept prototype models.
| mradek wrote:
| In 2022 you test live in production lol
| giansegato wrote:
| yep, true! however, the devil is in the details. from what i've been told, the big challenge was latency: they worked a lot to bring the latency down to acceptable levels - essentially to be usable in a cloud IDE
|
| iirc the team managed to bring it to a level an order of magnitude lower than off-the-shelf models
| caprock wrote:
| That's really neat to hear. Can you comment on how replit has managed to foster a culture of fast delivery? Are there any interesting trade offs?
| tephra wrote:
| Curious, how large are teams in Replit usually?
|
| To me (programmer in Sweden) the largest single team I've been on was 14 people and that was _very_ large (indeed the largest in the tech department).
| We actually broke ourselves up into two more informal groups since we thought that was a more manageable team size.
| dbish wrote:
| Neat feature but yeah very small doesn't seem like < 12 to me either (worked at big tech for a while). A two pizza team (standard Amazon size) is 8-10, 12 starts to be on the larger side for a single team, but not abnormal. Very small to me would be if a team of 2-4 shipped it. Replit must be much larger than I expected for a startup.
| sithlord wrote:
| any interns on that team who are going to get bullied by Replit's lawyers in the near future?
| cercatrova wrote:
| Context: https://news.ycombinator.com/item?id=27424195
| googlryas wrote:
| Interesting, sithlord is an anagram for shitlord. While the behavior of the CEO wasn't cool, the issue seems to have been resolved between all involved parties and everyone has moved on - we don't need to bring it up every time repl.it is mentioned.
| notwhereyouare wrote:
| I'll admit, every time I hear repl.it mentioned, I think of the time the CEO threatened the intern. The CEO did a huge disservice to himself and the company that day in my mind
| googlryas wrote:
| You're allowed to think that. My point is really about littering unrelated posts regarding repl.it with snipes about it.
| still_grokking wrote:
| Oh, internet drama. I love internet drama! So I looked it up as I never heard this story before.
|
| https://intuitiveexplanations.com/tech/replit/
|
| Looks like this CEO isn't of good character after all. He looks almost like a jerk when looking at the end of the story. Even in his last email he tried to get his (obviously wrong) point across. He never apologized for the things that mattered most, only tried to extinguish the social media fire all in all.
|
| Also he doesn't look very smart, imho:
|
| https://amasad.me/meta
|
| Big LOL here! The abstract things are the simplest, yeah!
| That's why progress in something like math or theoretical physics is made by the dumbest people, in contrast to something like sociology where you need genius level of intelligence to come up with some new ideas. Sure, sure.
|
| But that's of course not everything this dude got completely backwards.
|
| Would explain why replit is the most useless of all the online IDEs: It has no direction, no true value proposition. It's not a good cloud coding environment. It never was a good code snippet playground (actually one of the worst). Now they even require accounts, so the quick code snippet aspect is also gone. Also they're badly positioned in the education space...
|
| Of course I wish them luck!
|
| But I guess they have no chance against something like Gitpod, GitHub, or OpenShift codespaces, which are light-years ahead.
|
| OK, maybe the exit-strategy is "just" to be visible enough that at some point they get bought by one of the above. (Which doesn't look like the most ethical thing to do ;-)).
| selykg wrote:
| This is the type of thing where goodwill is burned and it takes time to earn it back. I don't think we just brush it under a rug either. In my opinion, you don't just get to "resolve it" and then everyone forgets about it. For me, future decisions and importantly, actions, will help me personally move past this and "move on" as you say.
| googlryas wrote:
| Ok, sounds good about it taking time - assuming perfect behavior, how long will it be before you stop referencing the affair whenever an unrelated repl.it story comes up?
| selykg wrote:
| Can I ask why you're so defensive about it?
|
| I feel like, if there's ANYTHING we have learned in the past decade or two it's that people who defend a company tend to be doing so for the wrong reasons. See Sony or Microsoft, or Apple or Android, etc. Defending a company is just weird.
|
| I look at replit as a tool, run by people.
| The tool might be cool, but the CEO made a bad decision and now I judge the product on that CEO's actions. There's no definitive time frame or action that just magically makes it better.
|
| But in general, I'll stop thinking about the stupid actions of the CEO when my brain stops reminding me "Oh, no matter how cool this is, the actions of the CEO were incredibly poor." When will that be? No idea, but maybe sometime down the road he does enough good things that I will suddenly stop and think "cool, looking back, he's done enough good that I can probably forget about the poor decision he made and start looking at this again, because he's proven he isn't that one stupid action."
|
| Goodwill is earned, it's not simply given. It's often hard won, but incredibly easy to lose.
| ChrisKnott wrote:
| It doesn't seem healthy to care more about this than the people actually involved
| selykg wrote:
| The CEO's actions are a reflection of the company. I'm not sure I "care more about it" than I am simply aware of their past actions when making decisions on whether to use their product or not.
| mr_cyborg wrote:
| To be fair, since then I've heard of them ghosting people after final round interviews and meeting with the CEO. It's a pattern at this point.
| swyx wrote:
| very cool :)
|
| can i get a clarification - when it says "in-browser" i hear "on-device" as in it doesn't call back to replit to get the predictions. i assume that's inaccurate?
|
| for cost/compute purposes i'm wondering how small models have to get in order to run "truly in browser"
| easrng wrote:
| Is this in-browser as in running the model in the browser or is the model running on the server? (I assume it's on the server for size and people-not-ripping-off-your-product reasons, but actually running in the browser would be cool and it doesn't look like it's specified.)
| ricopags wrote:
| I am not certain, but my surmise is if in-browser GPT-3 inference was possible it would've made the news. Seems likely to be an API call.
| krono wrote:
| > trained on publicly available code
|
| Fully respecting the licenses this code was published under, one would hope?
| IshKebab wrote:
| I don't think anyone seriously thinks that is required. The real issue is that these models can reproduce code they've been trained with and then you _do_ need to be aware of the license. That would be fine except as far as I know none of the existing solutions warn you that the code they've produced is the same or very similar to copyrighted code they learnt it from.
|
| That's the main difference from a human learning from copyrighted code (which is totally legal). If they have a good memory they might be able to reproduce copyrighted snippets, but they would usually (probably not always!) know they are doing that.
| googlryas wrote:
| What do source code licenses say about using source code as training data? IANAL, I would imagine it's only relevant if the model spits out already existing licensed code, and that using the code as training data is largely irrelevant.
|
| For a simpler example than code-generating ML, if I write a program to recognize a directory of source code vs non-source code, and my logic is if (unbalanced parenthesis count in all files > X) { return "not source code"; } else { return "source code"; }.
|
| And then I compute X by scanning over the linux kernel source and counting the number of unbalanced parens, have I just committed a GPL violation if I don't GPL my source code recognizer?
| serf wrote:
| I'm not sure if the training data would constitute an aggregation -- given the usually non-reversible nature of it -- but I found this.
|
| "Where's the line between two separate programs, and one program with two parts? This is a legal question, which ultimately judges will decide."[0]
|
| [0]: https://www.gnu.org/licenses/gpl-faq.en.html#MereAggregation
| krono wrote:
| We're talking large scale commercial repurposing of source code with worldwide redistribution here. Not some project you whipped up in 5 minutes to learn from, or automate some minor annoying task.
|
| Unlicensed source code - the default - is still protected by copyright law. If it's hosted and served from a different jurisdiction where no exception exists for training data models.
|
| Then there are also licenses that explicitly prohibit commercial usage to consider.
|
| What it comes down to, as it always does, is that a small group of (practically) untouchable people are making money by abusing and thereby irreparably damaging the trust and good will of the collective.
|
| It's a complex topic eh
| swyx wrote:
| _crickets_
| [deleted]
| iandanforth wrote:
| Things I'd like to know about this tool:
|
| - What areas/languages/tasks is it good at and what is it bad at?
|
| - How often is it generating code with bugs?
|
| - How often is the code that gets generated used _as is_ vs immediately edited?
|
| - What are the _new_ frustrations that this causes that existing IDE code completion doesn't?
|
| I work with trained models daily and I know that their failure cases are unintuitive, unexpected, and exasperating. I'd like to know as much about the failure cases of _this_ model as possible before diving in.
| bardia95 wrote:
| Replit team member here:
|
| - Ghostwriter is especially good at Python and JS, but supports up to 16 languages (to varying degrees of effectiveness) you can read more here: https://docs.replit.com/ghostwriter/faq#which-programming-la...
|
| - As for tasks, it's great for reducing the amount of boilerplate code you need to write (Complete Code), writing React components (Complete + Generate Code), explaining code in plain English (Explain Code), translating code between languages (Transform Code), writing exhaustive tests (Complete Code)
|
| - No stats on how often it's generating code with bugs or how often the generated code gets used as is vs immediately edited, we're interested in getting both to help improve Ghostwriter
|
| - Like any LLM, it can get stuck in a long-tail of repetitive loops; we're working pretty hard to improve and mitigate these issues, but, especially for new users, the repetition and hallucination type problems can be distracting.
| [deleted]
| three_seagrass wrote:
| Love using Replit but is there any way to try this before buying without having to wait weeks on a waitlist?
| bardia95 wrote:
| If you sign up for the waitlist today, you should be able to get access within a few days.
___________________________________________________________________
(page generated 2022-10-31 23:00 UTC)