[HN Gopher] An example of LLM prompting for programming
___________________________________________________________________
An example of LLM prompting for programming
Author : mpweiher
Score  : 398 points
Date   : 2023-04-18 11:21 UTC (11 hours ago)
(HTM) web link (martinfowler.com)
(TXT) w3m dump (martinfowler.com)
| paphillips wrote:
| One initial reaction to the prompting style is how similar it is
| to a human-to-human interaction. For example, a team lead
| communicating requirements to a wider team composed of less
| experienced engineers may also follow this type of iterative
| exchange, continuing until he or she is satisfied that the team
| understands the work to be done and has the guide rails to be
| successful.
|
| I recently heard a description of the way this technology will
| change technical work that resonated: we will become more like
| the movie director, and less like the actors.
| dpflan wrote:
| This got me wondering about best techniques for integrating LLM
| code assistants into day-to-day software development, and hence
| Ask HN: What is your GitHub Copilot (code LLM assistant)
| workflow?
|
| Please share your experience here:
| https://news.ycombinator.com/item?id=35613576
|
| I'd like to learn what is working and useful.
| yoyohello13 wrote:
| I feel like from an information theory perspective there is a
| lower bound on how little we can write to get a sufficiently
| specific spec for the AI to generate correct code.
|
| This example seems like almost as much work as just writing the
| code myself. I think English is just too fuzzy; maybe eventually
| we will get a language tailored to AI that will put more specific
| limits on the meanings of words. But then how is it all that
| different from Python?
| blackbear_ wrote:
| > a lower bound on how little we can write to get a
| sufficiently specific spec for the AI to generate correct code.
|
| Interesting thought. I believe the number of bits must be at
| least the negative log (in base 2) of the probability of such
| code appearing "in the wild", and larger if the training set is
| biased and/or the model is not fully trained (a compact
| statement of this bound is sketched below).
| gitgud wrote:
| A useful approach, but this is a tiny green field project. I'm
| not so sure it would work in a large existing proprietary system,
| where you shouldn't describe too much of the " _NDA protected
| context_ "...
| themodelplumber wrote:
| To me, this is likely an area where we'll see future coders
| tested:
|
| Interviewer: Here is a very specific project. And this part
| here is NDA covered. We have provided a context prompt with all
| the generals. Let's say you are new here and we need you
| effective today. Show us how you'll cover the last mile with
| the LLM by writing prompts that do not violate the NDA but get
| the needed work done. Then whiteboard for your team a prompt
| schema & policy that you think will work for this project.
|
| I.e. a creativity exercise at the very least. You want someone
| who can code _for a prompt, to solve coverage problems_, and
| this is still coding.
|
| For now I think a lot of people will hoard this kind of prompt
| info/leverage-pattern stuff when they discover it. It's not
| about the individual prompts.
| nice_byte wrote:
| do people enjoy working this way? wasting time verbalizing your
| thoughts, stating the obvious, wordsmithing to get the thing to
| "understand" what you _actually_ want?
| choeger wrote:
| That's refinement-style programming from a novel angle, but still
| clearly refinement-style.
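A compact statement of blackbear_'s bound above (a sketch, not part
of the thread): if a code model assigns probability p to a given
piece of code, then any prompt that reliably pins that code down
must carry at least its surprisal in bits,

    L_{\min} \ge -\log_2 p(\text{code})

so code that is common in the wild (p near 1) needs almost no
prompt, while genuinely novel code needs a prompt roughly as
informative as the code itself.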
| chewbacha wrote:
| I guess this is neat but I'd rather write code myself.
| [deleted]
| Sateeshm wrote:
| Yes. I would rather write the code myself too. But it's a good
| idea to use it to explore solutions or alternate
| implementations
| killingtime74 wrote:
| It's like talking to a student or an intern. Which is not bad
| normally because we are also educating them.
| koheripbal wrote:
| I'd rather farm all my own food, build my own house, and teach
| my own kids, but I don't have infinite time each day.
| nsxwolf wrote:
| The prompt was almost as much work as the code, and there was
| no way to write that prompt without a CS education and/or
| years of development experience.
| grrdotcloud wrote:
| I have found that this applies to many Crafts.
|
| Cutting wood is easy. Simple really. Crafting an attractive
| and functional chair requires discipline. Designing it?
| Brilliance.
| throwaway290 wrote:
| If you are optimizing for an end goal rather than enjoying
| the process... What is that goal? Why does it matter?
| goatlover wrote:
| Presumably you have time to write your own code as a
| developer, since you're not being paid to be a farmer or
| carpenter?
| chrisco255 wrote:
| > He's using a generic application example in here: one thing to
| be wary of when interacting with ChatGPT and the like is that we
| should never put anything that may be confidential into the
| prompt, as that would be a security risk. Business rules, any
| code from a real project - all these must not enter the
| interaction with ChatGPT.
|
| Remember, when storing your business code on Github servers
| hosted by Microsoft, it is important to not place real code from
| a project into OpenAI servers hosted by Microsoft. That would be
| a security risk.
| hbn wrote:
| The hosting is not the issue. Github would have different
| security requirements for code hosted in a private repo for a
| paying org than OpenAI would for free users sending prompts to
| an LLM. It can and should be assumed anything you type into
| ChatGPT is being logged to be potentially read by a human.
| nbzso wrote:
| Useful as a form of learning and experimentation. Not applicable
| at all, in my view, due to the lack of ownership of the generated
| code. There is no ability to copyright and protect the
| intellectual output from generative AI processes.
|
| Even when your prompts are clearly the pseudocode that scopes
| the generated response. Until this situation is legally cleared,
| I will be very cautious about including LLMs outside of rapid
| prototyping and the conceptual phase. Not to mention the madness
| of AutoGPT or the more realistic approach of LangChain.
|
| It is early in the game and the hype train is moving faster than
| crypto and web3 combined.
|
| I see a lot of AI startups introducing the same capabilities
| through the OpenAI API and prompts, without consideration of
| prompt injection risk. So we will see who will survive.
| mk89 wrote:
| For me chatGPT or phind (which is based on chatGPT4, if I
| understood right) are great documentation tools and also general
| productivity tools, nothing to say about it.
|
| The main issue is that sometimes they really f** it up bad, they
| make you rethink your knowledge quite deeply (do I remember
| wrong? did I maybe understand this wrong? is chatGPT wrong?)
and
| this is for me something that can be worse than having to do it
| myself, because it creates some sort of insecurity, as you
| always have to challenge your own thinking, and that is not how
| we work in our daily jobs, is it? At least this doesn't happen
| so frequently to me - from time to time we have arguments in the
| team, but this kind of "wrong information" feels more like
| "hidden" traps than someone else arguing (with valid arguments,
| of course).
| themodelplumber wrote:
| One thing that really bothers me is that I want it to use best
| practices and it doesn't really know which ones I'm talking
| about, and then I realize they are _my_ set of best practices,
| made from others' nameless best practices.
|
| So I have to decide if it's just a matter of manually
| converting the 5-10 little things like using `env bash` in the
| header, etc. Or do I ask it to remember that and proceed to the
| next layer of the project, and feel like Katamari Coder, which
| is quite a feeling of what-is-this-fresh-encumbrance at times.
|
| There is a nascent sense that the interface is not even close
| to where it needs to be to efficiently support that kind of
| recall for working memory on the coder's end.
|
| I can definitely see a new LLM relativistic-symbolic
| instruction code & IDE-equivalent (with yet-unseen
| presentational and let's even say modal editing factors) being
| extremely useful, which is a bit funny but also that's what
| those things are good for... Right now I can scroll up through
| my prompts to supplement my working memory, but that's another
| place where the whole activity starts to seem very tedious.
|
| (Is the LLM coming for the coders, or are coders coming for the
| LLM?)
| dpkirchner wrote:
| > Or do I ask it to remember that and proceed to the next
| layer of the project
|
| I think this could be solved with a good browser extension.
| Something that provides an easy to access (e.g., keyboard-
| only) way to paste customized prompt preludes that enforce
| your style (or styles if, say, you're using multiple
| languages). (A sketch of the prelude idea follows this
| subthread.)
|
| It looks like Maccy could do the job, albeit not as an
| extension. I haven't tried it yet.
| themodelplumber wrote:
| I tried one kinda like this. Setting aside the extension
| feel of it, what I'd like to see is a move from prompt-
| helper to pattern language for visually reporting the
| process of working with the LLM, to which the LLM has
| parsing access.
|
| So, let's say you can see your conversation as normal, but
| you can also see your actual code project as a node-based
| procedural design layout in an editable window. The
| relevant conversation details are used to draw the nodes.
|
| You go to one node representing a bash script and click its
| Patterns tab and search-to-type for the community pattern,
| "Joe's Best Bash Practices". It's added to your quick
| palette and the LLM offers to add similar patterns to other
| nodes in Nim and Pascal and ABS, but actually for ABS
| there's a "concept" symbol that indicates it's only going
| to be able to guess what you would want based on the
| others.
|
| Then it offers to gradually teach you node-shorthand as you
| edit the project, so eventually you don't need to write any
| prompts, just basic shorthand syntax. Where the syntax gets
| clunky, or when you buy a custom keyboard just for this
| syntax but with a few gotchas, you can work together and
| change syntax to fit.
|
| Nbdy hus lrnd shrtnd nos knda whr m gng wths.
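A minimal sketch of dpkirchner's prompt-prelude idea, in
TypeScript; all names here are hypothetical and not from any real
extension:

    // Hypothetical helper: prepend a house-style prelude to every
    // prompt so the model needn't be re-taught your conventions.
    const PRELUDES: Record<string, string> = {
      bash: [
        "Follow these conventions in all bash code:",
        "- start scripts with #!/usr/bin/env bash",
        "- use set -euo pipefail",
        "- quote all variable expansions",
      ].join("\n"),
      typescript:
        "Prefer named exports, strict null checks, and explicit types.",
    };

    // Compose a final prompt from a style prelude plus the request.
    function withPrelude(language: string, request: string): string {
      const prelude = PRELUDES[language] ?? "";
      return prelude ? `${prelude}\n\n${request}` : request;
    }

    // Usage: paste the result into the chat instead of the bare request.
    console.log(withPrelude("bash", "Write a script that rotates logs."));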
| mason55 wrote:
| I think that Copilot is much better/more promising for this
| kind of thing because it's looking at the code you've already
| written without you having to constantly prompt it.
|
| I had a lot of the same hangups as you when I had played
| around with ChatGPT. How do I get it to handle the monotonous
| stuff without me having to spend all my time teaching it?
|
| I finally tried Copilot the other day and it was stunning. I
| had a half-written golang client that was a wrapper around an
| undocumented and poorly structured API for a tool we use. I
| had written the get and create methods. Then I added a
| comment with an example URL for delete and Copilot auto-
| completed the entire method in the same style as the two
| methods I had already written. In some cases, like formatting
| & error handling, it was exactly the same as what I'd
| written, but in other cases, like variable naming, string
| templating, etc., it replicated the spirit of my style but
| adapted for this new "delete" method.
|
| I think ChatGPT is just the wrong interface for this kind of
| thing (at least right now).
| Filligree wrote:
| They're complementary, I'd say. GPT-4 handles greenfield
| development better; you can tell it to write a quick
| script, and usually it more or less works. Copilot doesn't
| do much when you're looking at a blank page.
|
| This would make copilot the better tool in 90% of cases,
| but I've been using GPT-4 to script a lot of things I
| previously would never have scripted at all. It reduces the
| cost to where even one-off scripts for a twenty minute job
| are usually worth writing.
| rootusrootus wrote:
| One thing ChatGPT (specifically, the GPT4 version) keeps doing
| to me is confidently lying, and when I call it out, apologizing
| and spitting out another response. Sometimes the right answer,
| sometimes another wrong one (after a couple tries it then says
| something like "well, I guess I don't have the right answer
| after all, but here is a general description of the problem")
|
| Part of me laughs out loud (literally, out loud for once) when
| it does that. But the other part of me is irritated at the
| overconfidence. It is a potentially handy tool but keep the
| real documentation handy because you'll need it.
| moonchrome wrote:
| Honestly to me it happens more than it doesn't - but maybe
| that's because I've tried it in cases where I've already used
| traditional approaches to come up with the answer and going to
| GPT and phind to benchmark their viability.
|
| I've mentioned it on another thread, but phind's "google-fu" is
| weak, it does a shallow pass and bing index (I'm assuming) is
| worse than google. It's also slow as hell with GPT4 which makes
| digging deeper slower than just manually going in.
| isaacfrond wrote:
| The article stresses to _never put anything that may be
| confidential into the prompt_. Yet, ChatGPT offers an opt-out
| from using your data for training.
|
| For most purposes that seems to be sufficient, doesn't it? Or
| are there reasons not to trust OpenAI on this one?
| vharuck wrote:
| I will never have full trust in an assertion unless (a) it's
| included in a contract that binds all parties, (b) the same
| contract includes a penalty for breaking the assertion that's
| severe enough to discourage it, and (c) I know the financial
| and other costs of litigation won't be severe for me.
|
| In short, unless my large employer will likely win in punishing
| OpenAI should they break a promise, that promise is just
| aspirational marketing speak.
|
| For data retention and usage, I'd also need a similar
| contractual agreement to tie the hands of any company that
| would acquire them in the future.
| twelve40 wrote:
| Copilot for individuals stores code snippets by default
| according to their TOS. Sure, you can probably find a way to
| opt out of that somewhere as well, but you'd have to read the
| TOS for every plugin and service you use, find the opt-out
| links and make sure you don't opt in again via some other route
| (such as not Copilot but ChatGPT proper, or some other GitHub
| or VSCode plugin, or some other service button or knob).
| themodelplumber wrote:
| > Or are there reasons not to trust OpenAI on this one?
|
| Yes, more related to general tech history and not a dig on
| OpenAI though.
| blowski wrote:
| From a GDPR or commercial confidentiality perspective, it
| doesn't matter what OpenAI say they'll do with your data, you
| can't share it with them.
|
| Let's say your doctor enters sensitive info about you, and
| despite having told OpenAI not to train data with it, they use
| it anyway due to a bug. A year from now, ChatGPT is telling
| anyone and everyone about your sensitive info.
|
| Would you exclusively blame ChatGPT?
| dustypotato wrote:
| There was a bug where the chat history of some users was
| visible to others
| clarge1120 wrote:
| > are there reasons not to trust OpenAI on this one?
|
| Yes, the fact that they are closed, not open, for one. And that
| they switched from open to closed the moment it benefited them
| to do so.
| [deleted]
| pcthrowaway wrote:
| I've tried using ChatGPT for writing Vitest tests, and it can't
| do it, full stop.
|
| If you look at the end, it parroted out some tests for _jest_.
| True, the APIs are mostly compatible and you can probably change
| that to Vitest with a couple of lines of code changed, but for
| more advanced tests, that won't necessarily work.
|
| Really disappointed to see this so highly upvoted, when it's pure
| garbage
| joshribakoff wrote:
| That library doesn't even appear to have a stable release yet,
| and was at v0.0.x as of a year or so ago... you also may be
| using chatGPT 3.5 which may predate this library. As a dev with
| 15 years of experience I haven't even switched over from jest
| (but plan to)... all this to say, maybe we can give the bot
| some slack here. It should be possible to include vitest docs
| and examples in your prompts to teach it in context, did you
| try that?
| pcthrowaway wrote:
| Sure, I realize it's unsuccessful at using vitest _because_
| it's (relatively) new.
|
| I'm just saying, this was a really telling example of how to
| use it for prompting.
|
| A _very large_ chunk of the tools I use in Javascript-land
| are "too new" for ChatGPT to work with properly.
|
| Giving context unfortunately doesn't really work as ChatGPT
| usually prioritizes what it's absorbed through the corpus
| over anything you tell it.
|
| To be clear, it does _fine_ with new information if the
| things you ask it for don't match token sequences it's
| already been trained on. So if you give it a fictional
| library and ask it to perform some task with it that doesn't
| seem too much like the things it might do with another
| library that accomplishes a similar thing with a similar API,
| it will actually use the custom code more successfully.
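For reference, a minimal Vitest test (a sketch based on Vitest's
documented API; the most visible difference from Jest is that the
globals are explicit imports):

    // Vitest requires explicit imports by default; Jest injects
    // describe/it/expect as globals.
    import { describe, it, expect } from "vitest";

    function add(a: number, b: number): number {
      return a + b;
    }

    describe("add", () => {
      it("sums two numbers", () => {
        expect(add(2, 3)).toBe(5);
      });
    });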
|
| But for Vitest, it can't accept enough of the docs you might
| provide for it to be useful to you (though admittedly,
| sometimes it will show how to do something with jest that at
| least makes finding the right thing in vitest easier).
|
| By the way, if you are planning to switch over in the future,
| the path for doing that is seemingly well documented by
| vitest and seems to be pretty straightforward as well, though
| I haven't meaningfully used Jest for comparison
|
| edit: to be clear, I'm very impressed with ChatGPT's
| capabilities, and I think there are good examples of
| prompting where it does meaningful work in tandem with the
| human driver exercising their own judgment.
|
| This was an example of a person asking it for things while
| not pointing out its limitations, which downplays the extent
| to which one needs to exercise one's judgment when using it.
| If they failed to point out the things ChatGPT got wrong
| which _I_ know about, why would I trust that the other things
| I don't know it got wrong are accurate?
| [deleted]
| upwardbound wrote:
| Public service announcement that I and others are actively
| trying to poison the training data used for code generation
| systems. https://codegencodepoisoningcontest.cargo.site/
|
| See previous discussion here:
| https://news.ycombinator.com/item?id=35545442
| irrational wrote:
| Maybe this will get people to finally sit down and do some
| thinking, planning, pseudo-code, etc. before diving in and
| starting to code.
| afro88 wrote:
| The article shows everything that works for this approach. But
| it's a bit disingenuous. At the end:
|
| > Once this is working, Xu Hao can repeat the process for the
| rest of the tasks in the master plan.
|
| No, he can't. After that much back and forth and getting it to
| fix little things where it gives responses with the full code
| listing again, he would have easily hit the token limit (at least
| with any chat LLM capable of this quality code and conversation -
| ChatGPT). The LLM will start hallucinating the task list, the
| names of functions it wrote earlier etc. and the responses would
| get less and less useful with more and more "this doesn't work,
| can you fix X".
|
| So anyone following this approach will hit a footgun after task
| 1.
|
| For anyone that really wants to follow this approach, the next
| step is to start a new chat and copy/paste the initial
| requirement prompt, put the task list in there, any relevant
| code, adjust the instruction (ie "help me with task 2") and go
| from there (a sketch in code follows this subthread).
|
| It is of limited utility though. By step 3 (or even 2) you end up
| with so much code that you're at the token limit anyway and it
| can't write code that fits together.
|
| Where I've found ChatGPT 4 useful is getting me going on
| something, providing boilerplate, and unblocking me.
|
| If you don't know how to approach a problem like the "awareness
| layer" (like I didn't before reading the post), you can get a
| great breakdown and starting point from ChatGPT. Similarly, if
| you're not sure how to approach that view model, or write tests
| etc. And if you want a first draft of code or tests.
|
| All that said, I'm looking forward to much larger and affordable
| token limits in future.
| Tostino wrote:
| You iterate on your plan after it is generated step by step.
| You go and edit the prompt chain you started for step 1, and
| modify it to start working on step 2 (including any ideas or
| fixes you have identified while implementing step 1).
| Repeat until complete. | | You can still absolutely hit the context limit, but you are far | less likely to do so if you go back and start a new prompt | chain for each different thought process you are going through | with it. | afro88 wrote: | Great idea. But does it get hard to navigate back to | something in older chat histories though? | | I find a new separate chat with the revised initial prompt to | be easier. | williamcotton wrote: | I've been using another call to an LLM to write or rewrite | code that is separate from the main "conversation". | | What I mean is that I've got a dialog going with an LLM and | I've trained it to call a build() function with | instructions that then returns the function, with the text | of the function kept out of the dialog with the main | thread. | ryanjshaw wrote: | Your experience matches mine closely. I've had ChatGPT-4 do | great and then it just gets confused after a while. I can | literally tell it "task X is done" and it'll apologise and show | me a list of tasks where X is still not done - this is clearly | not just a context window issue, as I have repeated variations | of my statement over and over in the same session and the issue | persists. | | I have ended up using it the same way you have - it's honestly | the best anti-procrastination tool I've ever used because I can | tell it my intentions, what I've thought of so far... and it'll | spit out a list of bite-sized chunks that get me going. I find | myself looking forward to telling the AI I've completed a task. | | Similarly, if I'm facing a tricky design decision, I find that | just writing it out for ChatGPT is extremely helpful for | clarifying my thought process. I actually used to do this | conversational decision making process in a text editor long | before ChatGPT, but when I know there's an AI on the other end | my thinking becomes clearer and more goal-oriented. And unlike | talking to myself or a human friend, it's happy to just say | "well if these are your concerns, let's start HERE and then see | what happens". | peterashford wrote: | Ooh! That's a really good point - ChatGPT is effectively | rubber-ducky as a service =) | Fuzzwah wrote: | This is exactly how I've been explaining LLM tech to my | "non-geek" friends and family. I start by explaining rubber | ducking, and how I now use chatgpt as a more advanced | version of the process. | travisjungroth wrote: | Good rule of thumb with ChatGPT: you can't exit loops. Once | you've gone A > B > A, your best move is to start a new chat. | Even then it may reproduce and you should do some similar but | different task. Remember that it's a prediction engine, | weighing heavily on the existing prompt. So you say B again, | or B1 and it's like, I know what to do! A! Cause last time | was A->B so let's do it again. | | In your case this would be "[]Task1", "Task1 is done", | "[]Task1", [here is where you start a new chat or fix it | yourself if possible]. | hn_throwaway_99 wrote: | Hmm, I also use ChatGPT as an anti-procrastination tool and | task manager, and it's never made a mistake with keeping | track of my task list (except that when it sums the estimated | times of subgroups of tasks, sometimes those sums are wrong). | | Note that it outputs my updated task list every time I add or | remove a task (I only asked it to do that one time), so even | if old messages go outside of the context window, it's not a | big deal because the full updated state of the list is output | basically every other message. 
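The "fresh chat, carry the state" workflow described in this
subthread might look like this in code (a sketch; the message shape
follows OpenAI-style chat APIs, but all helper names here are
hypothetical):

    // Rebuild a new conversation from persisted plan state instead
    // of continuing a long, token-limited chat.
    type ChatMessage = {
      role: "system" | "user" | "assistant";
      content: string;
    };

    interface PlanState {
      requirements: string; // the original architecture/requirements prompt
      taskList: string[];   // the master plan generated in an earlier chat
      relevantCode: string; // only the code the next task actually touches
    }

    function startTaskChat(state: PlanState, taskNumber: number): ChatMessage[] {
      // Re-stating the full plan every time keeps it inside the
      // context window, as suggested above.
      const plan = state.taskList
        .map((task, i) => `${i + 1}. ${task}`)
        .join("\n");
      return [
        { role: "system", content: state.requirements },
        {
          role: "user",
          content:
            `Master plan:\n${plan}\n\n` +
            `Relevant code so far:\n${state.relevantCode}\n\n` +
            `Help me implement task ${taskNumber}.`,
        },
      ];
    }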
| quijoteuniv wrote:
| It's great to see that there's now a term for the type of
| prompting, "generated knowledge". I've been experimenting with
| this technique since the beginning, and I've noticed a
| significant improvement in version 4. The process involves
| outlining the project, creating tasks, and feeding them back to
| chatGPT as you progress. This approach has helped me complete
| projects that would have otherwise taken me much longer to
| finish.
|
| It's also useful for creating practical tutorials. While there
| are plenty of tutorials available online, sometimes you need
| guidance on a specific set of technologies. By using generated
| knowledge prompts, you can get a good outline and tasks to help
| you understand how these technologies interact.
|
| One thing to keep in mind is to avoid derailing the
| conversation with questions that are not relevant to the core
| tasks. If you get stuck on something and need to debug, it's
| best to use a separate conversation to avoid derailing the
| project's progress and the hallucinations & forgetfulness
| AzzieElbab wrote:
| Something must be wrong with me. I could never get anything
| useful from Martin Fowler's writings, and coincidentally I
| cannot get any functional code out of ChatGPT. Even the
| boilerplate it produces for me needs to be corrected. I still
| use chatGPT to produce examples of abstract things but was
| not able to get any working code that matches concrete
| problems or even compiles.
| afro88 wrote:
| Are you using the GPT4 model? There's a very significant
| improvement between 3.5 (the free one) and 4.
| AzzieElbab wrote:
| I am supposedly on GPT4 via GPT+. I try using it for
| boilerplatey things, like terraform, and the results are
| simply incorrect. It seems more helpful in providing
| examples, even for some far more complex tech - like rust
| code.
| simonw wrote:
| Does it say GPT-4 at the top of the screen?
| afro88 wrote:
| Absolutely, and same here. I've built multiple tools that would
| have taken 2-3 days each in 2-3 hours each.
|
| > One thing to keep in mind is to avoid derailing the
| conversation with questions that are not relevant to the core
| tasks. If you get stuck on something and need to debug, it's
| best to use a separate conversation to avoid derailing the
| project's progress and the hallucinations & forgetfulness
|
| Definitely. Great advice.
|
| Another tip: don't bother asking it to fix small things. Just
| mention you fixed it in the next reply and move on.
| hinkley wrote:
| What I would really love is if we had a broader linting tool
| built on this sort of tech that could go the other way.
|
| So often we are halfway through refactoring the code from a bad
| pattern that has a proven track record of issues, to one that at
| least prevents the worst ravages of the old one. There are never
| any guarantees that you will get everyone on board for this.
| Someone will defect, and they will keep copying and pasting the
| old pattern and if they code faster than you then you never get
| to the end.
|
| Give me a way to mark a bunch of code as 'the old way' and hook
| that information into autocomplete or even just a linter that
| runs at code review time.
| 1024core wrote:
| For some reason, this reminds me of how we used to give
| instructions to Indian coders in the 90s and early 2000s. You
| would have to spell out everything. What you got back was nearly
| there, but some back-and-forth was involved.
|
| This brings back some terrible memories.
| mpaepper wrote:
| The big difference is that you get the results immediately and
| iterations take minutes, not days
| DonHopkins wrote:
| And no time zone differences!
| krupan wrote:
| Yes, you can get a ton more code that you have to check over
| with a fine-toothed comb in much less time! Is that a win?
| m3kw9 wrote:
| Asking an LLM to write complex code can take nearly as long as
| writing the code yourself. Having it plan things out can,
| however, kick-start a nice direction. LLMs are great for
| single, clearly specified functions.
| acomjean wrote:
| Isn't the point of code to express what we want in a succinct and
| expressive way?
|
| If we need all this software to help us, maybe we should look at
| the languages we're using and make better, more intuitive ones.
| cwp wrote:
| To me, this is a great illustration of why chat is a terrible
| interface for a coding tool. I've gone down this path as well,
| learning that you need to have a detailed prompt that establishes
| a lot of context, and iteratively improve it to generate better
| code. And yup, generating a task list and working from that is
| definitely a key strategy for getting GPT to do anything bigger
| than a few paragraphs.
|
| But compare that to Copilot: Copilot doesn't help much when
| you're starting from scratch, and there's nothing for it to work
| with. But once you have a bit of structure, it starts to make
| recommendations. Rather than generating large chunks of code, the
| recommendations are small: chunks of a few lines or maybe even
| one line at a time. And it's sooooo good at picking up on
| patterns. As soon as you start something with built-in
| symmetries, it'll quickly generate all the permutations. It's
| sort of prompting by pointing.
|
| This is so. much. better. than writing prompts for the chat
| interface. I'm really excited to see where these kinds of tools
| lead.
| mjr00 wrote:
| Absolutely. People will quickly realize that for coding, the
| natural language part of LLMs is a distraction. Copilot is
| _much_ better for someone actually writing code, but
| unfortunately doesn't get as emphasized due to the narrative
| surrounding LLMs right now.
| moffkalast wrote:
| Has the Copilot backend been updated to use anything more
| advanced yet? I tried it out when it was new and free for a
| while and it really struggled with anything that wasn't
| incredibly common. GPT 4 in its chat form works a whole lot
| better for niche stuff than that one did.
| gunapologist99 wrote:
| It's definitely far better than when it was free, but not
| GPT4 yet for most people.
|
| It's the opposite of chatGPT: it takes more time to produce
| useful output, but it gets much better in more complex
| programs, while ChatGPT gets worse.
| kgeist wrote:
| Copilot's original underlying model is currently
| deprecated, if I remember correctly
| yodsanklai wrote:
| > Copilot is much better for someone actually writing code
|
| I haven't used copilot yet, but I'm occasionally using
| chatgpt with prompts such as "write a bash/python script
| that takes these parameters and performs these tasks". Then
| I iterate if needed, and usually I can get what I want faster
| than without using chatgpt. It's not a game changer, but it's
| a performance boost.
|
| How is natural language a distraction here? And how would
| copilot do much better for the same task?
| visarga wrote:
| > It's not a game changer, but it's a performance boost.
|
| The story of all AI in 2023 - maybe 2x performance
| improvement, maybe a bit less.
The big problem is that you
| can't trust it on its own, so it doesn't improve
| productivity 100x. Not even a receipt reader is good enough
| to reach 100%; you've got to check the total, maybe it missed
| the dot and you get the 100x boost after all.
| mjr00 wrote:
| Try not using natural language and just type what you'd
| type into Google. You'll get the same results and realize
| that all of the natural language fluff is totally
| unnecessary. I just typed in "bash script recursive chmod
| 777 all files" (as a dumb toy example) and got a resulting
| script back. It was surrounded by two natural language GPT
| comments:
|
| > It's generally not recommended to give all files and
| directories the 777 permission as it can pose a security
| risk. However, if you still want to proceed with this,
| here's a bash script that recursively changes the
| permission of all files and directories to 777: [...] Make
| sure to replace "/path/to/target/directory" with the path
| of the directory you want to modify. To run the script,
| save it as a file (e.g., "chmod_all.sh"), make it
| executable with the command "chmod +x chmod_all.sh", and
| then run it with "./chmod_all.sh".
|
| It's up to the reader to decide if those are necessary, but
| I'd lean towards no.
| [deleted]
| gunapologist99 wrote:
| No script needed: chmod ugo+rwX . -R
|
| (This is for GNU chmod like in Linux, BSD will be
| slightly different)
|
| Of course, that's not exactly what you asked for (it's
| better, read the chmod man page: X applies executable
| only to directories) but you could just replace ugo+rwX
| with 777 or 0777.
| kenjackson wrote:
| I tried this with the following:
|
| "Bash script to add a string I specify to the beginning
| of every file in a directory, unless the file begins with
| "archive""
|
| I tried looking for this on Google and didn't find
| anything that did this -- although I could cobble
| together a solution with a couple of queries.
|
| The interesting thing is that I wanted ChatGPT to append
| the string to the filename -- that's what I meant. But it
| actually appended the string to the actual file. That's
| actually what I said, so I give it credit for doing what
| I said, rather than what I meant. And honestly my intent
| isn't necessarily obvious.
|
| I definitely see this as a value add over just searching
| with Google.
| scarface74 wrote:
| > Try not using natural language and just type what you'd
| type into Google. You'll get the same results and realize
| that all of the natural language fluff is totally
| unnecessary.
|
| I can get _similar_ results with Google sometimes and I
| can put together what I learned from different places.
|
| But I can get scripts that meet my exact requirements
| with ChatGPT. Most of my ChatGPT related code is
| scripting AWS related code and CloudFormation templates.
|
| I've asked it to translate AWS related Python code to
| Node for different projects and a bash shell script.
| It's well trained on AWS related code.
|
| I don't know PowerShell from a hole in the wall. But I
| needed to write PS scripts and it did it. I've also used
| it to convert CloudFormation to Terraform
| mjr00 wrote:
| I think you (and kenjackson above) are misinterpreting
| what I was saying. I'm not saying use Google instead of
| ChatGPT; I'm saying _pretend ChatGPT is Google_ and
| interact with the ChatGPT text prompt the same way.
You
| don't need fully formed coherent sentences like you
| would when talking to a person; just drop in relevant
| keywords and ChatGPT will get you what you want.
| scarface74 wrote:
| Isn't that the game changer, though, that you can use
| natural language and treat it like the "world's smartest
| intern" and just give it the list of my requirements?
|
| It's the difference between:
|
| "Python script to return all of the roles with a given
| policy AWS" (answer found on StackOverflow with Google)
|
| And with ChatGPT
|
| "Write a Python script that returns AWS IAM roles that
| contain one or more policies specified by one or more -p
| arguments. Use argparse to accept parameters and output
| the found roles as a comma separated list"
| mjr00 wrote:
| > "Write a Python script that returns AWS IAM roles that
| contain one or more policies specified by one or more -p
| arguments. Use argparse to accept parameters and output
| the found roles as a comma separated list"
|
| Again, this is completely unnecessary. This is like in
| the old days when technically illiterate people would
| quite literally Ask Jeeves[0] and search for full
| questions because they didn't know how to interface with
| a search engine.
|
| A prompt that does exactly what you're asking: "python
| script get AWS IAM roles that contain a policy, policy as
| -p command line argument, output csv"
|
| We'll see more of that terse, efficient style as people
| get more comfortable, similar to how people have (mostly)
| stopped using full questions to search on Google. The
| "talk to ChatGPT like a human" part is entirely a
| distraction from taking advantage of the LLM for coding
| purposes. Perhaps more importantly, the responses being
| humanized is a distraction, too.
|
| [0] https://en.wikipedia.org/wiki/Ask.com
| scarface74 wrote:
| At first, when I didn't specify "use argparse", it would
| use raw argument parsing
|
| It also thought I actually wanted a file called
| "output.csv" based on your text and gave me an actual
| argument to specify the output file that I didn't want.
|
| There is a lot of nuance to my requirements that ChatGPT
| missed with your keywords.
|
| Sidenote: there is a bug in both versions and also when I
| did this for real. Most AWS list APIs use pagination. You
| have to tell it that "this won't work with more than 50
| roles" and it will fix it. (A paginated version is
| sketched below.)
| vorticalbox wrote:
| you can always include the instruction to only return the
| code and no other text
| mjr00 wrote:
| Sure, but I want a system built for coding that does that
| by default... like Copilot.
| iudqnolq wrote:
| ... and it'll describe the code anyway, at least to me.
| avereveard wrote:
| Idk, for sure autocomplete is a great interface for someone
| coding in the IDE, but LLMs can understand requirements
| whole, spit out full classes, and validate that the output
| from the server matches the specs; they work great from
| outside an IDE.
| supernikio2 wrote:
| Exactly this. I've tried to integrate ChatGPT into my daily
| workflow, but you have to give it an excruciating level of
| detail to get something that remotely resembles real code I'd
| use, and even then you have to hold its hand to guide it in the
| correct direction, and still have to make some manual final
| touches at the end.
|
| This is why I'm looking forward to Copilot X so much. It will
| hold much more context than the current implementation, and
| integrate the Chat interface that's so natural to us.
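scarface74's pagination gotcha above is easy to hit; a paginated
version in TypeScript might look like this (a sketch assuming the
AWS SDK v3 package @aws-sdk/client-iam and its generated paginator;
check the SDK docs before relying on it):

    import { IAMClient, paginateListRoles } from "@aws-sdk/client-iam";

    // List every IAM role, letting the paginator follow the Marker
    // tokens so results beyond the first page aren't silently lost.
    async function listAllRoleNames(): Promise<string[]> {
      const client = new IAMClient({});
      const names: string[] = [];
      for await (const page of paginateListRoles({ client }, {})) {
        for (const role of page.Roles ?? []) {
          if (role.RoleName) names.push(role.RoleName);
        }
      }
      return names;
    }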
| maroonblazer wrote:
| As a hobbyist developer with no formal training, I wish Copilot
| had a 'teaching' or "Senior Dev" mode, where I can play the
| role of the Junior Dev. I'd like it to pick up on what I'm
| trying to write, and then prompt me with questions or hints,
| but not straight up give me the code.
|
| Or, if that's too Clippy-like and annoying, let me prompt it
| when I'm stuck, and only then have it suggest hints or ask
| questions that guide me to a solution.
|
| I agree, very exciting to see where all this goes.
| cwp wrote:
| One thing you might try with Copilot is to ask it to explain
| the code. It can often give insight, even on code that you
| yourself wrote a few minutes ago.
| ukuina wrote:
| The Github Copilot Labs extension has "codebrushes" that can
| transform and explain existing code instead of generating new
| code, but none of it gives only "hints". Maybe one of the
| codebrushes can take a custom prompt.
| SparkyMcUnicorn wrote:
| You can create custom brushes, or open the "CoPilot Labs"
| panel and "explain" with a custom prompt.
| SamPatt wrote:
| I've noticed that after using copilot on a code base for a
| while, you can effectively prompt the AI just by creating a
| descriptive comment.
|
| // This function ends the call by sending a disconnection
| message to all connected peers
|
| Bam, copilot will recommend at least the first line, with
| subsequent lines usually being pretty good, and more and more
| frequently, it will recommend the whole function.
|
| I still use GPT-4 a lot, especially for troubleshooting errors,
| but I'm always pleasantly surprised at how good copilot can be.
| armchairhacker wrote:
| Copilot is a game-changer and very underrated IMO. GPT4 is
| smart but not really used in production yet. Copilot is
| reportedly generating 50% of new code and I can't imagine
| going without it.
| Keyframe wrote:
| I would really love to see that. So far, all I've seen is
| cookie cutter code to reduce a bit of typing time.
| Everything else was more or less hot garbage that just
| stood in the way of typing. Maybe in a few iterations or
| years. So far, personally, I haven't seen anything useful.
| Not saying there isn't anything, just that I haven't seen
| any use and code offered by it stank. Is there a demo of
| someone using it to showcase this game-changing power?
| armchairhacker wrote:
| Copilot only writes boilerplate, it can't really handle
| anything non-trivial. But I write a lot of boilerplate,
| even using abstraction and in decent programming
| languages. A surprising amount of code is just
| boilerplate, even just keywords and punctuation; and
| there's a lot of small, similar code snippets that you
| _could_ abstract, but it would actually produce more code
| and/or make your code harder to understand, so it isn't
| worth the effort.
|
| Plus, tests and documentation (Copilot doubles as a good
| sentence/"idea" completer when writing).
| SamPatt wrote:
| It surprises me to hear this. Have you used it as I
| described, by writing a descriptive comment first and then
| waiting to see its response?
|
| I only noticed it getting good at this after I was
| somewhat far along on a project, so I assume it requires
| an overall knowledge of what you're trying to do first.
| smashface wrote:
| Where do you get that 50% number? Do you mean 50% of all
| new code in the industry? That seems beyond extremely
| unlikely.
| moyix wrote:
| The number is 40%, and it's 40% of code written _by
| Copilot users_.
It's also just for Python:
|
| > In files where it's enabled, nearly 40% of code is
| being written by GitHub Copilot in popular coding
| languages, like Python--and we expect that to increase.
|
| https://github.blog/2022-06-21-github-copilot-is-generally-a...
| Nifty3929 wrote:
| It's all about the denominator!
| iudqnolq wrote:
| I wonder if this properly counts cases where copilot
| writes a bunch of code and then I delete it all and
| rewrite it manually.
| moyix wrote:
| From what I remember they check in at a few intervals
| after the suggestion is made and use string matching to
| check how much of the Copilot-written code remains.
| jvanderbot wrote:
| There was some discussion by the copilot team that x% of
| new code _in enabled IDEs_ was generated by copilot.
|
| It varies, but here's one post with x=46 from last month.
| So, very close to half.
|
| https://github.blog/2023-02-14-github-copilot-for-business-i...
| 2devnull wrote:
| Measuring output by LOC is not a very useful metric. The
| sort of code that's most suited to AI is closer to data
| than code.
| [deleted]
| fnordpiglet wrote:
| (I read it as 50% of their code)
| jvanderbot wrote:
| For my side projects, copilot easily generates 80% of the
| code. It snoops around the local filesystem and picks up my
| naming schemes and style to help recommend better. It makes
| me so much more productive.
|
| For work projects, I tried it on some throwaway work
| because we're still not allowed to use it for IP reasons,
| but it is very good at finding small utility functions to
| help with DRY, and can help with step by step work, but
| can't generate helpful code quite as easily since some of
| our API and codebase just doesn't follow its own norms or
| conventions, and it seems to me that copilot makes a lot of
| guesses based on its detected conventions.
| iudqnolq wrote:
| > It snoops around the local filesystem and picks up my
| naming schemes and style to help recommend better.
|
| Are you sure about this? It doesn't seem to work on my
| machine. I think it will infer things that might be in
| other modules, but only based on the name. I'm basing
| this on the fact it assumes my code has an API shape
| that's popular but that I don't write (eg free functions
| vs methods).
| armchairhacker wrote:
| It looks at your recently-viewed files in your IDE. I
| don't think it looks at anything outside your open
| workspace but maybe...
| throwaway202303 wrote:
| People have different preferences and habits. Having tried both
| models I much prefer having a conversation in one window and
| constructing my code from that in another. Although copilot is
| about to add some interesting features that may win me back.
| barbariangrunge wrote:
| Either way, you're sending your company's biggest asset to
| another company, aren't you? I'll try these tools when they
| start being able to run locally
| koonsolo wrote:
| I surely hope they use my copyrighted code and make millions
| out of it. Ideal case for me to sue them for lots of money.
| freedomben wrote:
| How would you ever know? It will come in chunks of a dozen
| or fewer lines at a time and it will be written into your
| competitor's proprietary codebase (that you don't have
| access to).
| survirtual wrote:
| Right.
|
| If you are building something truly valuable locally, and
| it is innovative or otherwise disruptive and relies on
| being a first mover, centrally hosted LLMs are a
| non-starter.
|
| Most software corps have countless millions of lines of
| code.
You'd be spending lifetimes tracing where someone
| ripped your "copyrighted" techniques and methods.
|
| The complete lack of security awareness and willingness
| to compromise privacy for convenience in people deeply
| saddens me.
| HALtheWise wrote:
| > GitHub Copilot [for business] transmits snippets of
| your code from your IDE to GitHub to provide Suggestions
| to you. Code snippets data is only transmitted in real-
| time to return Suggestions, and is discarded once a
| Suggestion is returned. Copilot for Business does not
| retain any Code Snippets Data.
|
| Likely, some employee would whistleblow that they're not
| complying with their privacy policy, and either
| government litigation or a class action lawsuit would
| ensue. That legal process would involve subpoenas and
| third-party auditors being granted access to
| GitHub/Microsoft's internal code and communications
| history, which makes it pretty hard to hide something as
| big as collecting, storing, and then training from a huge
| amount of uploaded code snippets they promised not to.
|
| It's not inconceivable that they're noncompliant, but my
| bet would be that if they _are_ collecting data they
| explicitly promise not to, it's an accidental or
| malicious action by an individual employee, and they will
| _freak out_ when they discover it and delete everything
| as soon as they can. If they intended to collect that
| data, it would be much easier to write that into the
| policy than deal with all the risk.
|
| Notably, this applies to Copilot for Business, which is
| _presumably_ what you're using if you are at work.
| zelphirkalt wrote:
| Couldn't it happen more subtly, without having the code
| lying around for long? The model could be doing online
| learning (an ML term) and only then discard the code it
| gets sent. This means your code could appear in other
| people's completions/suggestions, without it having to be
| stored anywhere. It is basically learned into the model.
| The code could appear almost or even completely verbatim on
| someone else's machine, possibly working for a
| competitor. Even that it is your code would not be
| obvious, because MS could claim that Copilot merely
| accidentally constructed the same code from other learned
| code.
|
| Not sure that this is how the model works, but it is
| conceivable.
| SanderNL wrote:
| I sort of disagree that code is the biggest asset. Take the
| Yandex leak. What can you do with it? Outcompete them?
| visarga wrote:
| > Take the Yandex leak. What can you do with it?
|
| Obviously, add it to the big training set of the next code
| model.
| throwaway202303 wrote:
| No, or no company would be able to use it. As you type,
| fragments of code are sent and discarded after use. You need
| to trust Microsoft to actually do the discarding, but
| contractually they must, and you can sue them if they
| accidentally or deliberately keep your code around or
| otherwise mismanage it.
| xiphias2 wrote:
| They are obligated to give data to the government, and the
| government took part in spying in Brazil for Boeing in the
| past, but I guess they are using this capability only for a
| few strategic companies, and most companies are not that.
| zelphirkalt wrote:
| But that is naive, isn't it? Who has the money and time in
| their life to actually sue MS? Even if "you" is a
| business, few will have the resources for that.
| justinhj wrote:
| Individuals do not (although a class action would be
| feasible), but large companies that use Github and other
| Microsoft products, of course they have both the means to
| sue Microsoft and the motivation should their business be
| impacted.
| throwaway202303 wrote:
| Exactly
| [deleted]
| themodelplumber wrote:
| If somebody thinks an LLM is coming for everybody's coding job,
| I'd say this article is a great counterpoint just for existing.
|
| You could tell someone from decades ago that we now use a very
| high level language for complex tasks in complex code ecosystems,
| never even mention AI, explain that the parser is really
| generalist-biased, and this article would make perfect sense as
| an example of exemplary code by a modern coder working for a
| living.
|
| That's code in there, the stuff Xu Hao is writing.
|
| And also, that's not even getting into the debugging part...
| Which will be about other code, that looks different.
| notacoward wrote:
| The problem is that it's not _quite_ code. It's _almost_ code,
| but without the precision, which puts it into a sort of Uncanny
| Valley of code-ness. It's detailed instructions for someone to
| write code, but the someone in this case is an alien or insane
| or on drugs so they might interpret it the way you meant it or
| they might go off on some weird tangent. You never know, and
| that means you'll need to check it with almost as much care as
| you'd take writing it.
|
| Also, having it write its own tests doesn't mean those tests
| will themselves be correct let alone complete. This is a
| problem we already have with humans, because any blind spot
| they had while writing the code will still be present for
| writing the tests. Who _hasn't_ found a bug in tests, leading
| to acceptance of broken code and/or rejection of correct
| alternatives? There's no reason to believe this problem won't
| also exist with an AI, and they have more blind spots to begin
| with.
| Veedrac wrote:
| 'Artists' jobs are safe because AI is bad at hands.'
| themodelplumber wrote:
| Artists' jobs are safe in part because they can also use AI,
| and most already use relevant ecosystems that now incorporate
| AI.
|
| Consumers who can operate AI for clip art purposes are simply
| still part of the same non-artist-paying demographic they
| always were.
|
| Same with code
| Veedrac wrote:
| As farmers' jobs were safe because farmers can use farming
| tools.
|
| These arguments don't track even vaguely. You are doing the
| equivalent of analyzing the future of solar power by
| assuming solar will cost the same in 10 years as it does
| today, and that each new watt of solar is matched 1:1 with
| new units of demand. Neither of these are sensible.
|
| It may be that ML code tools never displace many people, or
| even that they supercharge demand, but you don't get to
| justified conclusions by assuming the future is just the
| present but with a bigger UNIX timestamp.
| all2 wrote:
| Industrialization has made farming tools incredibly
| complex, so I believe the statement "farmers' jobs were
| safe because farmers can use farming tools" is correct.
| You still need a farmer to farm, but you now need less
| manpower to farm. The specialist is secure while the
| untrained laborer is at risk.
| ldhough wrote:
| Sadly I don't think this is true for art:
|
| https://restofworld.org/2023/ai-image-china-video-game-layof...
|
| I really hope it doesn't end up being the same with code :|
| twelve40 wrote:
| Exactly, I actually liked the systematic approach in the
| article, but it seemed pretty labor-intensive and ... not that
| much different from other types of programming
| sanderjd wrote:
| To me, that's the whole point of this. I think it is directly
| analogous to the jump between assembly and higher level
| compiled languages. You could have said about that, "it still
| seems pretty labor intensive and not that much different than
| writing assembly", and that's true, but it was still a big
| improvement. Similarly, AI-assisted tools haven't solved the
| "creating software requires work" problem. But I think
| they're in the process of further shifting the cost curve,
| making more software possible to make.
| nextworddev wrote:
| The opposite might be true, and here's why: 1) by using
| English as the spec, the barrier to entry has gone down; 2)
| LLMs can also write prompts and self-introspect to debug.
| notacoward wrote:
| > LLMs can also write prompts and self-introspect to debug.
|
| Why should we assume that won't lead to a rabbit hole of
| misunderstanding or outright hallucination? If it doesn't
| know what "correct" really is, even infinite levels of
| supervision and reinforcement might still be toward an
| incorrect goal.
| ModernMech wrote:
| It's like when you continually refine a Midjourney image.
| At first refining it gets better results, but if you keep
| going the pictures start coming out...really weird. It's up
| to the human to figure out when to stop using some sort of
| external measure of aesthetics.
| ben_w wrote:
| To which the normal response[0] is: that's just like
| humans.
|
| Of course, it's still bad that humans do it; but despite
| the scientific method etc., even successful humans often
| work towards an incorrect goal.
|
| [0] I am cultured, you're quoting memes, that AI is just a
| stochastic parrot:
| https://en.wikipedia.org/wiki/Emotive_conjugation
| notacoward wrote:
| But it's _not_ just like humans. For one thing it's
| built differently, with a different relationship between
| training and execution. It doesn't learn from its
| mistakes until it gets the equivalent of a brain
| transplant, and in fact extant AIs are _notorious_ for
| doubling down instead of accepting correction. Even more
| importantly, the AI doesn't have real-world context,
| which is often helpful to notice when "correct" (to the
| spec) behavior is not useful, acceptable, or even safe in
| practice. This is why the idea of an AI controlling a
| physical system is so terrifying. Whatever requirement
| the prompter forgot to include will not be recognized by
| the AI either, whereas a human who knows about physical
| properties like mass or velocity or rigidity will
| _intuitively_ honor requirements related to those. Adding
| layers is as likely to magnify errors as to correct them.
| ben_w wrote:
| > But it's not just like humans. For one thing it's built
| differently
|
| I'm referring to the behaviour, not the inner nature.
|
| > in fact extant AIs are notorious for doubling down
| instead of accepting correction.
|
| My experience suggests ChatGPT is _better_ than, say,
| humans on Twitter.
|
| I've had the misfortune of several IRL humans who were
| also much, much worse; but the problem was much rarer
| outside social media.
|
| > Even more importantly, the AI doesn't have real-world
| context, which is often helpful to notice when "correct"
| (to the spec) behavior is not useful, acceptable, or even
| safe in practice.
|
| Absolutely a problem. Not only for AI, though.
|
| When I was a kid, my mum had a kneeling stool she
| couldn't use, because the woodworker she'd asked to
| reinforce it didn't understand it and put a rod where
| your legs should go.
|
| I've made the mistake of trying to use RegEx for what I
| thought was a limited-by-the-server subset of HTML,
| despite the infamous StackOverflow post, because I
| incorrectly thought it didn't apply to the situation.
|
| There's an ongoing two-way "real-world context" mismatch
| between those who want the state to be able to pierce
| encryption and those who consider that to be an
| existential threat to all digital services.
|
| > a human who knows about physical properties like mass
| or velocity or rigidity will intuitively honor
| requirements related to those
|
| Yeah, kinda, but also no.
|
| We can intuit within the range of our experience, but we
| had to invent counter-intuitive maths to make most of our
| modern technological wonders.
|
| --
|
| All that said, with this:
|
| > It doesn't learn from its mistakes until it gets the
| equivalent of a brain transplant
|
| You've boosted my optimism that an ASI probably won't
| succeed if it decided it preferred our atoms to be
| rearranged to our detriment.
| notacoward wrote:
| > I'm referring to the behaviour, not the inner nature.
|
| Since the inner nature does affect behavior, that's a
| _non sequitur_.
|
| > we had to invent counter-intuitive maths to make most
| of our modern technological wonders.
|
| Indeed, and that's worth considering, but we shouldn't
| pretend it's the common case. In the common case, the
| machine's lack of real-world context is a disadvantage.
| Ditto for the absence of any actual understanding beyond
| "word X often follows word Y" which would allow it to
| predict consequences it hasn't seen yet. Because of these
| deficits, any "intuitive leaps" the AI might make are
| less likely to yield useful results than the same in a
| human. The ability to form a coherent - even if novel -
| theory and an experiment to test it is key to that kind
| of progress, and it's something these models are
| fundamentally incapable of doing.
| ben_w wrote:
| > Since the inner nature does affect behavior, that's a
| non sequitur.
|
| I would say the reverse: we humans exhibit diverse
| behaviour despite similar inner nature, and likewise
| clusters of AI with similar nature to each other display
| diverse behaviour.
|
| So from my point of view, that I can draw clusters --
| based on similarities of failures -- that encompass
| both humans and AI, makes it a non sequitur to point to
| the internal differences.
|
| > The ability to form a coherent - even if novel - theory
| and an experiment to test it is key to that kind of
| progress, and it's something these models are
| fundamentally incapable of doing.
|
| Sure.
|
| But, again, this is something most humans demonstrate
| they can't get right.
|
| IMO, most people act like science is a list of facts, not
| a method, and also most people mix up correlation and
| causation.
| have_faith wrote:
| English as a spec is incredibly "fuzzy"; there are many valid
| interpretations of intent. I don't think that can be avoided?
| sarchertech wrote:
| It can't.
| Legalese is an attempt to do so, and it's impenetrable to
| non-experts and still frequently ambiguous.
| mooreds wrote:
| > by using English as the spec, the barrier to entry has
| gone down,
|
| I'm not sure that is true. The level of back and forth and
| refinements needed indicates to me that the "English" used is
| not the normal language I use when talking to people.
|
| It's almost like a refined version of cucumber with syntax
| that is slightly more forgiving.
|
| Maybe I'm being a codger, but LLMs seem (at least for now)
| far better for summarizing and giving high level overviews of
| concepts than for nailing precise code requirements.
| ben_w wrote:
| > It's almost like a refined version of cucumber with
| syntax that is slightly more forgiving.
|
| I don't know if "cucumber" is autocorrupt or an actual
| non-vegetable thing; can you clarify?
| omnicognate wrote:
| https://cucumber.io/
|
| That "did they actually mean that or was it autowrong?"
| feeling is going to get worse I fear.
| mjr00 wrote:
| Not a typo.[0]
|
| In the 00s/early 10s, software went through a fad phase
| where people earnestly thought that by implementing
| Gherkin frameworks like Cucumber, you'd be able to hand
| off writing tests to "business people" in "plain
| English." It went about as well as you'd expect.
|
| [0] https://cucumber.io/docs/gherkin/
| ben_w wrote:
| Thanks!
|
| Despite that period being when I finished my Software
| Engineering degree, got my first job, and then attempted
| self-employment, I'd never heard of it before.
|
| Looking at the book titles -- "Cucumber Recipes" in
| particular -- even if I had encountered it, I might have
| assumed the whole thing was a joke.
| gwright wrote:
| https://cucumber.io/docs/guides/overview/
| agentultra wrote:
| But you can't determine if a statement is true by simply
| reading more words.
|
| It's also not efficient for doing higher level work. There
| was a time before we had algebra when people were still
| expressing the same ideas but the notation wasn't there.
| Mathematics was expressed in "plain language." It's extremely
| difficult for us to read. For mathematicians of the time
| there was no other way to explain algorithms or expressions.
|
| For simple programs I have no doubt that these tools enable
| more people to generate code.
|
| However it's not going to be helpful for people working on
| hypervisors, networking stacks, operating systems,
| distributed databases, cryptography, and the like yet. For
| that you need a more precise language and an LLM that can
| _reason_ about semantics and generate understandable proofs:
| _not boilerplate_ proofs either -- they have to be elegant so
| that a human reading them can understand the problem as well.
| We're still a ways from being able to do that.
| nextworddev wrote:
| Arguably reading code can't lead to definitive conclusions
| about its bug-free-ness
| agentultra wrote:
| Precisely! And neither can generating a handful of unit
| tests. As EWD would say, _they only prove the existence
| of one error._ Not that there are no errors.
|
| If we want more programs that are correct with respect to
| their specifications we need to write better, precise
| specifications... not wave our hands around. (A small
| executable example of what I mean follows below.)
|
| However for a lot of line-of-business tasks we're
| generally fine with ambiguous, informal specifications.
| We're not certain our programs would be correct with
| respect to the specifications, had we written them out
| formally, but it's good enough.
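|
| (To make "better, precise specifications" concrete -- a
| minimal sketch, not from the article, assuming the
| fast-check and vitest libraries; `mySort` is a stand-in
| function under test. A property-based test states a
| fragment of a specification that must hold for _every_
| input, rather than a single hand-picked example:)
|
|     // Executable spec fragment: the output of mySort is
|     // ordered, for every array of integers, not just the
|     // examples we happened to think of.
|     import fc from 'fast-check';
|     import { it, expect } from 'vitest';
|
|     const mySort = (xs: number[]) => [...xs].sort((a, b) => a - b);
|
|     it('produces ordered output for every input', () => {
|       fc.assert(
|         fc.property(fc.array(fc.integer()), (xs) => {
|           const out = mySort(xs);
|           for (let i = 1; i < out.length; i++) {
|             expect(out[i - 1]).toBeLessThanOrEqual(out[i]);
|           }
|         })
|       );
|     });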
|
| I think most businesses that are writing software that
| needs to be reliable and precise are not going to benefit
| from these kinds of tools for some time.
| tjr wrote:
| This is true in aerospace software. Lots of process, lots
| of specification, lots of verification. I wouldn't want
| to say that GPT-esque tools would be useless here, but I
| really don't see them offering the same kind of magic
| leverage that they might offer on some other projects.
|
| And vice-versa! Most software projects do not benefit
| from the rigor used in aerospace, because it's just not
| needed, and would be a waste of time.
|
| I am definitely seeing ways that GPT tools could speed up
| some aerospace work, but we need to be really, really sure
| that things are being done correctly... not just mostly
| correct, or seemingly correct.
| staunton wrote:
| Reading and proving a spec can though. LLMs are in
| principle capable of doing that. (If your objection is
| that the spec might have bugs, then "bug free" is
| subjective and nothing at all can ever lead to definitive
| conclusions about it)
| gumballindie wrote:
| I mean, sure, if the world were to run on basic code. Perhaps
| wordpress developers may feel slightly threatened, but even
| that is well above all examples of a"i" code i've seen.
| ZephyrBlu wrote:
| I think English as a spec actually makes the barrier to entry
| higher, not lower. Code itself is far easier to understand
| than an English description of the code.
|
| To understand an English description of code you already have
| to have a deeper understanding of what the code is doing. For
| code itself you can reference the syntax to understand what's
| going on.
|
| The prompt in this case is using very technical language that
| a beginner will have no idea about. But if you gave them the
| code they could at least struggle along and figure it out by
| looking things up.
| nextworddev wrote:
| Yes, but LLMs can also be used by laypeople to explain the
| issue in plain English too. That's the problem. Not that
| LLMs would need a human to guide the debugging process
| anyways (at least in a few years)
| ZephyrBlu wrote:
| You still have the same problem... You cannot describe a
| technical field with plain English. If you did so the
| semantics would be incorrect. There is a reason jargon
| exists.
|
| The first two paragraphs alone are absolutely chock-full
| of terms that would not be easily explained to a layperson:
|
| _" The current system is an online whiteboard system.
| Tech stack: typescript, react, redux, konvajs and react-
| konva. And vitest, react testing library for model, view
| model and related hooks, cypress component tests for
| view._
|
| _All codes should be written in the tech stack mentioned
| above. Requirements should be implemented as react
| components in the MVVM architecture pattern._ "
|
| What is every library in that list? What is a model? What
| is a view model? What is a hook, component test, view,
| MVVM, etc?
|
| If a layperson could understand explanations for all
| these things then they would not be a layperson.
| [deleted]
| lcnPylGDnU4H9OF wrote:
| This reminds me of rubber ducking[0] in how it necessitates
| a certain understanding. If one is able to explain it in
| plain English it's because it is understood.
|
| [0] https://en.wikipedia.org/wiki/Rubber_duck_debugging
| bartimus wrote:
| But there's still going to have to be a human who has the
| ability to form a mental model of the thing that needs to be
| implemented.
| Functionally and technically. The results of the LLM will
| vary depending on the level of know-how the human instructor
| has.
| SanderNL wrote:
| Except you now have a way "upwards" from an abstraction POV.
| Regular code is severely limited and highly surgical, by
| design. This is not.
|
| All these abstraction layers were invented to serve old-style
| manual coders. Why bother explaining in great detail about
| "Konva" layers and react anymore? Give it a few years and let
| it fine-tune on IT tech, and I see this being reduced to "I
| want a whiteboard app with X general characteristics", at
| which point I'd no longer speak about "programming".
| themodelplumber wrote:
| That "upwards" excludes a lot of relevant systems design
| logic that won't go away though, insofar as it is abstraction
| ad infinitum in the direction of fewer-relevant-details.
|
| What'll happen is, details will continue to be relevant as
| tastes adjust to the new normal.
|
| Like for my work, today, React is enterprise-ready, which is
| not good for me. It means it will likely dip my projects in
| unnecessary maintenance costs as compared to another widget
| of its type that does what I want in a lightweight manner.
| When I troubleshoot something of React's complexity, even my
| prompts will likely need to be longer.
|
| But also, that's just one component of one component. And you
| have to experience this stuff in the first place, to know
| that you should pay attention to these details and not those
| other ones, for a given job, for a given client, in a given
| industry, with given specs.
|
| So, if I was able to wave my hands I'd simply have all the
| problems I had back when I was a beginner. Ergo, it comes
| back to the clip art problem: Being able to buy clip art
| never made anyone a designer. But it made a lot of designers'
| jobs way easier.
|
| We are simply regressing toward the mean with regard to
| programming. It was never about computers in the first place,
| never so concerned with syntax.
|
| Anyway, back to browsing my theater program...
| SanderNL wrote:
| Fair enough, but don't we abstract "upwards" all the time?
| Assembly won't go away, but do you deal with it?
| themodelplumber wrote:
| For one, assembly ceases to be a relevant detail and is
| replaced by other relevant details.
|
| So, I can't code fast games in a 1984 workplace,
| currently, being too out of touch with assembly on a
| given chipset. But I also can't wave my hands at an LLM
| and expect a modern, fast game of the desired quality to
| code itself. (Even though a clip art-style result is
| possible, the requirements are always going to be special
| details)
|
| The upwards direction example is also interesting because
| it's foundational to the cognitive functionality of one
| of the Jungian personality types. But other personality
| perspectives also apply to coding, which means in part
| that the directional, metaphorical-abstraction view can
| effectively be a blind spot if we map it as the preferred
| view on outcomes.
|
| The most common blind spot for this personality involves
| questions of relevant details, and their intersection
| with planning for yet-unknowns. There is a tendency to
| hand-wave which ends up being similar to prophetic
| behavior. Jung called this the "voice in the wilderness",
| noting that it can easily detach from sensibility
| (rationality) by departing from life details. Kind of
| interesting stuff.
|
| (Ni-dominant type)
| SanderNL wrote:
| Now you got me on the edge of my seat.
| What is this personality type?
| themodelplumber wrote:
| Ni-dominant. It exists nowadays in various post-Jungian
| models, many of which are really fascinating, having
| fleshed it out a lot.
|
| The opposing function to Ni is Se, which creates a
| dichotomy of planning/foreseeing vs. doing/performing.
| The functions oscillate as a kind of duty cycle, so a lot
| of sages out there have hobbies as musicians, stage
| magicians, etc.
|
| This dichotomy also effectively shuts out detail memory
| for context, dealing mostly with present vs. future. Even
| nostalgia is often ignored on the daily. So a Ni-dom will
| usually describe their memory as pattern-based, gestalt,
| more vague or general, etc.
| rootusrootus wrote:
| I would like to subscribe to your newsletter.
|
| Even if approximately 75% of that sailed right over my
| head.
| themodelplumber wrote:
| Best I can do is RSS!
| SanderNL wrote:
| I couldn't quite tell if you found a beautiful way to
| insult me, but it is fascinating indeed. I _am_
| hand-wavey and I understand its failure modes quite well,
| unfortunately. It's cool to talk about it at this level
| of abstraction.
| themodelplumber wrote:
| No insult intended... I don't really know how much it
| applies in your case, but since you really took on that
| viewpoint, that's when the personality theory side of me
| goes, "well if this is a favored viewpoint then there IS
| this idea about the population that favors this
| viewpoint" :-) And thoughts about GPT are generally
| crafted from general personality positions, in the
| absence of other relevant self-development experience.
|
| I agree, it's cool stuff
| wpietri wrote:
| Yeah, I think there's a "stone soup" effect going on with AI.
|
| It's the same sort of thing you see happening with the
| customers of psychics. People often have poor awareness of
| how much they're putting into a conversation. Or it's a bit
| like the way Tom Sawyer tricks other kids into painting the
| fence for him. For me a lot of the magic here is in knowing
| what questions to ask and when the answers aren't right. If
| you have those skills, is pounding out the code that hard?
|
| The interesting part for me is not generating new bits of
| code, but the long-term maintenance of a whole thing. A while
| back there was a fashion for coding "wizards", things that
| would ask some questions and then generate code for you.
| People were very excited, as they saw it as lowering the
| barrier to entry. But the fashion died out because it just
| pushed all the problems a bit further down the road. Now you
| had novice developers trying to understand and improve code
| they weren't competent to write.
|
| I suspect that in practice, anything a person can get an LLM
| to wholly write is also something that could be turned into a
| library or framework or service or no-code tool that they can
| just use. That, basically, if the novelty is low enough that
| an LLM can produce it, the novelty is low enough that there
| are better options than writing the code from scratch over
| and over.
| baq wrote:
| I mostly agree except one critical detail: LLMs are _the_
| low-code/no-code service. You literally tell them what you
| want and if they're fine-tuned on the problem domain, you're
| all set. Microsoft demo'd the Office 365 integration and if
| it works half as well in practice they'll own the space as
| much as they did in 1997.
| wpietri wrote:
| Maybe they will be, but that's not proven yet. We'll see!
| If anything, the article we're looking at suggests that the
| "tell them what you want" step is not obviously much less
| rigorous or effortful than coding. Tuning could make the
| difference, or it could be one of those things that
| produces better demos than results.
| harlanlewis wrote:
| Great points (and after checking your user name, I've been
| nodding my head to posts of yours for about a decade now).
|
| This is a bit tangential - your reference to stone soup is a
| wonderful example of the information density possible with
| natural language. And all the meaning and story behind the
| phrase is accessible to LLMs.
|
| I'll have to start experimenting with idiom-driven
| development, especially when prompt golfing.
| 62951413 wrote:
| I believe the Model Driven Architecture fad
| (https://en.wikipedia.org/wiki/Model-driven_architecture) is
| a better analogy than wizards. Back then the holy grail of
| complete round-trip UML->code->UML never got practical enough
| to justify the effort.
| zackmorris wrote:
| This is an amazing demonstration, but I'm worried that when
| this goes mainstream, we'll inherit a ton of baggage from
| today's programming. Specifically:
|
| * The tests are written in BDD style "it('should xyz')",
| which programmers do in code like this for convenience. But
| if we're automating their creation, then actual
| human-readable Cucumber clauses would be more useful (the two
| styles are contrasted in a short sketch below). Maybe the
| tests can be transpiled. This isn't the AI's fault, but more
| of a symptom of how the original spirit of BDD as a means for
| nonprogrammers to test business logic seems to have been
| lost.
|
| * React hooks and Redux syntax are somewhat
| contrived/derivative. The underlying concepts like functional
| reactive programming and reducers are great, but the syntax
| is often repetitive or verbose, with a lot of boilerplate to
| accomplish things that might be one-liners in other
| languages/frameworks. This is more of a critique of the state
| of web programming than of the AI's performance.
|
| * MVVM is a fine pattern, but at the end of the day, it's an
| awful lot of handwaving to accomplish limited functionality.
| What do I mean by that? Mainly that I question whether the
| frontend needs models, routes, controllers (which I realize
| are MVC), etc. I mourn that we lost the idempotent #nocode
| HTML of the 90s and are back to manually writing app
| interfaces by hand in Javascript (like we did for native
| desktop apps in the C++ OOP days) when custom
| elements/components would have been so much easier. HTMX
| combined with some kind of distributed serverless lambda
| functions (that are actually as simple as they should be)
| would reduce pages of code to a WYSIWYG document that
| nonprogrammers could edit.
|
| What I'm really getting at is that I envisioned programming
| going a different direction back in the late 90s. We got
| GPUs/TensorFlow and Docker and WebAssembly and Rust etc etc
| etc. And these things are all fine, but they're
| contrived/derivative too. More formal systems might look like
| multicore/multimemory transputers (or Lisp machines), native
| virtual machines with full sandboxing built in so anything
| can run anywhere, immutable and auto-parallelized languages
| like HigherOrderCO/HVM or true vector processing with GNU
| Octave (MATLAB) so that we don't have to manually manage
| vertex buffers or free memory, etc.
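|
| (The sketch promised above -- a hedged illustration, not
| code from the article. It contrasts the programmer-facing
| `it('should ...')` style with a human-readable Gherkin
| clause bound to cucumber-js step definitions; `Whiteboard`
| is a hypothetical model object, and the two fragments would
| live in separate files in practice:)
|
|     // BDD style in the vein of the article's generated tests (vitest):
|     import { describe, it, expect } from 'vitest';
|     import { Whiteboard } from './whiteboard'; // hypothetical model
|
|     describe('whiteboard', () => {
|       it('should add a shape to the canvas', () => {
|         const board = new Whiteboard();
|         board.addShape({ kind: 'rect', x: 0, y: 0 });
|         expect(board.shapes).toHaveLength(1);
|       });
|     });
|
|     // The equivalent human-readable Gherkin clause...
|     //
|     //   Scenario: adding a shape
|     //     Given an empty whiteboard
|     //     When the user adds a rectangle
|     //     Then the whiteboard contains 1 shape
|     //
|     // ...bound to code with cucumber-js step definitions:
|     import { Given, When, Then } from '@cucumber/cucumber';
|     import assert from 'node:assert';
|
|     let board: Whiteboard;
|     Given('an empty whiteboard', () => {
|       board = new Whiteboard();
|     });
|     When('the user adds a rectangle', () => {
|       board.addShape({ kind: 'rect', x: 0, y: 0 });
|     });
|     Then('the whiteboard contains {int} shape', (n: number) => {
|       assert.equal(board.shapes.length, n);
|     });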
|
| I've had architectures in mind for better hardware and
| programming languages for about 25 years (that's why I got my
| computer engineering degree) but I will simply never have
| time to implement them. All I do is work and cope. I just
| keep watching as everyone reinvents the same imperative
| programming wheel over and over again. And honestly it's gone
| on so long that I almost don't even care anymore. It feels
| more appealing in middle age to maybe just go be a hermit,
| get out of tech. I've always known that someday I'd have to
| choose between programming and my life.
|
| Anyway, now that I'm way too old to begin the training, I
| wonder if AI might help to rapidly prototype truly innovative
| tools. Maybe more like J.A.R.V.I.S., where it's just on all
| of the time and can iterate on ideas at a superhuman rate to
| assist humans in their self-actualization.
|
| Then again, once we have that, it becomes trivial to
| implement the stuff that I rant about. Maybe we only have
| about 5-10 years until all of the problems are solved. I mean
| all of them, everywhere, in physics/chemistry/biology/etc.
| Rather than automating creative acts and play as AI is doing
| now. If the Singularity arrives in 2030 instead of 2040, that
| also seems like a strong incentive to go be a hermit.
|
| Does any of this resonate with anyone? That somehow
| everything has gone terribly wrong, but it's more of a hiccup
| than a crisis? That maybe the most impactful thing that any
| of us can do is... wait for things to get better?
| davidthewatson wrote:
| This is deeply resonant with me for the following reasons:
|
| 1) Age
|
| 2) BDD-style, or what I call a madlib proxy for playing
| cucumber on TV. Not a fan, having used it in an RoR context I
| can only call hipster-engineering, not what DHH described.
|
| 3) I just had the discussion on redux vs. datomic vs. riak
| with friends yesterday.
|
| 4) Ditto the conversation on MVVM and the implied constraint
| complexity of putting nodejs and chromium in the same
| deployment package and calling it electron while carrying on
| about how simple it is relative to... a world where
| everything is actually native all the way down?
|
| 5) Me too on the CASE era.
|
| 6) Cue Donald Knuth on literate programming. One thing that
| cucumber is not, but I think taking another iteration at
| literate programming in light of GPT or LLMs is a good idea,
| since Knuth is never wrong, just 50 years ahead of his time.
| But we need a collaboration of human-computer agents that is
| patterned on a sensemaking protocol that can resolve
| subjective truth by consensus of man and machine. How else
| could you possibly resolve the fact that the SOTA lies to me
| on a daily basis while defending itself and its lack of
| veracity with force, in what can only be seen as emulating
| the culture of one's parents?
|
| 7) Yes, AI should help on the iterations. Those short design
| sketch-to-demo cycles we used to do at the design studio,
| with sketch on Monday and demo on Friday, should be much
| easier today -- breakfast sketch to dinner demo -- but I
| don't think they are. The tooling is radically better, but
| that better has come at the cost of complexity and going
| sideways, neither of which is being fully felt and accounted
| for reflectively, i.e. they're not how you get to typing less
| and having the tools do the work, because when they break,
| the debugging is mind-crushing.
|
| 8) I think the thing that's missing in the trivial part is
| that it's not actually trivial, particularly because the
| software is the message. That insight stems from the fact
| that software has emergent properties -- extensibility,
| composability, and a resultant rate of change -- that make it
| very difficult to compare from decade to decade; software's
| fundamental disequilibrium stems from a full stack in
| constant flux, a mad hatter's pop culture where we never sing
| the same song twice. There's value in theme and variations if
| it can be modeled as improvisational human-computer design
| pairing rather than yet another orchestration. Joe Beda was
| as right about improvisation as Knuth is about the art of
| computer programming.
|
| 9) I guess the t-shirt is: I'm not waiting...
|
| 10) In the immortal words of Raymond Loewy: Never leave well
| enough alone.
|
| If there's a set of artifacts in software that achieve what I
| hope for with AI, it's somewhere between Bret Victor and
| https://iolanguage.org/
| romland wrote:
| I started a bit of an exploration around prompts and code a
| week or three back. I want to figure out the down/up-sides
| and create tools for myself around it.
|
| So, for this project (a game), I decided "for fun" to try to
| not write any code myself, and avoid narrow prompts that
| would just feed me single functions for a very specific
| purpose. The LLM should be responsible for this, not me! It's
| pretty painful since I still have to debug and understand the
| potential garbage I was given and, after understanding what
| is wrong, get rid of it and change/add to the prompt to get
| new code. Very often completely new code[1]. Rinse and repeat
| until I have what I need.
|
| The above is a contrived scenario, but it does give some
| interesting insights. A nice one is that since there are one
| or more prompts connected to all the code (and its commit),
| the intention of the code is very well documented in natural
| language. The commit history creates a rather nice story that
| I would not normally get in a repository.
|
| Another thing is, getting an LLM (ChatGPT mostly) to fix a
| bug is really hit and miss, and mostly miss for me. Say a
| buggy piece comes from the LLM and I feel that this could
| almost be what I need. I feed that back in with a hint or two
| and it's very rare that it actually fixes something unless I
| am very, very specific (again, needing to read/understand the
| intention of the solution). In many cases I, again, get
| completely new code back. This, more than once, forced my
| hand to "cheat" and do human changes or additions.
|
| Due to the nature of the contrived scenario, the code quality
| is obviously suffering, but I am looking forward to making
| the LLM refactor/clean things up eventually.
|
| On occasion ChatGPT tells me it can't help me with my
| homework. Which is interesting in itself. They are actually
| trying (but failing) to prevent that. I am really curious how
| gimped their models will be going forward.
|
| I've been programming for quite a long time. I've come to
| realize that I don't need to be programming in the
| traditional sense. What I like is creating. If that means I
| can massage an LLM to do a bit of grunt work, I'm good with
| that.
|
| That said, it still often feels very much like programming,
| though.
|
| [1] The completely new code issue can likely be alleviated by
| tweaking transformer settings
|
| Edit: For the curious, the repo is here:
| https://github.com/romland/llemmings and an example of a
| commit from the other day:
| https://github.com/romland/llemmings/commit/466babf420f617dd...
| - I will push through and make it a playable game; after
| that, I'll see.
| celeritascelery wrote:
| That is a really interesting experiment! I have so many
| questions.
|
| - do you feel like this could be a viable work model for real
| projects? I recognize it will most likely be more effective
| to balance LLM code with hand-written code in the real world.
|
| - some of your prompts are really long. Do you feel like the
| code you get out of the LLM is worth the effort you put in?
|
| - given that the code returned is often wrong, do you feel
| like this could be feasible for someone who knows little to
| no code?
|
| - it seems like you already know well all the technology
| behind what you are building (i.e. you know how to write a
| game in js). Do you think you could do this without already
| having that background knowledge?
|
| - how many times do you have to refine a prompt before you
| get something that is worth committing?
| romland wrote:
| I think it could be viable, even right now, with a big
| caveat: you will want to do some "human" fixes in the code
| (not just the glue between prompts). The downside of that is
| you might miss out on parts of the nice natural language
| story in the commit history. But the upside is you will save
| a lot of time.
|
| Down the line you will be able to (cheaply) have LLMs know
| about your entire code-base and at that point, it will
| definitely become a pretty good option.
|
| On prompt length, yeah, some of those prompts took a long
| time to craft. The longer I spend on a prompt, the more
| variations of the same code I have seen -- I probably get
| impatient and biased and home in on the exact solution I want
| to see instead of explaining myself better. When it's gone
| that far, it's probably not worth it. Very often I should
| probably also start over on the prompt, as it probably can be
| described differently. That said, if it was in the real world
| and I was fine with going in and massaging the code fully,
| quite some time could be saved.
|
| If you don't know how to code, I think it will be very hard.
| You would at the very least need a lot more patience. But on
| the flip side, you can ask for explanations of the code that
| is returned, and I must actually say that that is often
| pretty good -- albeit very verbose in ChatGPT's case. I find
| it hard to throw a real conclusion out there, but I can say
| that domain knowledge will always help you. A lot.
|
| I think if you know javascript, you could easily make a game
| even though you had never ever thought about making a game
| before. The nice thing about that is that you will probably
| not do any premature optimization at least :-)
|
| All in all, some prompts were nailed down on the first try;
| the simple particle system was one such example. Some other
| prompts -- for instance the map generation with Perlin noise
| -- might take 50 attempts.
|
| A lot of small decisions are helpful, such as deciding
| against any external dependencies. It's pretty dodgy to ask
| for code around something (e.g. some noise library) that you
| need to fit into your project. I decided pretty early that
| there should be no external dependencies at all and all
| graphics would be procedurally generated.
| It has helped me, as I don't need to understand any libraries
| I have never used before.
|
| Another note related to the above: an upside and downside of
| high-ish temperature is that you get varying results. I think
| I should probably change my behaviour around that and
| possibly tweak it depending on how exact I feel my prompt is.
|
| I find myself often wondering where the cap of today's LLMs
| is, even if we go in the direction of multi-models and have a
| base which does the reasoning -- and I have to say I keep
| finding myself getting surprised. I think there is a good
| possibility that this will be the way some kinds of
| development will be done. But, well, we'd need good local
| models for that if we work on projects that might be of a
| sensitive nature.
|
| Related to the number of prompt attempts: I think the game
| has cost me around $6 in OpenAI fees so far.
|
| One particularly irritating (time-consuming) prompt was
| getting animated legs and feet:
| https://github.com/romland/llemmings/commit/e9852a353f89c217...
| [deleted]
| ChatGTP wrote:
| Just curious, you're using which version?
| romland wrote:
| I have experimented quite a bit with various flavours of
| LLaMa, but have had little success in actually getting
| not-narrow outputs out of them.
|
| Most of the code in there now is generated by gpt-3.5-turbo.
| Some commits are by GPT-4, and that is mostly due to context
| length limitations. I have tried to put which LLM was used in
| every non-human commit, but I might have missed it in some.
| sk0g wrote:
| That's a beautiful readme, starred!
|
| Out of curiosity, right now would you say you have saved time
| by (almost) exclusively prompting instead of typing the code
| up yourself? Do you see that trending in another direction as
| the project progresses?
| romland wrote:
| It was far easier to get big chunks of work done in the
| beginning, but that is pretty much how it works for a human
| too (at least for me). The thing that limits you is the
| context-length limit of the LLM, so you have to be rather
| picky about what existing code you feed back in. With this
| then comes the issue of all the glue between the prompts, so
| I can see that the more polished things will need to become,
| the more human intervention -- this is a trend I already very
| much see.
|
| If there is time saved, it is mostly because I don't fear
| some upcoming grunt work. Say, for instance, creating the
| "Builder" lemming. You know pretty much exactly how to do it,
| but you know there will be a lot of off-by-one errors and
| subtle issues. It's easier to go at it by throwing together
| some prompt a bit half-heartedly and seeing where it goes.
|
| On some prompts, several hours were spent, mostly reading and
| debugging outputs from the LLM. This is where it eventually
| gets a bit dubious -- I now know pretty much exactly how I
| want the code to look since I have seen so many variants. I
| might find myself massaging the prompt to narrow in on my
| exact solution instead of making the LLM "understand the
| problem".
|
| Much of this is due to the contrived situation (human should
| write little code) -- in the real world you would just fix
| the code instead of the prompt and save a lot of time.
|
| Thank you, by the way! I always find it scary to share links
| to projects! :-)
| sk0g wrote:
| No worries, going to check out some of the commits when I
| get a bit more free time as well. The concept is
| intriguing!
|
| The usefulness of LLMs for engineering things is very hard
| to gauge, and your project is going to be quite interesting
| as you progress. No doubt they help with writing new
| things, but I spend maybe ~15% of my time working on
| something new, vs maintenance and extensions. The more
| common activities are very infrequently demonstrated;
| either the usefulness diminishes as the context required
| grows, or they simply make for less exciting examples.
| Though someone in my org has brought up an LLM tool that
| tries to remedy bugs on the fly (at runtime), which sounds
| absolutely horrific to me...
|
| It sounds similar to my experience with Copilot then. In
| small, self-contained bits of code -- much more common in
| new projects or microservices, for example -- it can save a
| lot of cookie-cutter work. Sometimes it will get me 80% of
| the way there, and I have to manually tweak it. Quite often
| it produces complete garbage that I ignore. All that to
| say, if I wasn't an SE, Copilot would bring me no closer to
| tackling anything beyond hello world.
|
| One big benefit though is with the simpler test cases. If I
| start them with a "GIVEN ... WHEN ... THEN ..." comment,
| the autocompletes for those can be terrific, requiring
| maybe some alterations to suit my taste. I get positive
| feedback in PRs and from people debugging the test cases
| too, because the intention behind them is clear without
| needing to guess the rationale for the test. Win-win!
| moonchrome wrote:
| I feel like this is a bunch of ceremony and back and forth,
| and also, considering GPT-4 speed, I feel like I would fly
| past this approach just using copilot and coding.
|
| I look forward to offloading these kinds of tasks to LLMs but
| I'm not seeing the value right now. Using them feels slow and
| unsatisfying: you need to triple-check everything and specify
| everything relevant for context.
|
| Also, maybe it's just me, but verbalizing requirements
| unambiguously can often be harder than writing the code for
| them. And it's not fun. If GPT-4 were as fast as GPT-3.5, it
| would probably be a completely different story.
| mrbonner wrote:
| I can't help but think: isn't this way more work than just
| coding it myself? Anybody else have the same thought?
| all2 wrote:
| It depends how you use it. I've been using it to skip
| boilerplate coding and get straight to the meaty bits. It
| took me a few days to sketch out an application using ChatGPT
| to handle the boilerplate, including dependency management
| (python, poetry, etc.).
|
| I've had to handle the specific pieces of implementation
| myself. Especially unit testing new pieces of code. When
| asked to generate unit tests, it does ok, but it doesn't get
| the spirit of the code (my intended purpose) and so I'm left
| filling in a bunch of blanks.
| [deleted]
| greenhearth wrote:
| Is it me, or does this just create a bunch of extra steps and
| gratuitous complexity? These tools don't seem all that
| efficient, nor do they make anything easier. I'm sorry to the
| enthusiasts here - I am usually excited about AI and a
| student of Computational Linguistics, but I think this
| emperor is naked.
| wudangmonk wrote:
| I have been trying to find a use case for these LLMs and I
| continue to keep an eye out just in case someone figures out
| a way to use them that I find useful in my workflow. My only
| use for them so far is as an explorative tool for tasks I'm
| not familiar with, such as when having to work with
| programming languages I never use.
| For such things it's great, as not only do I not have to go
| digging through the documentation, I also do not have to then
| search the web for examples of how it's actually used.
|
| This is taking into account that I have reduced the cost of
| using it as much as possible, since I do not have to switch
| to a browser tab, ask my question, wait for the reply and
| then copy any useful text to my editor. I have it set up as a
| function call inside my repl, along with history saved to a
| local file in case I need it.
|
| Even with this convenient way of using it, I notice that
| pretty much the only time I use it when working on my actual
| projects is just to save me the trouble of doing a google
| search for trivial things such as looking up word
| definitions/synonyms for naming things, or for anything else
| where I would expect to find the answer with just a bit of
| googling. I can just quickly do my request and continue with
| whatever I was doing and then return for my answer later.
| mov_eax_ecx wrote:
| How to overengineer with an LLM: don't state the requirements
| clearly, shove your pet patterns in first, treat following
| the redux slice/awareness-hook pattern as more important than
| having a working solution, never trust your developers to
| make decisions, and worry more about how it is built than
| about building a solution.
|
| My way to work with an LLM is to have a good, clear
| requirement, make the LLM write a possible file organization,
| query the contents of each file (just the code, no comments),
| and assemble a working prototype fast; then you can iterate
| over the requirements and evolve from there.
| lyjackal wrote:
| Generally, I agree that approach works well. It's going to
| perform better if it's not trying to fulfill your team's
| existing patterns. On the other hand, allowing lots of
| inconsistencies in style in your large code base seems like a
| quick way to create a hot mess. Chat prompts seem like a
| really difficult way to communicate code style and
| conventions though. A sibling comment to yours mentions that
| a copilot autocomplete seems like a much better pattern for
| working in an existing code base, and I tend to agree that's
| much more promising. Read the existing code, and recommend
| small pieces as you type.
| moonchrome wrote:
| How often do you get working code that way? Unless it's
| something trivial that fits in its scope I'd say that's going
| to produce garbage. I've seen it steer into garbage on longer
| prompt chains about a single class (of medium complexity) - I
| doubt it would work at the project level. Mind sharing the
| projects?
| mov_eax_ecx wrote:
| I work only with closed-source codebases and use this
| approach for prototypes, but, using the same example as the
| blog, I prompt: "the current system is an online whiteboard
| system. Tech stack: react, use some test framework, use
| konva for the canvas, propose a file organization, print
| the file layout tree. (without explanations)." The trick is
| that for every chat the context is the requirement + the
| filesystem + the specific file, so you don't have the
| entire codebase in the context, only the current file.
| Also, use gpt4; gpt3 is not good enough.
|
| My main point is that the blog post's final output is mock
| test awareness hook redux, where an architect feels good
| seeing his patterns; with my approach you have a prototype
| online whiteboard system.
| blatant303 wrote:
| Is there a tool that allows you to do this within a text
| editor (for instance VS Code)? Using selection instead of
| copy-pasting.
| Having the LLM store its output directly within local files.
| Maybe giving it access to a shell to run the tests on its
| own?
| mxuribe wrote:
| If there isn't a tool currently...then give it time, and
| eventually there will be tools similar to what you described.
| I'd guess they'll be called something like prompt editors
| (like text editors, etc.). ...Or, maybe they'll be called
| chatditors...no, no, prompt editors is better. ;-)
| xenospn wrote:
| Copilot X will include that capability.
| amelius wrote:
| What I want is a prompt that continuously copies whatever I'm
| doing, so I can ask it to complete the task.
|
| For example, say I'm converting all identifiers in a file
| from lowercase to CamelCase. Then after doing like 3 of them,
| I can ask the LLM to take over and do the remainder.
| chrisco255 wrote:
| I mean, that kind of task is more than easy to do today. You
| could probably just create a VS Code extension where you type
| "convert all identifiers in this file that match this pattern
| from lowercase to camel case" and pipe that to the GPT API to
| instantly do it (without even needing to give it the first 3
| examples). (A rough sketch of the core call appears at the
| end of this block.)
| amelius wrote:
| Sometimes just doing stuff takes less energy than thinking
| about how it can be automated.
| clarge1120 wrote:
| Great example of how a GPT can reason on your behalf and
| dramatically improve your performance. For instance, it could
| watch for inconsistent approaches to design or even continue
| a complex implementation you've started just from examining
| context signals.
| gdubs wrote:
| There's an unfortunately common take on AI that goes
| basically like this:
|
| "I tried it and it didn't do what I wanted, not impressed."
|
| My suggestion is to tune out the noise and really try
| experimenting with these tools - and know that they're
| rapidly improving. Even if ultimately you have criticisms or
| decide one way or another, at least really investigate them
| for your own use-cases rather than jumping on a bandwagon
| that's either "AI is bad" or the breathless hype-machine at
| the other end.
| mise_en_place wrote:
| I was very impressed when it showed me the different
| techniques for deep reinforcement learning. However, where it
| struggles is when building an agent, because you will need a
| large number of tokens to template a prompt, in the case of
| langchain or AutoGPT.
| rootusrootus wrote:
| I agree it's a good idea to take a moderate approach. The
| hype that LLMs are going to replace SWEs is clearly just
| that, hype, if you've done any real work trying to get GPT4
| to give you the code you want. But it's also clearly a very
| useful tool. I think it'll absolutely destroy Stack Overflow.
| z3c0 wrote:
| I am very critical of the LLM hype, but the threat to
| stackoverflow is evident. As with stackoverflow, I never
| write code verbatim that comes from even GPT4. I frequently
| find issues in the output, as the code I write is generally
| very context-specific. However, I find the back-and-forth,
| with interesting tidbits of info dropped here and there,
| amounts to something like rubber duck debugging on steroids.
| [deleted]
| tarruda wrote:
| > The hype that LLMs are going to replace SWEs is clearly
| just that, hype
|
| LLMs cannot replace anyone, but it is clear that engineers
| who master LLM usage might multiply their productivity by a
| lot.
|
| The question is: If one LLM-assisted engineer can work 10x
| faster, will companies reduce their engineering staff by 90%?
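|
| (The sketch referenced by chrisco255 above -- a minimal,
| hedged illustration of the core API call only, not a full
| extension. It assumes the OpenAI chat completions endpoint
| as it existed in early 2023; `rewriteSelection` and the
| prompt wording are made up for the example:)
|
|     // Send the editor selection plus an instruction to the GPT API
|     // and return the rewritten code. OPENAI_API_KEY must be set.
|     async function rewriteSelection(
|       selection: string,
|       instruction: string
|     ): Promise<string> {
|       const res = await fetch('https://api.openai.com/v1/chat/completions', {
|         method: 'POST',
|         headers: {
|           'Content-Type': 'application/json',
|           Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
|         },
|         body: JSON.stringify({
|           model: 'gpt-3.5-turbo',
|           // Low temperature: mechanical edits want repeatable output.
|           temperature: 0,
|           messages: [
|             { role: 'system', content: 'You rewrite code. Reply with code only.' },
|             { role: 'user', content: `${instruction}\n\n${selection}` },
|           ],
|         }),
|       });
|       const json = await res.json();
|       return json.choices[0].message.content;
|     }
|
|     // e.g. rewriteSelection(editorSelection,
|     //   'Convert all identifiers in this snippet to CamelCase.');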
| majormajor wrote:
| I've worked at far more companies with miles of product
| idea backlog we never get to than ones with nothing for
| engineering to do.
|
| Now product will be able to use an LLM to come up with
| feature proposals and design docs even faster! :o
|
| So: are you working at a company where engineering is a
| cost center or a revenue center? The latter wants to get
| more done at the same cost _much_ more than it wants to
| just cut spend.
| nuancebydefault wrote:
| To answer your question with a question, if I may -- when
| did a productivity increase in software ever result in
| headcount reduction? The competition also will have a
| similar productivity gain.
| MacsHeadroom wrote:
| > when did a productivity increase in software ever result
| in headcount reduction? The competition also will have a
| similar productivity gain.
|
| The average AI company has like 1 employee per $25M
| valuation. That's around 25x fewer employees than the
| typical tech company.
| drowsspa wrote:
| Yet, the whole movement of getting blue-collar workers to
| code seems to have lost its steam.
| gumballindie wrote:
| Probably because "graduating" from bootcamps doesn't make
| one a SWE and people figured out it's a scam?
| peterashford wrote:
| Of course, there's the issue that a lot of the info for
| useful LLMs probably comes from places like Stack Overflow
| lcnPylGDnU4H9OF wrote:
| > destroy Stack Overflow
|
| It'll be interesting to see how future training data is
| sourced.
| rootusrootus wrote:
| Github would be my first guess.
| lcnPylGDnU4H9OF wrote:
| That does seem like a likely option. Discussions on
| issues alongside the actual working (and not working)
| code.
| [deleted]
| svachalek wrote:
| You simply need the system to train itself on its own
| interactions, like how search engines improve results by
| counting clicks.
| lcnPylGDnU4H9OF wrote:
| I'm not wondering about how the system will determine
| what's most helpful, but how it will determine what's
| even "correct". A model will learn what's "correct" from
| Stack Overflow by finding accepted or highly-voted
| answers, but when it can't find such content anymore (in
| this case because Stack Overflow is hypothetically gone),
| then what would even exist to generate these discussions
| to be used as training data?
|
| Github, per the sibling comment, is a good example
| because projects will have issues (tied to the individual
| repository of source code, to be seen as a working
| implementation of the idea) which will be where such
| discussions happen.
| LawTalkingGuy wrote:
| Those topics that AI replaces the forums for won't need
| discussion. People won't be confused about that thing
| because the coding AI knows the details of it. Soon
| that'll be most syntax questions, soon simple to
| mid-level algorithms, etc.
| ok_dad wrote:
| When Google search became important, people structured
| their information so that Google could best index it.
| When AIs become important in the same way, people will
| start to structure their information so that a particular
| class of AI can best index it. If that involves API
| documentation, perhaps there will be a standard format
| that AIs understand the best.
| spaceman_2020 wrote:
| People also forget that the model is trained on older data.
| At first, it will default to referencing out-of-date
| frameworks and solutions, but if you tell it that its code
| isn't working, it will usually correct itself.
| alexashka wrote: | You may be underestimating how much meaning people derive | _from_ jumping on bandwagons and having a simple to understand | group identity. | | Your suggestion would make many people unhappy. They can't win | the competence game and hence 'really investigating' is a | losing proposition for them. What they _can_ do is jump on | bandwagons very quickly, hoping to score a first mover | advantage. | | How much of an advantage would one get from taking a couple of | years to _really_ investigate Bitcoin and the algorithms | involved, vs buying some as early as possible and telling | everyone else how great it is? :) | joseph_grobbles wrote: | [dead] ___________________________________________________________________ (page generated 2023-04-18 23:00 UTC)