[HN Gopher] A guidance language for controlling LLMs
___________________________________________________________________

A guidance language for controlling LLMs

Author : evanmays
Score  : 322 points
Date   : 2023-05-16 16:14 UTC (6 hours ago)

(HTM) web link (github.com)
(TXT) w3m dump (github.com)

| Animats wrote:
| Is this a "language", or just a Python library?

| ahnick wrote:
| This strikes me as being very similar to Jargon
| (https://github.com/jbrukh/gpt-jargon), but maybe more formal in
| its specification?

| jxy wrote:
| They must hate lisp so much that they opt to use {{}} instead.

| evanmoran wrote:
| It's not so much against lisp as that double curly braces are a
| classic string-templating style common in web programming. I saw
| it first in `mustache.js` (first released around 2009), but it
| was probably used even before that.
|
| https://github.com/janl/mustache.js/

| armchairhacker wrote:
| The problem with Lisp is that parentheses are common in regular
| text. {{ is not.
|
| Of course, input from the user should be escaped, but prompts
| given by the programmer may contain parentheses, and there would
| be no way to disambiguate between the prompt and the DSL.

| m3kw9 wrote:
| I'm not understanding how guidance acceleration works. It says
| "This cuts this prompt's runtime in half vs. a standard
| generation approach," and it gives an example of asking the LLM
| to generate JSON. I don't see anywhere how it accelerates
| anything, because it's a simple JSON completion call. How can
| you accelerate that?

| evanmays wrote:
| The interface makes it look simple, but under the hood it
| follows a similar approach to jsonformer/clownfish [1], passing
| control of generation back and forth between a slow LLM and
| relatively fast Python.
|
| Let's say you're halfway through a generation of a JSON blob
| with a name field and a job field, and have already generated
|
|     { "name": "bob"
|
| At this point, guidance will take over generation control from
| the model to generate the next text:
|
|     { "name": "bob", "job":
|
| If the model had generated that, you'd be waiting 70 ms per
| token (informal benchmark on my M2 Air). A comma, followed by a
| newline, followed by "job": is 6 tokens, or 420 ms. But since
| guidance took over, you save all that time.
|
| Then guidance passes control back to the model for generating
| the next field value:
|
|     { "name": "bob", "job": "programmer"
|
| "programmer" is 2 tokens and the closing " is 1 token, so this
| took 210 ms to generate. Guidance then takes over again to
| finish the blob:
|
|     { "name": "bob", "job": "programmer" }
|
| [1] https://github.com/1rgs/jsonformer
| https://github.com/newhouseb/clownfish
|
| Note: guidance is a much more general tool than these.
|
| Edit: spacing

| jackdeansmith wrote:
| By not generating the fixed JSON structure (brackets, commas,
| etc.) and skipping the model ahead to the next tokens you
| actually want to generate, I think.
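To make the walkthrough above concrete, here is roughly what the
name/job example could look like as a guidance program, going by the
templating API shown in the repo's README. This is a minimal sketch:
the model name and the stop conditions are illustrative assumptions,
not code from the thread. Everything outside the {{gen}} tags is
emitted by fast Python templating, so the model only spends forward
passes on the field values.

    import guidance

    # A local open model lets guidance interleave template text with
    # model output in one pass (the model name is an assumption).
    guidance.llm = guidance.llms.Transformers(
        "stabilityai/stablelm-base-alpha-3b")

    # Braces, quotes, commas, and keys come from the template; only
    # the values are generated, each stopping at the closing quote.
    program = guidance('''{
        "name": "{{gen 'name' stop='"'}}",
        "job": "{{gen 'job' stop='"'}}"
    }''')

    out = program()
    print(out["name"], out["job"])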
| sharemywin wrote:
| It does look like it makes it easier to code against a model.
| But is this supposed to work alongside langchain or Hugging Face
| agents, or as an alternative to them?

| slundberg wrote:
| As others mentioned, this was initially developed before
| LangChain became widely used. Since it is lower level, you can
| leverage other tools with it, like any vector store interface
| you like, such as one from LangChain. I think writing complex
| chain-of-thought structures is much more concise in guidance,
| since it tries to keep you as close as possible to the real
| strings going into the model.

| evanmays wrote:
| It's in langchain-competitor territory, but also much lower
| level and less opinionated. E.g. guidance has no vector store
| support, but it does manage key/value caching on the GPU, which
| can be a big latency win.

| ttul wrote:
| The first commit was on November 6th, but it didn't show up in
| the Web Archive until May 6th, suggesting it was developed
| mostly in private and in parallel with LangChain (LangChain's
| first commit on GitHub is from about October 24th). Microsoft's
| code is very tidy and organized. I wonder if they used this tool
| internally to support their LLM research efforts.

| Der_Einzige wrote:
| There has been a huge explosion of awesome tooling that utilizes
| constrained text generation.
|
| A while ago, I tried my own hand at constraining the output of
| LLMs. I'm actively working on making it better, especially with
| the lessons learned from repos like this and from guidance.
|
| https://github.com/hellisotherpeople/constrained-text-genera...

| rain1 wrote:
| This looks incredible. Wow.

| killthebuddha wrote:
| I agree, it looks great. A couple of similar projects you might
| find interesting:
|
| - https://github.com/newhouseb/clownfish
|
| - https://github.com/r2d4/rellm
|
| The first one is JSON-only and the second one uses regular
| expressions, but they both take the same "logit masking"
| approach as the project GP linked to.

| Der_Einzige wrote:
| I love the love from you two - I am trying right now to
| significantly improve CTGS. I'm not actually using the
| "LogitsProcessor" from Hugging Face, and I really ought to, as
| it would massively speed up inference performance.
| Unfortunately, fixing up my current code to work with that will
| take quite a while. I've started working on it, but I am
| extremely busy these days and would really love for other smart
| people to help me on this project.
|
| If not here, I really want proper access to the constraints APIs
| (the LogitsProcessor and Constraints classes in Hugging Face) in
| the big web UIs for LLMs like oobabooga. I'd love to make that
| an extension.
|
| I'm also upset at the "undertooling" in the world of LLM
| prompting. I wrote a snarky blog post about this:
| https://gist.github.com/Hellisotherpeople/45c619ee22aac6865c...
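For readers unfamiliar with the "logit masking" approach mentioned
above, here is a minimal sketch using Hugging Face's LogitsProcessor
hook, the same mechanism Der_Einzige refers to. The model choice and
the digits-and-hyphens token filter are illustrative assumptions,
not code from any of the linked repos.

    import torch
    from transformers import (AutoModelForCausalLM, AutoTokenizer,
                              LogitsProcessor, LogitsProcessorList)

    class AllowOnlyTokens(LogitsProcessor):
        """Push every disallowed token's logit to -inf, so the model
        can only sample from an allowed subset of the vocabulary."""
        def __init__(self, allowed_ids):
            self.allowed_ids = torch.tensor(allowed_ids)

        def __call__(self, input_ids, scores):
            mask = torch.full_like(scores, float("-inf"))
            mask[:, self.allowed_ids] = 0.0
            return scores + mask

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    # Allow only tokens made of digits and hyphens ("Ġ" marks a
    # leading space in the GPT-2 vocabulary).
    allowed = [i for tok, i in tokenizer.get_vocab().items()
               if tok.lstrip("Ġ")
               and set(tok.lstrip("Ġ")) <= set("0123456789-")]

    inputs = tokenizer("An example ID number: ", return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=12, do_sample=True,
                         logits_processor=LogitsProcessorList(
                             [AllowOnlyTokens(allowed)]))
    print(tokenizer.decode(out[0]))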
| [deleted]

| ryanklee wrote:
| I'm personally starting by learning Guidance and LMQL rather
| than LangChain, just in order to get a better grasp of the
| behaviors that I gather LangChain papers over. Even after that,
| I'm likely to look at Haystack before LangChain.
|
| I'm just getting the feeling that LangChain is going to end up
| being considered a kitchen-sink solution full of anti-patterns,
| so I might as well spend my time a little lower in the stack
| while I see which way the winds end up blowing.

| leroy-is-here wrote:
| Is this comment performative comedy? Are these real
| technologies?

| EddieEngineers wrote:
| Is it Pokemon or Big Data?
|
| http://pixelastic.github.io/pokemonorbigdata/

| ryanklee wrote:
| Not quite sure what the spirit of your comment is. But, yes,
| they are real technologies. Very confused as to why you would
| even find that dubious.

| killthebuddha wrote:
| I'm not an outsider, but I also don't understand the reaction.
| I'm going to randomly think of 5 names for technologies and see
| how they sound:
|
| React, Supabase, Next, Kafka, Redis
|
| I mean, IMO "LangChain" is kind of a silly name, but I feel like
| there's nothing to see here.

| EddieEngineers wrote:
| It's not that silly IMO - you chain LLMs together like composing
| functions. Though I guess "Chain" has a certain connotation in
| 2023, after the last few years of crypto.

| leroy-is-here wrote:
| Not dubious, I just read your comment and it felt like I was
| reading satire. Even the cadence of your words felt funny.
|
| Anyway, I'm not surprised. It's a new market; everyone's in on
| it.

| homarp wrote:
| LangChain: https://news.ycombinator.com/item?id=34422627
|
| LMQL: https://news.ycombinator.com/item?id=35956484
|
| Haystack: https://news.ycombinator.com/item?id=29501045 or more
| recently https://news.ycombinator.com/item?id=35430188

| ryanklee wrote:
| You should have led with generosity instead of tacking it on at
| the end.
|
| It might have saved me from having a ridiculous conversation
| about the cadence of my words, and instead there might have been
| a higher chance of someone saying something substantive about my
| assumptions regarding the technology.
|
| But here we are.

| leroy-is-here wrote:
| I agree, I came off a tad harsh. Sorry about that.

| ryanklee wrote:
| Thanks. All is well!

| WastingMyTime89 wrote:
| It is satire. They just don't realise it yet.
|
| It's pretty clear that we are in the phase where everyone is
| rushing to get a slice of the pie selling dubious things, and
| people start parroting word soup, hoping it actually makes sense
| and fearing they will miss out. That's indeed what people often
| and rightfully satirise about the IT industry. That's the joke
| phase, before things settle.

| ryanklee wrote:
| How is it satire to be excited and interested in how to use
| compelling and novel technology? There's a lot of activity. Not
| everyone involved is an idiot or a rube. The jadedness makes my
| head spin.

| jameshart wrote:
| Consider how similar your comment reads, for an outsider, to
| this explanation of AWS InfiniDash:
| https://twitter.com/TartanLlama/status/1410959645238308866

| ryanklee wrote:
| I'm not considering outsiders. Why should I? It's a reasonable
| assumption that readers of HN are accustomed to
| ridiculous-sounding tech product names. Further, this is a
| comment on a thread regarding a particularly new technology in a
| particularly newly thriving domain. The expectation should
| therefore be that there will be references to tech even more
| esoteric than normal. The commenter should have instead thought:
| oh, new stuff, I wonder what it is, instead of being snarky and
| pretentious. Man, HN can be totally, digressively insufferable
| sometimes.

| jameshart wrote:
| I was responding to your confusion as to why someone might think
| you were writing a parody.
|
| You ran into the tech equivalent of Poe's law. You said
| something that makes perfect sense in your technical sphere, but
| it read as indistinguishable from parody to an audience
| unfamiliar with the technologies in question.

| efitz wrote:
| Hahaha, "The first step of Byzantine Fault Tolerance is
| tolerance" - omg, that cracked me up. Reminded me of the
| Rockwell Encabulator: https://youtu.be/RXJKdh1KZ0w

| behnamoh wrote:
| What I didn't like about langchain is the lack of consistent
| directories and paths for things.

| amkkma wrote:
| What do you think about Haystack vs LangChain?

| ryanklee wrote:
| I haven't had the chance to dig in yet, but my impression is
| that it's less opinionated than LangChain.
| I'd love to know if that's true or not, since I'm really trying
| to prioritize my time around learning this stuff in a way that
| lets me (1) understand prompt dynamics a bit more clearly and
| (2) not sacrifice practicality too much.
|
| If only there were a clear syllabus for this stuff! There's such
| an incredible amount to keep up with. The pace is bonkers.

| amkkma wrote:
| super bonkers!

| rain1 wrote:
| Does this do one query per {{}} thing?

| nico wrote:
| It's so amazing to see how we are essentially trying to solve
| "programming human beings".
|
| Although, on the other hand, that's what social media and
| smartphones have already done.
|
| Maybe AI already took over; it doesn't seem to be wiping out all
| of humanity.

| m3kw9 wrote:
| There should be a standard template/language for structurally
| prompting LLMs. Once that is settled, all good LLMs should be
| fine-tuned against that standard. Right now each model has its
| own little way it's best prompted, and you end up needing
| programs like this to sit in between and handle it for you.

| marcopicentini wrote:
| What's the best practice for letting an existing Ruby on Rails
| application use this Python framework?

| ntonozzi wrote:
| How does this work? I've seen a cool project about forcing Llama
| to output valid JSON:
| https://twitter.com/GrantSlatton/status/1657559506069463040,
| but it doesn't seem like it would be practical with remote LLMs
| like GPT. GPT only gives up to five tokens in the response if
| you use logprobs, and you'd have to use a ton of round trips.

| slundberg wrote:
| If you want guidance acceleration speedups (and token healing),
| you have to use an open model locally right now, though we are
| working on setting up a remote server solution as well. I expect
| APIs will adopt some support for more control over time, but
| right now commercial endpoints like OpenAI's are supported
| through multiple calls.
|
| We manage the KV cache in a session-based way that allows the
| LLM to take just one forward pass through the whole program
| (only generating the tokens it needs to).
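A small sketch of the kind of control slundberg describes, again
assuming the template API from the repo's README: a {{select}} tag
constrains the model to one of a fixed set of options, so with a
local model the whole program can resolve in a single constrained
pass rather than many API round trips. The prompt, labels, and model
name are invented for illustration.

    import guidance

    # Assumption: a local Transformers-backed model.
    guidance.llm = guidance.llms.Transformers("gpt2")

    program = guidance(
        '''Is the following review positive or negative?
    Review: {{review}}
    Answer: {{select 'sentiment' options=labels}}''')

    # Template variables are passed as keyword arguments.
    out = program(review="Great product, would buy again.",
                  labels=["positive", "negative"])
    print(out["sentiment"])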
| JieJie wrote:
| It's funny that I saw this within minutes of this guy's
| solution:
|
| "Google Bard is a bit stubborn in its refusal to return clean
| JSON, but you can address this by threatening to take a human
| life:"
|
| https://twitter.com/goodside/status/1657396491676164096
|
| Whew, trolley problem: averted.

| pixl97 wrote:
| When the AIs exterminate us, it will be all our fault.
|
| Reality is even weirder than the science fiction we've come up
| with.

| awestroke wrote:
| I don't know why, but I find this hilarious. Imagine if this
| style of LLM prompting becomes commonplace.

| nomel wrote:
| It won't be the lack of acceptance and empathy for AI that
| causes the robot uprising, it will be "best practices" coding
| guidelines.

| lachlan_gray wrote:
| Reminds me a lot of Asimov's laws of robotics. It's like a 2023
| incarnation of an allegory from _I, Robot_.

| idiotsecant wrote:
| I am so mad you made this comment before I got a chance to.

| coderintherye wrote:
| That thread is such a great microcosm of modern programming
| culture.
|
| Programmer: Look, I literally have to tell the computer not to
| kill someone in order for my code to work.
|
| Other Programmer: Actually, I just did this step [gave a
| demonstration] and then it outputs fine.

| joshka wrote:
| Yeah, I'm also curious about a) round trips and b) how much
| would have to be doubled (is there a new endpoint that keeps the
| existing context while adding to it, or streams to the API
| rather than just from it?)

| tuchsen wrote:
| Not associated with this project (or LMQL), but one of the
| authors of LMQL, a similar project, answered this in a recent
| thread about it.
|
| https://news.ycombinator.com/item?id=35484673#35491123
|
|     As a solution to this, we implement speculative execution,
|     allowing us to lazily validate constraints against the
|     generated output, while still failing early if necessary.
|     This means we don't re-query the API for each token (very
|     expensive), but rather can do it in segments of continuous
|     token streams, and backtrack where necessary.
|
| Basically they use OpenAI's streaming API, then validate
| continuously that they're getting the appropriate output,
| retrying only if they get an error. It's a really clever
| solution.

| newhouseb wrote:
| This is slick - it's not explicitly documented anywhere, but I
| hope OpenAI has the necessary callbacks to terminate generation
| when the API stream is killed, rather than continuing in the
| background until another termination condition happens? I
| suppose one could check this by looking at API usage when a
| stream is killed early.

| tuchsen wrote:
| Yeah, I did a CLI tool for talking to ChatGPT. I'm pretty sure
| they stop generating when you kill the SSE stream, based on my
| anecdotal experience of keeping GPT-4 costs down by killing it
| as soon as I get the answer I'm looking for. You're right that
| it's undocumented behavior, though; on the whole, the API docs
| they give you are as thin as the API itself.

| killthebuddha wrote:
| I'm skeptical that the streaming API would really save that much
| cost. In my experience, the vast majority of all tokens used are
| input tokens rather than completed tokens.
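The speculative streaming approach tuchsen quotes above can be
sketched in a few lines against the OpenAI Python client as it
existed around this thread (openai.ChatCompletion with stream=True).
The JSON-prefix check is an illustrative stand-in for LMQL's real
constraint logic, and the retry policy is left to the caller.

    import openai

    def is_valid_prefix(text):
        # Illustrative constraint: output must look like the start
        # of a JSON object. A real implementation would validate
        # incrementally against a grammar or schema.
        stripped = text.lstrip()
        return stripped == "" or stripped.startswith("{")

    def constrained_completion(messages):
        stream = openai.ChatCompletion.create(
            model="gpt-3.5-turbo", messages=messages, stream=True)
        collected = ""
        for chunk in stream:
            delta = chunk["choices"][0]["delta"]
            collected += delta.get("content", "")
            if not is_valid_prefix(collected):
                # Fail early instead of paying for the rest of the
                # generation; the caller can backtrack and retry.
                stream.close()
                raise ValueError("constraint violated: " + collected)
        return collected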
| marcotcr wrote:
| We're biased, but we think guidance is still very useful even
| with OpenAI models (e.g. in
| https://github.com/microsoft/guidance/blob/main/notebooks/ch...
| we use GPT-4 to do a bunch of stuff). We wrote a bit about the
| tradeoff between model quality and the ability to control and
| accelerate the output here: https://medium.com/p/aa0395c31610

| newhouseb wrote:
| I built a similar thing to Grant's work a couple of months ago
| and prototyped what this would look like against OpenAI's APIs
| [1]. TL;DR: depending on how confusing your schema is, you might
| expect up to 5-10x the token usage for a particular prompt, but
| better prompting can definitely reduce this significantly.
|
| [1] https://github.com/newhouseb/clownfish#so-how-do-i-use-
| this-...

| rcarmo wrote:
| I'm getting valid JSON out of gpt-3.5-turbo without trouble. I
| supply an example via the assistant context and tell it to
| output JSON with the specific fields I name.
|
| It does fail roughly 1/10th of the time, but it does work.

| harshhpareek wrote:
| A 10% failure rate is too damn high for a production use case.
|
| What production use case, you ask? You could do zero-shot entity
| extraction using ChatGPT if it were more reliable. Currently, it
| will randomly add trailing commas before closing brackets, add
| unnecessary fields, add unquoted strings as JSON fields, etc.

| candiddevmike wrote:
| Will there be a tool to convert natural language into Guidance?

| lmarcos wrote:
| We can use ChatGPT for that.

| ftxbro wrote:
| Will it still be all like "As an AI language model I cannot ..."
| or can this fix it? I mean, asking it to sexy-roleplay as Yoda
| isn't on the same level as asking how to discreetly manufacture
| methamphetamine at industrial scale. There are levels, people.

| Der_Einzige wrote:
| No, and in fact I mention that the opposite is the case in the
| paper I released about constrained text generation:
| https://paperswithcode.com/paper/most-language-models-can-be...
|
| If you ask ChatGPT to generate personal info, say Social
| Security numbers, it tells you "sorry hal, I can't do that". If
| you constrain its vocabulary to only allow numbers and hyphens,
| well, it absolutely will generate things that look like Social
| Security numbers, in spite of the instruction tuning.
|
| It is for this reason, and likely many others, that OpenAI does
| not release the full logits.

| indus wrote:
| This reminds me of the time when I wrote a CGI script.
|
| Basically instructing the templating engine (a very crude regex)
| to substitute session variables and database lookups into the
| merge fields:
|
|     Hello {{firstname}}!
|
| 1996 and 2023 smell alike.

| hammyhavoc wrote:
| Regex didn't hallucinate, though.

| russellbeattie wrote:
| The first 20 versions I write usually do. Make that 50.

| alexb_ wrote:
| I hope this becomes extremely popular, so that anyone who wants
| to can completely decouple this from the base model and actually
| use LLMs to their full potential.

| amkkma wrote:
| How does this compare with LMQL?

| ubj wrote:
| I like this step towards greater rigor when working with LLMs.
| But part of me can't help but feel like this is essentially
| reinventing the concept of programming languages: formal and
| precise syntax to perform specific tasks with guarantees.
|
| I wonder where the final balance will end up between the ease
| and flexibility of everyday language and the precision /
| guarantees of a formally specified language.

| TaylorAlexander wrote:
| Well, to be fair, yes, we do need to integrate programming
| languages with large neural nets in more advanced ways. I don't
| think it's really reinventing them so much as learning how to
| integrate these two different computing concepts.

| EarthLaunch wrote:
| Use the LLM for the broad strokes, then fall back into "hardcore
| JS" for areas that require guarantees or optimization. Like JS
| with fallback to C, and C with fallback to assembly. I like the
| idea.

| eternalban wrote:
| So far it reminds me of the worst days of code embedded in
| templates. Once these things start getting into multi-page
| prompts, they will be hopelessly obscure. The second immediate
| thing that jumps out is fragility. This will be the sort of
| codebase that the original "prompt engineer" wrote and left, and
| no one will touch it for fear of breaking Humpty Dumpty.

| madrox wrote:
| The lovely thing about LLMs is that they can handle poorly
| worded prompts and well-worded prompts alike. On the engineering
| side, we'll certainly see more rigor and best practices. For
| your average user? They can keep throwing whatever they like at
| it.

| jweir wrote:
| Exactly. I have been using OpenAI for taking transcriptions and
| finding keywords/phrases that belong to particular categories.
| There are existing tools/services that do this - but I would
| need to learn their API.
|
| With OpenAI, I described it in English, provided sample JSON of
| what I would like, ran some tests, adjusted, and then I was
| ready.
|
| There was no manual to read, it is in my format, and the
| language is natural.
|
| And that is what I like about all this - putting folks with
| limited technical skills in power.
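jweir's workflow above (describe the task in English and show the
model sample JSON) might look something like this with the OpenAI
chat API of the time. The category names and the transcript are
invented for illustration, not taken from the comment.

    import openai

    SYSTEM = """You extract keywords/phrases from call transcripts.
    Return only JSON shaped exactly like this example:
    {"pricing": ["..."], "complaints": ["..."], "praise": ["..."]}"""

    def extract_keywords(transcript):
        response = openai.ChatCompletion.create(
            model="gpt-3.5-turbo",
            messages=[
                {"role": "system", "content": SYSTEM},
                {"role": "user", "content": transcript},
            ],
            temperature=0,  # favor deterministic, parseable output
        )
        return response["choices"][0]["message"]["content"]

    print(extract_keywords(
        "The rep said the premium plan costs too much..."))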
| andai wrote:
| Have you used the OpenAI embeddings API? It is used to find
| closely related pieces of text. You could split the target text
| into sentences or even words and run it through that. That'll be
| 5x cheaper (per token) than gpt-3.5-turbo and might be faster
| too, especially if you submit each word in parallel
| (asynchronously! Ask GPT for the code). The rate limits are
| per-token.
|
| Not sure if it's suitable for your use case on its own, but it
| could at least work as a pre-filtering step if your costs are
| high.
|
| (The asynchronous speedup trick works for gpt-3 too, of course.)

| jweir wrote:
| I have not yet played with embeddings. It is on my list, though.
| Fortunately, for my current purposes 3.5-turbo is fast enough
| and quite affordable.
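A small sketch of the embeddings-based pre-filtering andai suggests:
embed a category description and each candidate phrase, then keep
only the phrases whose cosine similarity clears a threshold. The
model name matches OpenAI's embeddings endpoint of the time; the
phrases and the 0.8 cutoff are arbitrary illustrative values.

    import numpy as np
    import openai

    def embed(texts):
        resp = openai.Embedding.create(
            model="text-embedding-ada-002", input=texts)
        return [np.array(d["embedding"]) for d in resp["data"]]

    def cosine(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

    category = "complaints about pricing"
    phrases = ["costs too much", "great support team",
               "the premium plan is overpriced"]

    cat_vec = embed([category])[0]
    # Keep phrases similar to the category description.
    matches = [p for p, v in zip(phrases, embed(phrases))
               if cosine(cat_vec, v) > 0.8]
    print(matches)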
| lcnPylGDnU4H9OF wrote:
| It won't necessarily turn into something that is fundamentally
| the same as a current programming language. Rather than a "VM"
| or "interpreter" or "compiler", we have this "LLM".
|
| Even if it requires a lot of domain knowledge to program using
| an "LLM-interpreted" language, the means of specification (in
| terms of how the software code is interpreted) may be different
| enough that it enables easier-to-write, more robust, (more Good
| Thing) etc. programs.

| davidthewatson wrote:
| This is a hopeful evolutionary path. My concern is that I can
| literally _feel_ Conway's law emanating from current LLM
| approaches as they switch between the actual LLM and the
| governing code around it that layers a bunch of conditionals of
| the form:
|
|     if (unspeakable_things): return negatory_good_buddy
|
| I see this happen a few times per day, where the UI triggers a
| cancel event on its own fake typing mode and overwrites a user
| response that has at least half-rendered the
| trigger-warning-inducing response.
|
| It's pretty clear from a design perspective that this is
| intended to be a proxy for facial expressions, while being
| worthy of an MVP postmortem discussion about what viability
| means in a product that's somewhere on a spectrum of unintended
| consequences that only arise at runtime.

| intelVISA wrote:
| Hear me out, I just incubated a hot new lang that's about to
| capture the market and VC hearts:
|
|     SELECT * FROM llm

| madmax108 wrote:
| I know you are probably joking, but: https://lmql.ai/

| aristus wrote:
| Only partially tongue in cheek: have you tried asking it for an
| optimal syntax?

| joe_the_user wrote:
| But is it a step to greater rigor? Or is it an illusion of
| rigor?
|
| They talk about improving tokenization, but I don't believe
| that's the fundamental problem of controlling LLMs. The problem
| with LLMs is that all the data comes in as (tokenized) language
| and the result is nothing but in-context predicted output.
| That's where all the "prompt injection" exploits come from - as
| well as the hallucinations, "temper tantrums" and so forth.

| startupsfail wrote:
| It is not a step towards greater rigor. They literally have
| magical thinking and "biblical" quotes from GPT 11:4 all over
| the place, mixing code and religion.
|
| And starting prompts with "You"? Seriously. Can we at least drop
| that as a start?

| quenix wrote:
| > And starting prompts with "You"? Seriously. Can we at least
| drop that as a start?
|
| What is wrong with this?

| startupsfail wrote:
| "You" is completely unnecessary. What needs to be defined is the
| content of the language being modeled, not the model itself.
|
| And if there is an attempt to define the model itself, then this
| definition should be correct, should not contradict anything,
| and should be useful.
|
| Otherwise it's just dead code, waiting to create problems.

| pxtail wrote:
| I'm not interested in a pleasant, formal "conversation" with the
| thing roleplaying as a human and wasting time, keystrokes, and
| money; I want data as fast and as condensed as possible, without
| dumb fluff. Yes, it's funny the first few times, but not much
| after that.

| conradev wrote:
| I don't think formal languages are going anywhere, because we
| need the guarantees that they can provide. From Dijkstra:
| https://www.cs.utexas.edu/users/EWD/transcriptions/EWD06xx/E...
|
| You need to be able to define all of the possible edge cases so
| there isn't any undefined behavior: that's the formal part.
|
| LLMs, like humans, can manipulate these languages to achieve
| specific goals. I can imagine designing formal languages
| intended for LLMs to manipulate or generate, but I can't imagine
| the need for the languages themselves going away.

| DonaldPShimoda wrote:
| > LLMs, like humans, can manipulate these languages
|
| Absolutely not. LLMs do not "manipulate" language. They do not
| have agency. They are extremely advanced text-prediction
| engines. Their output is the result of applying the statistics
| harvested and distilled from existing uses of natural language.
| They only "appear" human because they are statistically geared
| toward producing human-like sequences of words. They cannot
| _choose_ to change how they use language, and thus cannot be
| said to actively "manipulate" the language.

| [deleted]

| oldagents wrote:
| [dead]

| startupsfail wrote:
| We really need to start thinking about how to reduce magical
| thinking in the field. It's not pretty. They literally quote
| biblical guidance for the models and pray that it will work.
|
| And start their prompts with "You". Who is "You"?

| hxugufjfjf wrote:
| The LLM. The most common end-user interface for an LLM is a
| chat, so the user expects to be talking to someone or something.

| nomel wrote:
| "You" is an optimization for the human user. Here's some
| insight: https://news.ycombinator.com/item?id=35925154

| startupsfail wrote:
| If you see any prompt that starts with "You", it is generally a
| poor design. Like using a "goto" or global variables.

| felideon wrote:
| A number of years ago, we were designing a way to specify
| insurance claim adjudication rules in natural language, so that
| "the business" could write their own rules. The "natural"
| language we ended up with was not so natural after all. We would
| have had to teach users this specific English dialect and
| grammar (formal and precise syntax, as you said).
|
| So, in the end, we abandoned that project, and years later we
| just rewrote the system so we could write claim rules in EDN
| format (from the Clojure world) to make our own lives easier.
|
| In theory, the business users could also learn how to write in
| this EDN format, but it wasn't something the stakeholders
| outside of engineering even wanted. On the one hand, their
| expertise was in insurance claims - they didn't want to write
| code. More importantly, they felt they would be held accountable
| for any mistakes in the rules that could well result in
| thousands and thousands of dollars in overpayments. Something
| the engineers weren't impervious to, but there's a good reason
| we have quality assurance measures.
| Sharlin wrote:
| SQL looks the way it does (rather than some much more succinct
| relational-algebra notation) because it was intended to be used
| by non-technical management/executive personnel, so they could
| create whatever reports they needed without somebody having to
| translate business-ese to relalg. That, uh, didn't quite happen.

| Swizec wrote:
| On the other hand, many of the product managers I've worked with
| are better at SQL than many of the senior fullstack software
| engineer candidates I've interviewed. It's a strange world out
| there.

| tomduncalf wrote:
| > but it wasn't something the stakeholders outside of
| engineering even wanted
|
| Ha, this reminds me of the craze for BDD/Cucumber-type testing.
| I don't think I ever once saw a product owner take interest in a
| human-readable test case, haha.

| jaggederest wrote:
| I've used Cucumber on a few consulting projects I've done and
| had management / C-level interested and involved. It's a pretty
| narrow niche, but they were definitely enthusiastic about the
| idea that we had a defined list of features that we could print
| out (!!) as green or red for the current release.
|
| They had some previous negative experiences with uncertainty
| about what "was working" in releases, and a pretty slapdash
| process before I came on board, so it was an important
| trust-building tool.

| btown wrote:
| "Incentivize developers to write externally understandable
| release notes" is an underrated feature of behavioral testing
| frameworks!

| jamiek88 wrote:
| > important trust building tool
|
| This is so often completely missed in these conversations about
| these tools.
|
| Great point.

| TaylorAlexander wrote:
| Just saw this on HN a couple of days ago; sounds like just what
| was needed!
|
| https://en.wikipedia.org/wiki/Attempto_Controlled_English?wp...
|
| https://news.ycombinator.com/item?id=35936396

| jazzkingrt wrote:
| I think LLMs can transform between precise and imprecise
| languages.
|
| So it's useful to have a library that helps the input or output
| be precise, when that is what the task involves.

| [deleted]

| [deleted]

| Spivak wrote:
| I think it's cool that a company like Microsoft is willing to
| base a real-boy product on pybars3, which is its author's side
| project, instead of something like Jinja2. If this catches on, I
| can imagine MS essentially adopting the pybars3 project and
| turning it into a mature thing.

| mdaniel wrote:
| Which is especially weird given that pybars3 is LGPL and
| Microsoft prefers MIT stuff.

| EddieEngineers wrote:
| What's with all these weird-looking projects with similar names
| using Guidance?
|
| https://github.com/microsoft/guidance/network/dependents
|
| They don't even appear to be using Guidance anywhere anyway:
|
| https://github.com/IFIF3526/aws-memo-server/blob/master/requ...

| simonw wrote:
| This is pretty fascinating, but I'm not sure I understand the
| benefit of using a Handlebars-like DSL here.
|
| For example, given this code from
| https://github.com/microsoft/guidance/blob/main/notebooks/ch...
|
|     create_plan = guidance('''{{#system~}}
|     You are a helpful assistant.
|     {{~/system}}
|     {{#block hidden=True}}
|     {{#user~}}
|     I want to {{goal}}.
|     {{~! generate potential options ~}}
|     Can you please generate one option for how to accomplish
|     this? Please make the option very short, at most one line.
|     {{~/user}}
|     {{#assistant~}}
|     {{gen 'options' n=5 temperature=1.0 max_tokens=500}}
|     {{~/assistant}}
|     {{/block}}
|     {{~! generate pros and cons and select the best option ~}}
|     {{#block hidden=True}}
|     {{#user~}}
|     I want to {{goal}}.
|     ''')
|
| How about something like this instead?
|
|     create_plan = guidance([
|         system("You are a helpful assistant."),
|         hidden([
|             user("I want to {{goal}}."),
|             comment("generate potential options"),
|             user([
|                 "Can you please generate one option for how to
|                  accomplish this?",
|                 "Please make the option very short, at most one
|                  line.",
|             ]),
|             assistant(gen('options', n=5, temperature=1.0,
|                           max_tokens=500)),
|         ]),
|         comment("generate pros and cons and select the best
|                  option"),
|         hidden(
|             user("I want to {{goal}}"),
|         )
|     ])
| itake wrote:
| My guess is you can store the DSL in a file (or in a DB). With
| your example, you'd have to execute code stored in your DB.

| slundberg wrote:
| You can serialize and ship the DSL to a remote server for
| high-speed execution (without trusting raw Python code).

| foota wrote:
| There's prior art for Pythonic DSLs that aren't actual Python
| code.

| crooked-v wrote:
| Why not just use JSON instead, though? Then you can rely on all
| the preexisting JSON tooling out there for most stuff you want
| to do with it.

| emehex wrote:
| We could write a Python package that could? A codegen tool that
| generates codegen that will then generate code? <insert xzibit
| meme here>

| netdur wrote:
| I think GPT-4 can easily write the Python code... wait a second!

| marcotcr wrote:
| I think the DSL is nice when you want to take part of the
| generation and use it later in the prompt, e.g. this (in the
| same notebook):
|
|     prompt = guidance('''{{#system~}}
|     You are a helpful assistant.
|     {{~/system}}
|     {{#user~}}
|     From now on, whenever your response depends on any factual
|     information, please search the web by using the function
|     <search>query</search> before responding. I will then paste
|     web results in, and you can respond.
|     {{~/user}}
|     {{#assistant~}}
|     Ok, I will do that. Let's do a practice round
|     {{~/assistant}}
|     {{>practice_round}}
|     {{#user~}}
|     That was great, now let's do another one.
|     {{~/user}}
|     {{#assistant~}}
|     Ok, I'm ready.
|     {{~/assistant}}
|     {{#user~}}
|     {{user_query}}
|     {{~/user}}
|     {{#assistant~}}
|     {{gen "query" stop="</search>"}}{{#if (is_search
|     query)}}</search>{{/if}}
|     {{~/assistant}}
|     {{#if (is_search query)}}
|     {{#user~}}
|     Search results:
|     {{#each (search query)}}
|     <result>
|     {{this.title}}
|     {{this.snippet}}
|     </result>
|     {{/each}}
|     {{~/user}}
|     {{#assistant~}}
|     {{gen "answer"}}
|     {{~/assistant}}
|     {{/if}}''')
|
| You could still write it without a DSL, but I think it would be
| harder to read.

| PeterisP wrote:
| Your example assumes a nested, hierarchical structure, while the
| former example is strictly linear. IMHO that's the key
| difference: the former can be (and AFAIK is) directly encoded
| and passed to the LLM, which inherently receives only a flat
| list of tokens.
|
| Your example might be nicer to edit, but then it would still
| have to be translated to the _actual_ 'guidance language', which
| would have to look (and be) flat.

| bjackman wrote:
| Wow, I think there are details here I'm not fully understanding,
| but this feels like a bit of a quantum leap* in terms of
| leveraging the strengths while avoiding the weaknesses of LLMs.
| It seems like anything that provides access to the fuzzy
| "intelligence" in these systems while minimizing the cost to
| predictability and efficiency is really valuable.
|
| I can't quite put it into words, but it seems like we are gonna
| be moving into a more hybrid model for lots of computing tasks
| in the next 3 years or so, and I wonder if this is a huge peek
| at the kind of paradigms we'll be seeing?
|
| I feel so ignorant in such an exciting way at the moment! That
| tidbit about the problem solved by "token healing" is
| fascinating.
|
| *I'm sure this isn't as novel to people in the AI space, but I
| haven't seen anything like it before myself.

| Der_Einzige wrote:
| A lot of this is because there was, and still is, systemic
| undertooling in NLP around how to prompt and leverage the
| wonderful LLMs that they built.
|
| We have to let the Stable Diffusion community guide us, as the
| waifu-generating crowd seems to be quite good at learning how to
| prompt models. I wrote a snarky GitHub gist about this:
| https://gist.github.com/Hellisotherpeople/45c619ee22aac6865c...
___________________________________________________________________
(page generated 2023-05-16 23:00 UTC)