[HN Gopher] A guidance language for controlling LLMs
       ___________________________________________________________________
        
       A guidance language for controlling LLMs
        
       Author : evanmays
       Score  : 322 points
       Date   : 2023-05-16 16:14 UTC (6 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | Animats wrote:
       | Is this a "language", or just a Python library?
        
       | ahnick wrote:
       | This strikes me as being very similar to Jargon
       | (https://github.com/jbrukh/gpt-jargon), but maybe more formal in
       | its specification?
        
       | jxy wrote:
       | They must hate lisp so much that they opt to use {{}} instead.
        
         | evanmoran wrote:
          | It's not so much against Lisp as that double curly is a
          | classic string-templating style common in web programming. I
         | saw it first with `mustache.js` (first release around 2009),
         | but it's probably been used even before that.
         | 
         | https://github.com/janl/mustache.js/
        
         | armchairhacker wrote:
          | The problem with Lisp syntax is that parentheses are common in
          | ordinary writing. {{ is not.
          | 
          | Of course input from the user should be escaped, but prompts
          | given by the programmer may have parentheses, and there's no
          | way to disambiguate between the prompt and the DSL.
        
       | m3kw9 wrote:
        | I'm not understanding how Guidance acceleration works. It says
        | "This cuts this prompt's runtime in half vs. a standard
        | generation approach." and gives an example of asking an LLM to
        | generate JSON. I don't see how it accelerates anything, because
        | it's a simple JSON completion call. How can you accelerate that?
        
         | evanmays wrote:
          | The interface makes it look simple, but under the hood it
          | follows a similar approach to jsonformer/clownfish [1],
          | passing control of generation back and forth between a slow
          | LLM and relatively fast Python.
         | 
          | Let's say you're halfway through a generation of a JSON blob
          | with a name field and a job field and have already generated
          | 
          |     {
          |         "name": "bob"
          | 
          | At this point, guidance will take over generation control from
          | the model to generate the next text:
          | 
          |     {
          |         "name": "bob",
          |         "job":
          | 
          | If the model had generated that, you'd be waiting 70 ms per
          | token (informal benchmark on my M2 Air). A comma, followed by a
          | newline, followed by "job": is 6 tokens, or 420 ms. But since
          | guidance took over, you save all that time.
          | 
          | Then guidance passes control back to the model to generate the
          | next field value:
          | 
          |     {
          |         "name": "bob",
          |         "job": "programmer"
          | 
          | programmer is 2 tokens and the closing " is 1 token, so this
          | took 210 ms to generate. Guidance then takes over again to
          | finish the blob:
          | 
          |     {
          |         "name": "bob",
          |         "job": "programmer"
          |     }
         | 
          | [1] https://github.com/1rgs/jsonformer and
          | https://github.com/newhouseb/clownfish
          | 
          | Note: guidance is a much more general tool than these.
         | 
         | Edit: spacing
        
         | jackdeansmith wrote:
          | By not generating the fixed JSON structure (brackets, commas,
          | etc.) and skipping the model ahead to the next tokens you
          | actually want it to generate, I think.
        
       | sharemywin wrote:
        | It does look like it makes it easier to code against a model.
        | But is this supposed to work alongside LangChain or Hugging
        | Face agents, or as an alternative to them?
        
         | slundberg wrote:
          | As others mentioned, this was initially developed before
          | LangChain became widely used. Since it is lower level, you can
          | leverage other tools alongside it, like any vector store
          | interface you like, such as those in LangChain. Writing
          | complex chain-of-thought structures is much more concise in
          | guidance, I think, since it tries to keep you as close as
          | possible to the real strings going into the model.
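          | 
          | For instance, a small chain-of-thought program stays close to
          | the literal prompt (a sketch in the style of the README; exact
          | API details may shift):
          | 
          |     import guidance
          | 
          |     guidance.llm = guidance.llms.OpenAI("text-davinci-003")
          | 
          |     program = guidance("""Question: {{question}}
          |     Let's think step by step.
          |     {{gen 'rationale' max_tokens=120}}
          |     So the final answer is: {{gen 'answer' max_tokens=16}}""")
          | 
          |     out = program(question="What is 17 * 23?")
          |     print(out["answer"])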
        
         | evanmays wrote:
          | It's in LangChain-competitor territory but also much lower
          | level and less opinionated. E.g., guidance has no vector store
          | support, but it does manage the key/value cache on the GPU,
          | which can be a big latency win.
        
         | ttul wrote:
          | The first commit was on November 6th, but it didn't show up in
          | the Web Archive until May 6th, suggesting it was developed
          | mostly in private and in parallel with LangChain (LangChain's
          | first commit on GitHub is from about October 24th). Microsoft's
          | code is very tidy and organized. I wonder if they used this
          | tool internally to support their LLM research efforts.
        
       | Der_Einzige wrote:
       | There has been a huge explosion of awesome tooling which utilizes
       | constrained text generation.
       | 
        | A while ago, I tried my own hand at constraining the output of
        | LLMs. I'm actively working on making it better, especially with
        | the lessons learned from repos like this one and from guidance:
       | 
       | https://github.com/hellisotherpeople/constrained-text-genera...
        
         | rain1 wrote:
         | This looks incredible. Wow.
        
           | killthebuddha wrote:
           | I agree, it looks great. A couple similar projects you might
           | find interesting:
           | 
           | - https://github.com/newhouseb/clownfish
           | 
           | - https://github.com/r2d4/rellm
           | 
           | The first one is JSON only and the second one uses regular
           | expressions, but they both take the same "logit masking"
           | approach as the project GP linked to.
        
             | Der_Einzige wrote:
             | I love the love from you two - I am trying right now to
              | significantly improve CTGS. I'm not actually using the
              | "LogitsProcessor" from Hugging Face, and I really ought
              | to, as it will massively speed up inference performance.
              | Unfortunately, fixing up my current code to work with that
              | will take quite a while. I've started working on it but I am
             | extremely busy these days and would really love for other
             | smart people to help me on this project.
             | 
              | If not here, I really want proper access to the constraints
              | APIs (LogitsProcessor and the Constraints classes in
              | Hugging Face) in the big web UIs for LLMs like oobabooga.
              | I'd love to make that an extension.
             | 
             | I'm also upset at the "undertooling" in the world of LLM
              | prompting. I wrote a snarky blog post about this:
              | https://gist.github.com/Hellisotherpeople/45c619ee22aac6865c...
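              | 
              | For the curious, the LogitsProcessor route looks roughly
              | like this (a hypothetical sketch, not CTGS's actual code):
              | 
              |     import torch
              |     from transformers import LogitsProcessor
              | 
              |     class AllowedTokens(LogitsProcessor):
              |         # Mask every logit except an allowed set of token ids
              |         def __init__(self, allowed_ids):
              |             self.allowed_ids = list(allowed_ids)
              | 
              |         def __call__(self, input_ids, scores):
              |             mask = torch.full_like(scores, float("-inf"))
              |             mask[:, self.allowed_ids] = 0.0
              |             return scores + mask
              | 
              | Passed to model.generate() via logits_processor=, it runs
              | inside the decoding loop, so the constraint adds no extra
              | forward passes.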
        
       | [deleted]
        
       | ryanklee wrote:
        | I'm personally starting by learning Guidance and LMQL rather
        | than LangChain, just in order to get a better grasp of the
        | behaviors that, I've gathered, LangChain papers over. Even after
        | that, I'm likely to look at Haystack before LangChain.
        | 
        | Just getting the feeling that LangChain is going to end up being
        | considered a kitchen-sink solution full of anti-patterns, so I
        | might as well spend time a little lower in the stack while I see
        | which way the winds end up blowing.
        
         | leroy-is-here wrote:
          | Is this comment performative comedy? Are these real
          | technologies?
        
           | EddieEngineers wrote:
           | Is it Pokemon or Big Data?
           | 
           | http://pixelastic.github.io/pokemonorbigdata/
        
           | ryanklee wrote:
           | Not quite sure what the spirit of your comment is. But, yes,
           | they are real technologies. Very confused as to why you would
           | even find that dubious.
        
             | killthebuddha wrote:
             | I'm not an outsider but I also don't understand the
             | reaction. I'm going to randomly think of 5 names for
             | technologies and see how they sound:
             | 
             | React, Supabase, Next, Kafka, Redis
             | 
             | I mean, IMO "LangChain" is kind of a silly name but I feel
             | like there's nothing to see here.
        
               | EddieEngineers wrote:
                | It's not that silly IMO: you chain LLMs together like
                | composing functions. However, I guess 'Chain' has a
                | certain connotation in 2023 after the last few years of
                | crypto.
        
             | leroy-is-here wrote:
             | Not dubious, I just read your comment and it felt like I
             | was reading satire. Even the cadence of your words felt
             | funny.
             | 
             | Anyway, I'm not surprised. It's a new market, everyone's in
             | on it.
        
               | homarp wrote:
               | LangChain: https://news.ycombinator.com/item?id=34422627
               | 
                | LMQL: https://news.ycombinator.com/item?id=35956484
               | 
               | Haystack: https://news.ycombinator.com/item?id=29501045
               | or more recently
               | https://news.ycombinator.com/item?id=35430188
        
               | ryanklee wrote:
               | You should have led with generosity instead of tacking it
               | on at the end.
               | 
               | It might have saved me from having a ridiculous
               | conversation about the cadence of my words, and instead
               | there might have been a higher chance of someone saying
               | something substantive about my assumptions regarding the
               | technology.
               | 
               | But here we are.
        
               | leroy-is-here wrote:
               | I agree, I came off a tad harsh. Sorry about that
        
               | ryanklee wrote:
               | Thanks. All is well!
        
               | WastingMyTime89 wrote:
               | It is satire. They just don't realise it yet.
               | 
               | It's pretty clear that we are in the phase where everyone
                | is rushing to get a slice of the pie selling dubious
                | things, and people start parroting word soup, hoping
                | they actually make sense and fearing they will miss out.
               | That's indeed what people often and rightfully satirise
               | about the IT industry. That's the joke phase before
               | things settle.
        
               | ryanklee wrote:
               | How is it satire to be excited and interested in how to
               | use compelling and novel technology? There's a lot of
               | activity. Not everyone involved is an idiot or rube. The
               | jadedness makes my head spin.
        
             | jameshart wrote:
             | Consider how similar your comment reads, for an outsider,
             | to this explanation of AWS InfiniDash:
             | https://twitter.com/TartanLlama/status/1410959645238308866
        
               | ryanklee wrote:
               | I'm not considering outsiders. Why should I. It's a
               | reasonable assumption that readers of HN are accustomed
               | to ridiculous sounding tech product names. Further, this
               | is a comment on a thread regarding a particularly new
               | technology in a particularly newly thriving domain. The
               | expectation should therefore be that there will be
               | references to tech even more esoteric than normal. The
               | commenter should have instead thought: oh, new stuff, I
               | wonder what it is, instead of being snarky and
               | pretentious. Man, HN can be totally, digressively
               | insufferable sometimes.
        
               | jameshart wrote:
               | I was responding to your confusion as to why someone
               | might think you were writing a parody.
               | 
                | You ran into the tech equivalent of Poe's law. You said
               | something that makes perfect sense in your technical
               | sphere, but it read as indistinguishable from parody to
               | an audience unfamiliar with the technologies in question.
        
               | efitz wrote:
               | Hahaha "The first step of Byzantine Fault Tolerance is
               | tolerance" omg. That cracked me up. Reminded me of the
               | Rockwell Encabulator: https://youtu.be/RXJKdh1KZ0w
        
         | behnamoh wrote:
          | What I didn't like about LangChain is the lack of consistent
          | directories and paths for things.
        
         | amkkma wrote:
         | What do you think about Haystack vs LangChain?
        
           | ryanklee wrote:
           | I haven't had the chance to dig in yet, but my impression is
           | that it's less opinionated than LangChain. I'd love to know
           | if that's true or not, since I'm really trying to prioritize
            | my time around learning this stuff in a way that lets me (1)
           | understand prompt dynamics a bit more clearly and (2) not
           | sacrifice practicality too much.
           | 
           | If only there were a clear syllabus for this stuff! There's
           | such an incredible amount to keep up with. The pace is
           | bonkers.
        
             | amkkma wrote:
             | super bonkers!
        
       | rain1 wrote:
       | Does this do one query per {{}} thing?
        
       | nico wrote:
       | It's so amazing to see how we are essentially trying to solve
       | "programming human beings"
       | 
       | Although on the other hand, that's what social media and
       | smartphones have already done
       | 
       | Maybe AI already took over, doesn't seem to be wiping out all of
       | humanity
        
       | m3kw9 wrote:
       | There should be a standard template/language to structurally
        | prompt LLMs. Once that is settled, all good LLMs should be fine-
        | tuned to take in that standard. Right now each model has its own
        | little way of being best prompted, and you end up needing
        | programs like this to sit in between and handle it for you.
        
       | marcopicentini wrote:
       | What's the best practice to let an existing Ruby on Rails
        | application use this Python framework?
        
       | ntonozzi wrote:
       | How does this work? I've seen a cool project about forcing Llama
       | to output valid JSON:
       | https://twitter.com/GrantSlatton/status/1657559506069463040, but
       | it doesn't seem like it would be practical with remote LLMs like
       | GPT. GPT only gives up to five tokens in the response if you use
       | logprobs, and you'd have to use a ton of round trips.
        
         | slundberg wrote:
         | If you want guidance acceleration speedups (and token healing)
         | then you have to use an open model locally right now, though we
         | are working on setting up a remote server solution as well. I
         | expect APIs will adopt some support for more control over time,
         | but right now commercial endpoints like OpenAI are supported
         | through multiple calls.
         | 
          | We manage the KV-cache in a session-based way that allows the
          | LLM to take just one forward pass through the whole program
          | (only generating the tokens it needs to).
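          | 
          | In Hugging Face terms, the idea looks roughly like this (a
          | simplified sketch, not our actual implementation):
          | 
          |     from transformers import AutoModelForCausalLM, AutoTokenizer
          | 
          |     tok = AutoTokenizer.from_pretrained("gpt2")
          |     model = AutoModelForCausalLM.from_pretrained("gpt2")
          | 
          |     # One forward pass over fixed template text, keeping the cache
          |     ids = tok('{ "name": "', return_tensors="pt").input_ids
          |     past = model(ids, use_cache=True).past_key_values
          | 
          |     # Later segments reuse the cache; the prefix is never recomputed
          |     more = tok('bob", "job": "', return_tensors="pt").input_ids
          |     past = model(more, past_key_values=past,
          |                  use_cache=True).past_key_values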
        
         | JieJie wrote:
         | It's funny that I saw this within minutes of this guy's
         | solution:
         | 
         | "Google Bard is a bit stubborn in its refusal to return clean
         | JSON, but you can address this by threatening to take a human
         | life:"
         | 
         | https://twitter.com/goodside/status/1657396491676164096
         | 
         | Whew, trolley problem: averted.
        
           | pixl97 wrote:
           | When the AIs exterminate us, it will be all our fault.
           | 
           | Reality is even weirder than the science fiction we've come
           | up with.
        
           | awestroke wrote:
            | I don't know why, but I find this hilarious. Imagine if this
            | style of LLM prompting becomes commonplace.
        
             | nomel wrote:
             | It won't be the lack of acceptance and empathy for AI that
             | causes the robot uprising, it will be "best practices"
             | coding guidelines.
        
           | lachlan_gray wrote:
           | Reminds me a lot of Asimov's laws of robotics. It's like a
           | 2023 incarnation of an allegory from _I, Robot_
        
             | idiotsecant wrote:
             | I am so mad you made this comment before I got a chance to.
        
           | coderintherye wrote:
           | That thread is such a great microcosm of modern programming
           | culture.
           | 
           | Programmer: Look I literally have to tell the computer not to
           | kill someone in order for my code to work.
           | 
           | Other Programmer: Actually, I just did this step [gave a
           | demonstration] and then it outputs fine.
        
         | joshka wrote:
         | Yeah, I'm also curious about a) round trips and b) how much
         | would have to be doubled (is there a new endpoint that keeps
          | the existing context while adding, or streams to the API rather
         | than just from it?)
        
         | tuchsen wrote:
         | Not associated with this project (or LMQL), but one of the
         | authors of LMQL, a similar project, answered this in a recent
         | thread about it.
         | 
          | https://news.ycombinator.com/item?id=35484673#35491123
          | 
          |     As a solution to this, we implement speculative execution,
          |     allowing us to lazily validate constraints against the
          |     generated output, while still failing early if necessary.
          |     This means we don't re-query the API for each token (very
          |     expensive), but rather can do it in segments of continuous
          |     token streams, and backtrack where necessary.
         | 
         | Basically they use OpenAI's streaming API, then validate
         | continuously that they're getting the appropriate output,
         | retrying only if they get an error. It's a really clever
         | solution.
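          | 
          | In terms of the current OpenAI Python client, the pattern is
          | roughly this (a hypothetical sketch; is_valid_prefix is
          | whatever constraint you're enforcing):
          | 
          |     import openai
          | 
          |     def constrained_completion(messages, is_valid_prefix):
          |         stream = openai.ChatCompletion.create(
          |             model="gpt-3.5-turbo", messages=messages, stream=True)
          |         text = ""
          |         for chunk in stream:
          |             text += chunk.choices[0].delta.get("content", "")
          |             if not is_valid_prefix(text):
          |                 stream.close()  # stop streaming (and paying)
          |                 return None     # caller backtracks and retries
          |         return text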
        
           | newhouseb wrote:
           | This is slick -- It's not explicitly documented anywhere but
           | I hope OpenAI has the necessary callbacks to terminate
           | generation when the API stream is killed rather than
           | continuing in the background until another termination
            | condition happens? I suppose one could check this by looking
            | at API usage when a stream is killed early.
        
             | tuchsen wrote:
              | Yeah, I wrote a CLI tool for talking to ChatGPT. I'm pretty
              | sure they stop generating when you kill the SSE stream,
              | based on my anecdotal experience of keeping ChatGPT-4 costs
              | down by killing it as soon as I get the answer I'm looking
              | for. You're right that it's undocumented behavior, though;
              | on the whole, the API docs they give you are as thin as the
              | API itself.
        
               | killthebuddha wrote:
               | I'm skeptical that the streaming API would really save
               | that much cost. In my experience the vast majority of all
                | tokens used are input tokens rather than completion
                | tokens.
        
         | marcotcr wrote:
         | We're biased, but we think guidance is still very useful even
         | with OpenAI models (e.g. in
         | https://github.com/microsoft/guidance/blob/main/notebooks/ch...
         | we use GPT-4 to do a bunch of stuff). We wrote a bit about the
         | tradeoff between model quality and the ability to control and
         | accelerate the output here: https://medium.com/p/aa0395c31610
        
         | newhouseb wrote:
         | I built a similar thing to Grant's work a couple months ago and
         | prototyped what this would look like against OpenAI's APIs [1].
         | TL;DR is that depending on how confusing your schema is, you
         | might expect up to 5-10x the token usage for a particular
         | prompt but better prompting can definitely reduce this
         | significantly.
         | 
          | [1] https://github.com/newhouseb/clownfish#so-how-do-i-use-this-...
        
         | rcarmo wrote:
         | I'm getting valid JSON out of gpt-3.5-turbo without trouble. I
         | supply an example via the assistant context, and tell it to
         | output JSON with specific fields I name.
         | 
         | It does fail roughly 1/10th of the time, but it does work.
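          | 
          | Roughly like this (a sketch with made-up fields, using the
          | current openai client):
          | 
          |     import json
          |     import openai
          | 
          |     messages = [
          |         {"role": "system", "content": "Respond with JSON only."},
          |         {"role": "user", "content": "Alice is a 34-year-old carpenter."},
          |         {"role": "assistant",
          |          "content": '{"name": "Alice", "age": 34, "job": "carpenter"}'},
          |         {"role": "user", "content": "Bob is a 28-year-old programmer."},
          |     ]
          |     resp = openai.ChatCompletion.create(
          |         model="gpt-3.5-turbo", messages=messages)
          |     data = json.loads(resp.choices[0].message.content)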
        
           | harshhpareek wrote:
           | 10% failure rate is too damn high for a production use case.
           | 
           | What production use case, you ask? You could do zero-shot
           | entity extraction using ChatGPT if it were more reliable.
            | Currently, it will randomly add trailing commas before
            | closing brackets, add unnecessary fields, emit unquoted
            | strings as JSON field values, etc.
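            | 
            | The usual workaround is validate-and-retry, which multiplies
            | cost and latency (a sketch; call_model is your own
            | completion wrapper):
            | 
            |     import json
            | 
            |     def json_or_retry(call_model, prompt, attempts=3):
            |         # At ~10% failure per try, three attempts still leave
            |         # roughly 1 in 1000 prompts failing
            |         for _ in range(attempts):
            |             try:
            |                 return json.loads(call_model(prompt))
            |             except json.JSONDecodeError:
            |                 pass
            |         raise ValueError("no valid JSON after %d attempts" % attempts)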
        
       | candiddevmike wrote:
       | Will there be a tool to convert natural language into Guidance?
        
         | lmarcos wrote:
         | We can use ChatGPT for that.
        
       | ftxbro wrote:
       | Will it still be all like "As an AI language model I cannot ..."
        | or can this fix it? I mean, asking it to sexy-roleplay as Yoda
        | isn't on the same level as asking how to discreetly manufacture
        | methamphetamine at industrial scale; there are levels, people.
        
         | Der_Einzige wrote:
         | No, and in fact I mention that the opposite is the case in the
         | paper I released about constrained text generation:
         | https://paperswithcode.com/paper/most-language-models-can-be...
         | 
          | If you ask ChatGPT to generate personal info, say Social
          | Security numbers, it tells you "sorry, Hal, I can't do that".
          | If you constrain its vocabulary to only allow numbers and
          | hyphens, well, it absolutely will generate things that look
          | like Social Security numbers, in spite of the instruction
          | tuning.
          | 
          | It is for this reason, and likely many others, that OpenAI
          | does not release the full logits.
        
       | indus wrote:
        | This reminds me of the time when I wrote a CGI script,
        | basically instructing the templating engine (a very crude regex)
        | to substitute session variables and database lookups into the
        | merge fields:
        | 
        | Hello {{firstname}}!
        | 
        | 1996 and 2023 smell alike.
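        | 
        | The whole "engine" fits in a few lines of Python (a
        | reconstruction of the idea, not the original script):
        | 
        |     import re
        | 
        |     def render(template, ctx):
        |         # crude {{field}} merge-field substitution, 1996-style
        |         return re.sub(r"\{\{(\w+)\}\}",
        |                       lambda m: str(ctx.get(m.group(1), "")),
        |                       template)
        | 
        |     print(render("Hello {{firstname}}!", {"firstname": "Ada"}))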
        
         | hammyhavoc wrote:
         | RegEx didn't hallucinate though.
        
           | russellbeattie wrote:
           | The first 20 versions I write usually do. Make that 50.
        
       | alexb_ wrote:
       | I hope this becomes extremely popular, so that anyone who wants
       | to can completely decouple this from the base model and actually
       | use LLMs to their full potential.
        
       | amkkma wrote:
       | How does this compare with lmql?
        
       | ubj wrote:
        | I like this step towards greater rigor when working with LLMs.
        | But part of me can't help feeling that this is essentially
        | reinventing the concept of programming languages: formal and
        | precise syntax to perform specific tasks with guarantees.
       | 
       | I wonder where the final balance will end up between the ease and
       | flexibility of everyday language, and the precision / guarantees
       | of a formally specified language.
        
         | TaylorAlexander wrote:
         | Well to be fair, yes we do need to integrate programming
         | languages with large neural nets in more advanced ways. I don't
         | think it's really reinventing it so much as learning how to
         | integrate these two different computing concepts.
        
         | EarthLaunch wrote:
         | Use LLM for the broad strokes, then fall back into 'hardcore
         | JS' for areas that require guarantees or optimization. Like JS
         | with fallback to C, and C with fallback to assembly. I like the
         | idea.
        
         | eternalban wrote:
          | So far it reminds me of the worst days of code embedded in
          | templates. Once these things start getting into multipage
          | prompts they will be hopelessly obscure. The second thing that
          | immediately jumps out is 'fragility'. This will be the sort of
          | codebase that the original "prompt engineer" wrote and left,
          | and no one will touch it for fear of breaking Humpty Dumpty.
        
         | madrox wrote:
          | The lovely thing about LLMs is that they can handle poorly
          | worded and well-worded prompts alike. On the engineering side,
          | we'll certainly see more rigor and best practices. For your
          | average user? They can keep throwing whatever they like at it.
        
           | jweir wrote:
           | Exactly. I have been using OpenAI for taking transcriptions
           | and finding keywords/phrases that belong to particular
           | categories. There are existing tools/services that do this -
           | but I would need to learn their API.
           | 
           | With OpenAI, I described it in English, provided sample JSON
           | that I would like, run some tests, adjust and then I am
           | ready.
           | 
           | There was no manual to read, it is in my format, and the
           | language is natural.
           | 
           | And that is what I like about all this -- putting folks with
           | limited technical skills in power.
        
             | andai wrote:
              | Have you used the OpenAI embeddings API? It is used to find
             | closely related pieces of text. You could split the target
             | text into sentences or even words and run it through that.
             | That'll be 5x cheaper (per token) than gpt-3.5-turbo and
             | might be faster too, especially if you submit each word in
             | parallel (asynchronously! Ask GPT for the code). The rate
             | limits are per-token.
             | 
             | Not sure if it's suitable for your use-case on its own, but
             | it could at least work as a pre-filtering step if your
             | costs are high.
             | 
             | (The asynchronous speedup trick works for gpt-3 too of
             | course.)
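              | 
              | A sketch with the current client (phrases, category text,
              | and the similarity math are illustrative assumptions):
              | 
              |     import numpy as np
              |     import openai
              | 
              |     phrases = ["refund request", "shipping delay", "nice weather"]
              |     resp = openai.Embedding.create(
              |         model="text-embedding-ada-002",
              |         input=phrases + ["customer complaint"])
              |     vecs = np.array([d["embedding"] for d in resp["data"]])
              | 
              |     # Rank phrases by cosine similarity to the category text
              |     q = vecs[-1]
              |     sims = vecs[:-1] @ q / (
              |         np.linalg.norm(vecs[:-1], axis=1) * np.linalg.norm(q))
              |     print(sorted(zip(sims, phrases), reverse=True))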
        
               | jweir wrote:
                | I have not yet played with embeddings. It is on my list
               | though. Fortunately for my current purposes 3.5-turbo is
               | fast enough and quite affordable.
        
         | lcnPylGDnU4H9OF wrote:
          | It won't necessarily turn into something that is fundamentally
          | the same as a current programming language. Rather than a "VM"
          | or "interpreter" or "compiler", we have this "LLM".
         | 
         | Even if it requires a lot of domain knowledge to program using
         | an "LLM-interpreted" language, the means of specification (in
         | terms of how the software code is interpreted) may be different
         | enough that it enables easier-to-write, more robust, (more Good
         | Thing) etc. programs.
        
           | davidthewatson wrote:
            | This is a hopeful evolutionary path. My concern is that I can
            | literally _feel_ Conway's law emanating from current LLM
            | approaches as they switch between the actual LLM and the
            | governing code around it that layers a bunch of conditionals
           | of the form:
           | 
           | if (unspeakable_things): return negatory_good_buddy
           | 
            | I see this happen a few times per day where the UI triggers a
            | cancel event on its own fake typing mode and overwrites a user
           | response that has at least half-rendered the trigger-warning-
           | inducing response.
           | 
            | It's pretty clear from a design perspective that this is
            | intended to be a proxy for facial expressions while being worthy
           | of an MVP postmortem discussion about what viability means in
           | a product that's somewhere on a spectrum of unintended
           | consequences that only arise at runtime.
        
         | intelVISA wrote:
         | Hear me out, just incubated a hot new lang that's about to
         | capture the market and VC hearts:
         | 
         | SELECT * FROM llm
        
           | madmax108 wrote:
           | I know you are probably joking, but: https://lmql.ai/
        
         | aristus wrote:
         | Only partially tongue in cheek: have you tried asking it for an
         | optimal syntax?
        
         | joe_the_user wrote:
         | But is it a step to greater rigor? Or is it an illusion of
         | rigor?
         | 
         | They talk about improving tokenization but I don't believe
         | that's the fundamental problem of controlling LLMs. The problem
         | with LLMs is all the data comes in as (tokenized) language and
         | the result is nothing but in-context predicted output. That's
         | where all the "prompt-injection" exploits come from - as well
          | as the hallucinations, "temper tantrums", and so forth.
        
           | startupsfail wrote:
            | It is not a step towards greater rigor. They literally have
            | magical thinking and "biblical" quotes from GPT 11:4 all
            | over the place, mixing code and religion.
           | 
           | And starting prompts with "You"? Seriously. Can we at least
           | drop that as a start?
        
             | quenix wrote:
             | > And starting prompts with "You"? Seriously. Can we at
             | least drop that as a start?
             | 
             | What is wrong with this?
        
               | startupsfail wrote:
               | "You" is completely unnecessary. What needs to be defined
               | is the content of the language being modeled, not the
               | model itself.
               | 
               | And if there is an attempt to define the model itself,
               | then this definition should be correct, should not
               | contradict anything and should be useful.
               | 
               | Otherwise it's just dead code, waiting to create
               | problems.
        
               | pxtail wrote:
               | I'm not interested in pleasant, formal "conversation"
               | with the thing roleplaying as human and wasting, time,
               | keystrokes and money, I want data as fast and condensed
               | as possible without dumb fluff. Yes, it's funny for few
               | first times but not much after that
        
         | conradev wrote:
         | I don't think formal languages are going anywhere because we
         | need the guarantees that they can provide. From Dijkstra:
         | https://www.cs.utexas.edu/users/EWD/transcriptions/EWD06xx/E...
         | 
         | You need to be able to define all of the possible edge cases so
         | there isn't any Undefined Behavior: that's the formal part
         | 
         | LLMs, like humans, can manipulate these languages to achieve
         | specific goals. I can imagine designing formal languages
         | intended for LLMs to manipulate or generate, but I can't
         | imagine the need for the languages themselves going away.
        
           | DonaldPShimoda wrote:
           | > LLMs, like humans, can manipulate these languages
           | 
           | Absolutely not. LLMs do not "manipulate" language. They do
           | not have agency. They are extremely advanced text prediction
           | engines. Their output is the result of applying the
           | statistics harvested and distilled from existing uses of
           | natural language. They only "appear" human because they are
           | statistically geared toward producing human-like sequences of
           | words. They cannot _choose_ to change how they use language,
           | and thus cannot be said to actively  "manipulate" the
           | language.
        
           | [deleted]
        
         | oldagents wrote:
         | [dead]
        
         | startupsfail wrote:
         | We really need to start thinking of how to reduce magical
          | thinking in the field. It's not pretty. They literally quote
          | biblical guidance for the models and pray that it will work.
         | 
         | And start their prompts with "You". Who is "You"?
        
           | hxugufjfjf wrote:
            | The LLM. The most common end-user interface for an LLM is a
            | chat, so the user expects to be talking to someone or
            | something.
        
           | nomel wrote:
           | "You" is an optimization for the human user. Here's some
           | insight: https://news.ycombinator.com/item?id=35925154
        
             | startupsfail wrote:
             | If you see any prompt that starts with You, generally it is
             | a poor design. Like using a "goto" or global variables.
        
         | felideon wrote:
         | A number of years ago we were designing a way to specify
         | insurance claim adjudication rules in natural language, so that
         | "the business" could write their own rules. The "natural"
         | language we ended up with was not so natural after all. We
         | would have had to teach users this specific English dialect and
         | grammar (formal and precise syntax, as you said).
         | 
         | So, in the end, we abandoned that project and years later just
         | rewrote the system so we could write claim rules in EDN format
         | (from the Clojure world) to make our own lives easier.
         | 
         | In theory, the business users could also learn how to write in
         | this EDN format, but it wasn't something the stakeholders
         | outside of engineering even wanted. On the one hand, their
         | expertise was in insurance claims---they didn't want to write
         | code. More importantly, they felt they would be held
         | accountable for any mistakes in the rules that could well
         | result in thousands and thousands of dollars in overpayments.
          | The engineers weren't impervious to such mistakes either, but
          | there's a good reason we have quality assurance measures.
        
           | Sharlin wrote:
           | SQL looks the way it does (rather than some much more
           | succinct relational algebra notation) because it was intended
           | to be used by non-technical management/executive personnel so
           | they could create whatever reports they needed without
           | somebody having to translate business-ese to relalg. That,
           | uh, didn't quite happen.
        
             | Swizec wrote:
              | On the other hand, many of the product managers I've
              | worked with are better at SQL than many of the senior
              | full-stack software engineer candidates I've interviewed.
              | It's a strange world out there.
        
           | tomduncalf wrote:
           | > but it wasn't something the stakeholders outside of
           | engineering even wanted
           | 
            | Ha, this reminds me of the craze for BDD/Cucumber-style
            | testing. Don't think I ever once saw a product owner take
            | interest in a human-readable test case, haha
        
             | jaggederest wrote:
             | I've used Cucumber on a few consulting projects I've done
             | and had management / C-level interested and involved. It's
             | a pretty narrow niche, but they were definitely
             | enthusiastic for the idea that we had a defined list of
             | features that we could print out (!!) as green or red for
             | the current release.
             | 
             | They had some previous negative experiences with
             | uncertainty about what "was working" in releases, and a
             | pretty slapdash process before I came on board, so it was
             | an important trust building tool.
        
               | btown wrote:
               | "Incentivize developers to write externally
               | understandable release notes" is an underrated feature of
               | behavioral testing frameworks!
        
               | jamiek88 wrote:
               | > important trust building tool
               | 
               | This is so often completely missed in these conversations
               | about these tools.
               | 
               | Great point.
        
           | TaylorAlexander wrote:
           | Just saw this on HN a couple days ago, sounds like just what
           | was needed!
           | 
            | https://en.wikipedia.org/wiki/Attempto_Controlled_English?wp...
           | 
           | https://news.ycombinator.com/item?id=35936396
        
         | jazzkingrt wrote:
         | I think LLMs can transform between precise and imprecise
         | languages.
         | 
          | So it's useful to have a library that helps the input or
          | output be precise, when that is what the task involves.
        
         | [deleted]
        
       | [deleted]
        
       | Spivak wrote:
       | I think it's cool that a company like Microsoft is willing to
        | base a real-boy product on pybars3, which is its author's side
        | project, instead of something like Jinja2. If this catches on I
       | can imagine MS essentially adopting the pybars3 project and
       | turning it into a mature thing.
        
         | mdaniel wrote:
         | Which is especially weird given that pybars3 is LGPL and
         | Microsoft prefers MIT stuff
        
       | EddieEngineers wrote:
       | What's with all these weird-looking projects with similar names
       | using Guidance?
       | 
       | https://github.com/microsoft/guidance/network/dependents
       | 
       | They don't even appear to be using Guidance anywhere anyway
       | 
       | https://github.com/IFIF3526/aws-memo-server/blob/master/requ...
        
       | simonw wrote:
       | This is pretty fascinating, but I'm not sure I understand the
       | benefit of using a Handlebars-like DSL here.
       | 
        | For example, given this code from
        | https://github.com/microsoft/guidance/blob/main/notebooks/ch...
        | 
        |     create_plan = guidance('''{{#system~}}
        |     You are a helpful assistant.
        |     {{~/system}}
        |     {{#block hidden=True}}
        |     {{#user~}}
        |     I want to {{goal}}.
        |     {{~! generate potential options ~}}
        |     Can you please generate one option for how to accomplish this?
        |     Please make the option very short, at most one line.
        |     {{~/user}}
        |     {{#assistant~}}
        |     {{gen 'options' n=5 temperature=1.0 max_tokens=500}}
        |     {{~/assistant}}
        |     {{/block}}
        |     {{~! generate pros and cons and select the best option ~}}
        |     {{#block hidden=True}}
        |     {{#user~}}
        |     I want to {{goal}}.
        |     ''')
       | 
        | How about something like this instead?
        | 
        |     create_plan = guidance([
        |         system("You are a helpful assistant."),
        |         hidden([
        |             user("I want to {{goal}}."),
        |             comment("generate potential options"),
        |             user([
        |                 "Can you please generate one option for how to accomplish this?",
        |                 "Please make the option very short, at most one line."
        |             ]),
        |             assistant(gen('options', n=5, temperature=1.0, max_tokens=500)),
        |         ]),
        |         comment("generate pros and cons and select the best option"),
        |         hidden(
        |             user("I want to {{goal}}"),
        |         )
        |     ])
        
         | itake wrote:
          | My guess is you can store the DSL as a file (or in a db). With
          | your example, you have to execute the code stored in your db.
        
         | slundberg wrote:
          | You can serialize and ship the DSL to a remote server for
          | high-speed execution (without trusting raw Python code).
        
           | foota wrote:
            | There's prior art for Pythonic DSLs that aren't actual
            | Python code.
        
           | crooked-v wrote:
           | Why not just use JSON instead, though? Then you can just rely
           | on all the preexisting JSON tooling out there for most stuff
           | to do with it.
        
         | emehex wrote:
          | We could write a Python package that could? A codegen tool that
         | generates codegen that will then generate code? <insert xzibit
         | meme here>
        
           | netdur wrote:
           | I think chatgpt4 can easily write the python code... wait a
           | second!
        
         | marcotcr wrote:
         | I think the DSL is nice when you want to take part of the
         | generation and use it later in the prompt, e.g. this (in the
         | same notebook).
         | 
         | ---
         | 
          | prompt = guidance('''{{#system~}}
          | You are a helpful assistant.
          | {{~/system}}
          | {{#user~}}
          | From now on, whenever your response depends on any factual
          | information, please search the web by using the function
          | <search>query</search> before responding. I will then paste web
          | results in, and you can respond.
          | {{~/user}}
          | {{#assistant~}}
          | Ok, I will do that. Let's do a practice round
          | {{~/assistant}}
          | {{>practice_round}}
          | {{#user~}}
          | That was great, now let's do another one.
          | {{~/user}}
          | {{#assistant~}}
          | Ok, I'm ready.
          | {{~/assistant}}
          | {{#user~}}
          | {{user_query}}
          | {{~/user}}
          | {{#assistant~}}
          | {{gen "query" stop="</search>"}}{{#if (is_search query)}}</search>{{/if}}
          | {{~/assistant}}
          | {{#if (is_search query)}}
          | {{#user~}}
          | Search results: {{#each (search query)}}
          | <result>
          | {{this.title}}
          | {{this.snippet}}
          | </result>{{/each}}
          | {{~/user}}
          | {{#assistant~}}
          | {{gen "answer"}}
          | {{~/assistant}}
          | {{/if}}''')
         | 
         | ---
         | 
         | You could still write it without a DSL, but I think it would be
         | harder to read.
        
         | PeterisP wrote:
          | Your example assumes a nested, hierarchical structure, while
          | the former example is strictly linear. IMHO that's the key
          | difference there, as the former can be (and AFAIK is) directly
          | encoded and passed to the LLM, which inherently receives only
          | a flat list of tokens.
         | 
         | Your example might be nicer to edit, but then it would still
         | have to be translated to the _actual_ 'guidance language' which
         | would have to look (and be) flat.
        
       | bjackman wrote:
        | Wow, I think there are details here I'm not fully understanding,
        | but this feels like a bit of a quantum leap* in terms of
        | leveraging the strengths of LLMs while avoiding their weaknesses.
       | 
       | It seems like anything that provides access to the fuzzy
       | "intelligence" in these systems while minimizing the cost to
       | predictability and efficiency is really valuable.
       | 
       | I can't quite put it into words but it seems like we are gonna be
       | moving into a more hybrid model for lots of computing tasks in
       | the next 3 years or so and I wonder if this is a huge peek at the
       | kind of paradigms we'll be seeing?
       | 
       | I feel so ignorant in such an exciting way at the moment! That
       | tidbit about the problem solved by "token healing" is
       | fascinating.
       | 
       | *I'm sure this isn't as novel to people in the AI space but I
       | haven't seen anything like it before myself.
        
         | Der_Einzige wrote:
         | A lot of this is because there was and still is systemic
         | undertooling in NLP around how to prompt and leverage the
         | wonderful LLMs that they built.
         | 
          | We have to let the Stable Diffusion community guide us, as the
          | waifu-generating crowd seems to be quite good at learning how
          | to prompt models. I wrote a snarky GitHub gist about this:
          | https://gist.github.com/Hellisotherpeople/45c619ee22aac6865c...
        
       ___________________________________________________________________
       (page generated 2023-05-16 23:00 UTC)