[HN Gopher] Text Is the Universal Interface
       ___________________________________________________________________
        
       Text Is the Universal Interface
        
       Author : marban
       Score  : 99 points
       Date   : 2022-10-02 03:14 UTC (3 days ago)
        
 (HTM) web link (scale.com)
 (TXT) w3m dump (scale.com)
        
       | asim wrote:
       | The idea is inherently very romantic and alluring for developers:
       | the unix philosophy. Arguably yes, it makes total sense, but the
       | difference is that the tools to both develop software and combine
       | them via unix pipes made that a beautiful and simple experience.
       | I don't think AI and language models are there yet. Ultimately we
       | have to move beyond the text just slightly to passing some
       | structured data, because then we can pipe it not between
       | processes, but actual APIs and services.
       | 
       | In the end a model likely ends up being deployed upon a set of
       | APIs with a textual interface for the end user. Command and
       | control system with built-in intelligence. Adept + Inflection are
       | likely the things that get us there, but it's more likely Google
       | develops it first, or already has, because effectively Google Search is a
       | text box that's being piped into AI at this very moment to give
       | you the answers you're looking for. It's only a matter of time
       | before that turns into buying, calling, creating, processing or
       | ordering whatever you want too.
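
A minimal sketch of the "slightly beyond text" idea from the comment above: the stages still exchange lines of text, but each line is a JSON object, so a later stage can filter on a named field instead of re-parsing free text. The record fields (`level`, `msg`) and the stage names are hypothetical, chosen only for illustration.

```python
import json

def parse_records(lines):
    """Stage 1: turn each non-empty text line into a structured record (a dict)."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)

def only_errors(records):
    """Stage 2: filter on a named field rather than on free-form text."""
    for record in records:
        if record.get("level") == "error":
            yield record

def to_lines(records):
    """Stage 3: serialize back to one JSON object per line for the next tool."""
    for record in records:
        yield json.dumps(record, sort_keys=True)

# Hypothetical input: what an upstream process might write to a pipe.
raw = [
    '{"level": "info", "msg": "started"}',
    '{"level": "error", "msg": "disk full"}',
]
result = list(to_lines(only_errors(parse_records(raw))))
print(result)
```

Because every stage reads and writes one record per line, the stages could equally be separate processes joined by ordinary pipes.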
        
       | blueflow wrote:
       | Console/Shell? Learn once, use forever. Graphical GUI? Re-learn
       | your interface every 5 years (or even shorter) when some
       | company decides it needs to change.
        
         | gw99 wrote:
         | I can never remember all the flags and syntax. Also I hate
         | writing parsers all day which is what you really have to do
         | with text interfaces. UIs are discoverable.
         | 
         | There are some compromises in the middle somewhere. I'm liking
         | Apple Shortcuts when it fits the problem domain I am within.
        
           | blueflow wrote:
           | I do forget flags as well, quite often. But instead of
           | disliking the console for it, I open the manual with
           | `man command-name` and use the '/' and 'n' keys to search for
           | that option, with good success so far.
           | 
           | I do get mad tho when a program doesn't have a manpage.
        
           | spc476 wrote:
           | It depends upon the command line interface. A Cisco router
           | CLI is very discoverable. Sitting at the prompt, hit '?' and
           | get a list of commands. Type the command, then '?', get a
           | list of options. At each point in typing out the command, you
           | can always get a list of what you can type next. And even
           | better, you only have to type enough to distinguish from
           | multiple commands/options. So 'show interface Ethernet/0' and
           | 'sh in Eth/0' are the same command.
           | 
           | And on Unix, there's always the man command. Also, most
           | commands will respond to "command -h" or "command --help", but
           | yes, it's a far cry from the Cisco router command line.
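
The abbreviation behaviour described above (type only enough to be unambiguous, press '?' to list what can come next) can be sketched as unique-prefix matching. This is an illustration under assumptions, not the real IOS parser: the command table, `resolve`, and `suggest` are all hypothetical.

```python
def resolve(token, candidates):
    """Return the one candidate matching token, by exact name or unique prefix.

    Raises ValueError when the token matches nothing or is ambiguous.
    """
    lowered = token.lower()
    exact = [c for c in candidates if c.lower() == lowered]
    if exact:
        return exact[0]
    matches = [c for c in candidates if c.lower().startswith(lowered)]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise ValueError(f"unknown command: {token!r}")
    raise ValueError(f"ambiguous {token!r}: could be {sorted(matches)}")

def suggest(prefix, candidates):
    """What a '?' keypress might list: every candidate that fits the prefix."""
    return sorted(c for c in candidates if c.lower().startswith(prefix.lower()))

# Hypothetical command table, far smaller than any real router grammar.
COMMANDS = ["show", "set", "ping"]
SHOW_ARGS = ["interface", "ip", "version"]

print(resolve("sh", COMMANDS), resolve("in", SHOW_ARGS))
print(suggest("s", COMMANDS))
```

With this table, `sh in` resolves to `show interface`, while a bare `s` is rejected as ambiguous and `suggest` lists the alternatives.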
        
             | gw99 wrote:
             | Oh god no I was a CCNA. IOS is horrible.
             | 
             | manpages aren't always that great. Unless they are on a BSD
             | derivative...
             | 
             | PowerShell had some good ideas about making commands
             | discoverable but the nuances and implementation really
             | killed it for me.
        
         | the-printer wrote:
         | I for one endorse a general, casual and technical re-emphasis
         | on the shell. That and a pastiche of Die Neue Typographie but
         | geared toward GUIs.
        
       | pjfin123 wrote:
       | Almost anything can be modeled as a translation of one sequence
       | of text characters to another sequence of text characters.
        
         | andirk wrote:
         | Down to right before machine code, isn't it all just strings of
         | characters aka strings of bytes aka strings representing
         | assembly commands and values?
        
       | TheRealNGenius wrote:
        
       | [deleted]
        
       | dkjaudyeqooe wrote:
       | Text is universal because it's language, but it's not always the
       | optimum interface.
       | 
       | Try pointing to a person deep in a large crowd versus using
       | language. Language is also inexact and open to interpretation.
       | Often simpler "direct" interfaces are more effective.
        
         | bryanrasmussen wrote:
         | no text is near universal because it's language
         | 
         | >Often simpler "direct" interfaces are more effective.
         | 
         | and can probably be used by people who cannot use text or
         | language.
        
       | Ruq wrote:
       | History repeats itself.
        
         | f1shy wrote:
         | Could you say more about it? Do you mean, we are going back to
         | text based terminal? Sorry. Cannot follow.
        
           | jrm4 wrote:
           | In a sense.
           | 
           | I'd argue the first major interface "breakthrough" after the
           | iPhone was Siri/Alexa et al.
        
       | whoisthemachine wrote:
       | > Seeing these quite disparate tasks being tamed under one
       | unlikely roof, we have to ask - what other difficult problems can
       | simply be transcribed into text and asked to an oracular software
       | intelligence? McIlroy must be smiling somewhere.
       | 
       | This quote gave me pause about the entire article. It sounds like
       | they're talking about some long gone philosopher, but according
       | to Wikipedia, Douglas McIlroy, while old, is still kicking [0],
       | and might be able to provide his own impression on the idea that
       | large language models have any relation to the Unix philosophy,
       | without anyone projecting their own beliefs onto him.
       | 
       | [0] https://en.wikipedia.org/wiki/Douglas_McIlroy
        
       | naillo wrote:
       | As a visual thinker I actually feel really limited by all these
       | text interfaces. I hear a lot of people with aphantasia in this
       | space though (sama, emad) so for them anything other than this
       | idea is probably unthinkable but personally it really doesn't
       | feel like the panacea that others make it out to be.
        
       | widowlark wrote:
       | really loving that the first thing in this article is a photo
        
       | owenpalmer wrote:
       | Text is a very high level construct. I wouldn't call it
       | universal.
        
       | falcolas wrote:
       | Define text. Pure ASCII or Unicode? If Unicode, how much Unicode
       | should be allowed? How do you identify the end of a Unicode text
       | stream, other than closing the input stream (or explicitly
       | waiting for an out-of-band signal like a form submission)?
        
         | mjevans wrote:
         | 8 bit clean octet transfer; any interpretation of value and/or
         | protocol other than an out of band end of stream / termination
         | is up to the programs involved.
         | 
         | Is that valid UTF-8 encoded JSON? Formatted newline-delimited
         | record sets? Some formatting of data no one's
         | thought up yet? Dunno, unless the program is instructed to do a
         | specific thing with / to the data don't care.
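
A rough sketch of the division of labour described above: the transport moves octets untouched, and any reading of them (JSON, newline-delimited records, something else) happens only in a program that chooses to look. The `interpret` function and its labels are hypothetical, and the order in which it tries the readings is an arbitrary choice.

```python
import json

def interpret(payload: bytes):
    """Try a few readings of an opaque payload; the pipe itself never does this."""
    try:
        text = payload.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8: hand the octets back unchanged.
        return ("opaque bytes", payload)
    try:
        return ("json", json.loads(text))
    except json.JSONDecodeError:
        pass
    if "\n" in text:
        return ("newline-delimited records", text.splitlines())
    return ("plain text", text)

for sample in (b'{"a": 1}', b"alpha\nbeta\n", b"hello", b"\xff\xfe"):
    print(interpret(sample))
```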
        
         | eimrine wrote:
         | > define text
         | 
         | Text interface is any non-GUI?
        
           | mostlylurks wrote:
           | There are many more types of interfaces than GUIs and text-
           | based interfaces. Voice user interfaces (VUIs), for instance,
           | which have become commonplace in recent years, especially
           | with virtual assistants like Alexa or Siri.
        
             | jrm4 wrote:
             | VUIs are just text interfaces with a different input
             | system and more error checking. They implement operations
             | that give you convenience in exchange for versatility.
        
               | mostlylurks wrote:
               | Even if the two types of interfaces share strong
               | similarities in their most commonplace usage, they are
               | nevertheless accessed through two distinct mediums (from
               | which they get their names (TUI / VUI)). Text is not
               | speech, and speech is not text, and the possibility space
               | of what can be realistically (in a human-friendly manner)
               | provided as input or output on each is quite different.
               | On VUIs, for instance, you can do things like query for a
               | song by humming it [0], which would be very difficult to
               | provide a purely textual alternative for for use in a
               | TUI, whereas text-based interfaces make it easier to do
               | things like looking at large swathes of structured
               | content or copypasting a long piece of text to use as
               | input.
               | 
               | [0]: https://blog.google/products/search/hum-to-search/
        
           | falcolas wrote:
           | GUIs can have representations of text, and text input fields
           | though...
        
           | f1shy wrote:
           | Where is the line between GUI and TUI? Or TUI and CLI?
           | 
           | Also, where does a system like Genera fit? Was it CLI, TUI
           | or GUI?!
        
         | rcoveson wrote:
         | > Define text.
         | 
         | UTF-8 for all new projects.
         | 
         | > If Unicode, how much Unicode should be allowed?
         | 
         | I think the question is more like "how much should be
         | supported", which is project dependent. If it's just a question
         | of "allowance" then the answer is probably "all of it". If
         | you're a UTF-8 processor, you shouldn't go out of your way to
         | discard or disallow certain codepoints without a good reason.
         | 
         | > How do you identify the end of a Unicode text stream, other
         | than closing the input stream (or explicitly waiting for an
         | out-of-band signal like a form submission)?
         | 
         | I think the UNIX-y answer here is "end of text stream" and
         | "closed stream" are the same. But if you do want to wrap text
         | streams in another text stream you have a couple options: HTTP
         | uses Content-Length and Transfer-Encoding: chunked
         | (length-prefixing), while programming and markup languages use
         | delimiter characters which must be escaped in the inner stream.
         | 
         | What you definitely should not do is, amusingly, what C does:
         | Reserve a text character (NUL) to use as a delimiter, and hope
         | that character doesn't appear in your content.
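
To make the framing options above concrete, here is a small sketch of length-prefixing next to the reserved-delimiter approach. The frame layout (decimal byte count, a colon, then the payload) is a netstring-like format assumed for illustration; it is not the HTTP wire format, and `frame` / `unframe` are hypothetical names.

```python
def frame(payload: bytes) -> bytes:
    """Length-prefix a payload: decimal byte count, a colon, then the bytes."""
    return str(len(payload)).encode("ascii") + b":" + payload

def unframe(stream: bytes) -> list:
    """Split a concatenation of frames back into the original payloads."""
    payloads = []
    pos = 0
    while pos < len(stream):
        colon = stream.index(b":", pos)
        length = int(stream[pos:colon].decode("ascii"))
        start = colon + 1
        payloads.append(stream[start:start + length])
        pos = start + length
    return payloads

messages = ["plain", "with\x00NUL", "with:colon", "na\u00efve"]
encoded = [m.encode("utf-8") for m in messages]

# Length-prefixing round-trips any content, including NUL and the colon itself.
stream = b"".join(frame(p) for p in encoded)
recovered = unframe(stream)

# A reserved delimiter breaks as soon as the delimiter appears in the content.
nul_pieces = b"\x00".join(encoded).split(b"\x00")
print(len(recovered), len(nul_pieces))
```

The length-prefixed stream yields the four original payloads, while the NUL-joined stream splits into five pieces because one message contains a NUL.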
        
         | jrm4 wrote:
         | "Define text" is a pretty good question, but "ASCII" v.
         | "Unicode" is frankly one of those _deeply_ nerdy things that
         | won't matter much at all in the grand scheme of all of this.
         | 
         | I suspect those familiar with pre-computer
         | languages/linguistics would be able to come up with a pretty
         | good working definition here, but something like "communication
         | systems that utilize a relatively small number of repeating
         | symbols to convey meaning, often associated with
         | vocalizations?"
        
         | akira2501 wrote:
         | > How do you identify the end of a Unicode text stream
         | 
         | "Why not just a zero byte?"
        
           | falcolas wrote:
           | You'd need 4 of them to avoid ambiguity. The C string
           | delineation model has a lot of issues we probably don't want
           | to keep. :)
        
             | rmasters wrote:
             | In some encodings? One zero is sufficient for UTF-8.
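
A quick check of the point above, as a sketch: in UTF-8 a zero byte appears only when the text contains U+0000, whereas UTF-16 and UTF-32 produce zero bytes for ordinary characters. The sample string and the `zero_bytes` helper are arbitrary choices made for this example.

```python
def zero_bytes(text: str, encoding: str) -> int:
    """Count the 0x00 bytes in the encoded form of text."""
    return text.encode(encoding).count(0)

# "hello, world" with an accented e and two CJK characters; contains no U+0000.
sample = "h\u00e9llo, \u4e16\u754c"

counts = {enc: zero_bytes(sample, enc) for enc in ("utf-8", "utf-16-le", "utf-32-le")}
print(counts)

# The only code point whose UTF-8 form contains a zero byte is U+0000 itself.
print("\x00".encode("utf-8"))
```

So a single zero byte can terminate UTF-8 text, provided U+0000 never needs to appear in the content; the wider encodings need a terminator as wide as their code unit.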
        
         | every wrote:
         | "Pure" ASCII would be 7-bit US-ASCII derived directly from the
         | teletypewriter command set. It consists of exactly 128
         | character codes, of which roughly 30% are unprintable. Anything
         | beyond that, such as code page 437, is not "pure". It should
         | also be noted that the first 128 character codes of UTF-8 are
         | "pure" 7-bit US-ASCII...
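
The figures above can be checked with a few lines. This sketch assumes "unprintable" means the control codes 0 through 31 plus DEL (127); counting the space character as well would raise the total by one.

```python
# All 128 code values of 7-bit US-ASCII.
codes = range(128)

# Assumption: "unprintable" = C0 control codes (0-31) plus DEL (127).
control = [c for c in codes if c < 32 or c == 127]
share = len(control) / len(codes)
print(len(control), f"{share:.1%}")

# Every ASCII byte decodes to the same character under UTF-8.
same_in_utf8 = all(
    bytes([c]).decode("ascii") == bytes([c]).decode("utf-8") for c in codes
)
print(same_in_utf8)
```

Under that definition the count is 33 of 128, a little over a quarter.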
        
         | anotheronebites wrote:
         | ASCII and unicode are subtly different things.
         | 
         | you should compare ASCII with one of the UTF encodings of
         | unicode.
         | 
         | the 'unicode' for ASCII's would be just the english alphabet
         | extended with digits and a few more special control characters.
        
           | falcolas wrote:
           | Constraining Unicode encodings to fewer than 4 bytes means
           | we limit how many countries can use text interfaces in their
           | language. Or how much data from those countries can be passed
           | between programs.
        
             | blueflow wrote:
             | We do not have enough countries to fill up all that space.
             | For UTF-8 with a 4-byte restriction, less than 18% of the
             | available space is currently allocated to blocks.
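
For reference, a sketch of how many bytes UTF-8 spends per code point, which is what a cap below 4 bytes would cut off. The sample characters are arbitrary, and the code does not attempt to verify the allocation percentage quoted above.

```python
def utf8_length(code_point: int) -> int:
    """Number of bytes UTF-8 uses for one code point (surrogates excluded)."""
    return len(chr(code_point).encode("utf-8"))

# Arbitrary samples: Latin A, e with acute, a CJK ideograph, an emoji.
samples = {"A": 0x41, "e-acute": 0xE9, "CJK": 0x4E16, "emoji": 0x1F600}
lengths = {name: utf8_length(cp) for name, cp in samples.items()}
print(lengths)

# Boundaries of each length class.
boundaries = [0x7F, 0x80, 0x7FF, 0x800, 0xFFFF, 0x10000, 0x10FFFF]
print([utf8_length(cp) for cp in boundaries])
```

Three bytes reach U+FFFF, the end of the Basic Multilingual Plane; everything from U+10000 up to the Unicode maximum U+10FFFF needs the fourth byte.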
        
           | giantrobot wrote:
           | That's what the GP is saying. You can't really tell _a
           | priori_ what encoding to use for a stream of "text" (bytes).
           | Without some sort of metadata about the stream you just have
           | to guess. Convention will help you make an informed guess but
           | it's not guaranteed to be correct. Then stuff breaks in
           | unexpected and stupid ways.
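
The guessing problem above is easy to reproduce. In this sketch the same bytes are decoded under three candidate encodings; the sample word and the list of candidates are arbitrary choices for illustration.

```python
# The word "cafe" with an acute accent on the e, encoded as UTF-8.
data = "caf\u00e9".encode("utf-8")

readings = {}
for encoding in ("utf-8", "latin-1", "cp437"):
    # None of these decodes raises here, so success proves nothing about correctness.
    readings[encoding] = data.decode(encoding)
print(readings)

# The reverse guess can fail loudly: Latin-1 bytes are often invalid UTF-8.
latin1_bytes = "caf\u00e9".encode("latin-1")
try:
    latin1_bytes.decode("utf-8")
    decoded_ok = True
except UnicodeDecodeError:
    decoded_ok = False
print(decoded_ok)
```

Only the UTF-8 reading gives back the original word; the other two decode without error and silently produce mojibake, which is why out-of-band metadata about the encoding matters.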
        
       | VyseofArcadia wrote:
       | Did no one read beyond the first paragraph?
       | 
       | This is an article about the effectiveness of large language
       | models, which have learned to do tasks beyond regurgitating text
       | despite only having been trained to regurgitate text.
       | 
       | As long as you are able to describe the task as a text stream and
       | are ok with having a text stream as output, GPT-3 might actually
       | be able to do the task with just a couple of examples as context.
       | I saw a talk at Black Hat about using GPT-3 as a spam filter.
       | Because GPT-3 has trained on so much spam, it knows what spam
       | looks like. It just needs a couple of examples, and off it goes.
       | 
       | > The most complicated reasoning programs in the world can be
       | defined as a textual I/O stream to a leviathan living on some
       | technology company's servers.
       | 
       | The references to the UNIX philosophy feel a bit tacked on,
       | honestly. Probably just there to grab attention.
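
A sketch of the "couple of examples as context" idea from the comment above: the whole task is expressed as one text prompt, and the answer is read back as text. Everything here is an assumption for illustration: the prompt wording, the labels, and `complete`, a hypothetical stand-in for whatever model call is used. `fake_complete` is a toy rule so the example runs without any model.

```python
def build_prompt(examples, message):
    """Lay out labeled examples, then the new message with its label left blank."""
    lines = ["Classify each message as spam or ham.", ""]
    for text, label in examples:
        lines.append(f"Message: {text}")
        lines.append(f"Label: {label}")
        lines.append("")
    lines.append(f"Message: {message}")
    lines.append("Label:")
    return "\n".join(lines)

def classify(message, examples, complete):
    """`complete` is hypothetical: it takes prompt text and returns completion text."""
    reply = complete(build_prompt(examples, message))
    words = reply.strip().lower().split()
    return words[0] if words else ""

def fake_complete(prompt):
    """Toy stand-in for a language model, used only to make the sketch runnable."""
    last_message = prompt.rsplit("Message: ", 1)[1]
    return " spam" if "free" in last_message.lower() else " ham"

EXAMPLES = [
    ("You have won a cruise, click here", "spam"),
    ("Lunch at noon tomorrow?", "ham"),
]
print(classify("FREE prize inside", EXAMPLES, fake_complete))
print(classify("See you at 3pm", EXAMPLES, fake_complete))
```

Swapping `fake_complete` for a real model call would leave `build_prompt` and `classify` unchanged, which is the sense in which text acts as the interface.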
        
         | verisimilitudes wrote:
         | I did; it's a shitty article that tries to connect the
         | braindead UNIX _philosophy_ to neural network nonsense. Of
         | course, this is more than enough to reach the front page of
         | _Hacker News_.
        
           | comfypotato wrote:
           | I lost interest in the article itself, but I'm currently down
           | a rabbit hole of one of its links [1]. It's the webpage of
           | some random (random to me that is) person, but it has quite a
           | bit of content (and links to other content). I consider this
           | find a win!
           | 
           | [1] http://www.catb.org/~esr/
        
             | foooobaba wrote:
             | When I was 12 I followed ESR's guide on "how to become a
             | hacker". I wanted to learn hacking but didn't know where to
             | start or have anyone to ask. Started with html then c++ and
             | using linux. For me, it's probably one of the most
             | important things I read, because it ultimately altered the
             | trajectory of my whole life.
             | 
             | http://www.catb.org/~esr/faqs/hacker-howto.html
        
             | 0x445442 wrote:
             | ESR is random to you? Oh youngling, much to learn you have.
        
         | [deleted]
        
         | [deleted]
        
         | stinkytaco wrote:
         | It briefly discusses DALL-E and imagine models near the end,
         | which seems to paper over a big part of research in AI right now,
         | but that also wasn't the point of the article, which is fair.
         | What I find interesting is that this reminds me how AI is being
         | trained on a very narrow set of human inputs. Images, text,
         | possibly some video, but there's a huge amount of data
         | governing human behavior that is none of those things. Facial
         | expressions, body language, tone, diet, environment, etc. etc.
         | etc. I'm sure it's coming, but it feels a long way off.
        
         | zwieback wrote:
         | The answer is "no, nobody read the article".
        
       | ackfoobar wrote:
       | The by-line caught my eye. The author is the roon that coined the
       | terms "wordcel" and "shape rotator".
        
         | merlincorey wrote:
         | So are you a wordcel or a shape rotator, apparently aka
         | mathcel? [0]
         | 
         | [0] https://knowyourmeme.com/memes/cultures/wordcel-shape-
         | rotato...
        
       | TedKroft wrote:
        
       | verisimilitudes wrote:
       | > Computer software came to maturity in the late 1960s.
       | 
       | Fucking idiots.
        
         | rdlw wrote:
         | Computer software came to maturity in 2017 with the release of
         | Fortnite.
        
       | rckrd wrote:
       | Maybe images are the universal interface. With some of the
       | advancements in ML, we have different decoders: image-to-text
       | (OCR), layout information (object recognition), and other
       | metadata (formatting, fonts, etc.).
       | 
       | Now, with diffusion-based models like Stable Diffusion and
       | DALL-E, we have an encoder - text-to-image.
       | 
       | Natural analogy to how humans perceive the world and how we've
       | designed our own human-computer interfaces.
       | 
       | [0] https://matt-rickard.com/screenshots-as-the-universal-api
       | [1] https://twitter.com/mattrickard/status/1577321709350268928
        
       ___________________________________________________________________
       (page generated 2022-10-05 23:00 UTC)