[HN Gopher] AI: First New UI Paradigm in 60 Years ___________________________________________________________________ AI: First New UI Paradigm in 60 Years Author : ssn Score : 158 points Date : 2023-06-19 18:11 UTC (4 hours ago) (HTM) web link (www.nngroup.com) (TXT) w3m dump (www.nngroup.com) | retrocryptid wrote: | <unpopular-opinion> | | Bardini's book about Doug Engelbart recaps a conversation between | Engelbart and Minsky about the nature of natural language | interfaces... that took place in the 1960s. | | AI interfaces taking so long has less to do with the technology | (I mean... Zork understood my text sentences well enough to get | me around a simulated world) and more to do with what people are | comfortable with. | | Loewy talked about MAYA (Most Advanced Yet Acceptable). I think | it's taken this long for people to be okay with the inherent | slowness of AI interfaces. We needed a generation or two of users | who traded representational efficiency for easy-to-learn | abstractions. And now we can do it again. You can code up a demo | app using various LLMs, but it takes HOURS of back and forth to | get to the point it takes me (with experience and boilerplate) | minutes to get to. But you don't need to invest in developing the | experience. | | And I encourage every product manager to build a few apps with AI | tools so you'll more easily see what you're paying me for. | | </unpopular-opinion> | EGreg wrote: | FB's AI head just said LLMs are a fad. | | I thought about how to use them... I wish they could render an | interface (HTML and JS at least, but also produce artifacts like | PowerPoints). | | What is really needed is for LLMs to produce some structured | markup, which can then be rendered as dynamic documents. Not text. | | As input, natural language is actually inferior to GUIs.
I know | the debate between command-line people and GUI people, and LLMs | would seem like they'd boost the command-line people's case, but | any powerful system would actually benefit from a well-designed | GUI. | EGreg wrote: | Here is the main reason: | | Any sufficiently advanced software has deep structure and | implementation. It isn't like a poet who can just bullshit some | rhymes and make others figure out what they mean. | | The computer program expects some definite inputs, which it | exposes as an API, e.g. a headless CMS via HTTP. | | It's similar with an organization that can provide this or that | service or experience. | | Therefore, given this rigidity, the input has limited options at | every step. And a GUI can gracefully model those limitations. A | natural language model will make you think there is a lot of | choice, but really it will boil down to a 2018-era chatbot that | gives you menus at every step and asks whether you want A, B or | C. | dlivingston wrote: | As someone who just spent 2 hours in my company's Confluence | site, trying to track down the answer to a single question that | could have been resolved in seconds by an LLM trained on an | internal corporate corpus -- LLMs are very much not a fad. | EGreg wrote: | How do you know the answer is right? | | Because it linked you to the source? | | Like a vector database would? Google has offered to index sites | since 1996. | dlivingston wrote: | We have internal search. Finding things isn't the problem. | It's contextualizing massive amounts of text and making it | queryable with natural language. | | The question I was trying to solve was -- "what is feature | XYZ? How does it work in hardware & software? How is it | exposed in our ABC software, and where do the hooks exist | to interface with XYZ?" | | The answers exist across maybe 30 different Confluence | pages, plus source code, plus source code documentation, | plus some PDFs.
If all of that were indexed by an LLM, it | would have been trivial to get the answer I spent hours | manually assembling. | JohnFen wrote: | LLMs are useful for particular types of things. | | LLMs as the solution for every, or most, problems is a fad. | croes wrote: | >Then Google came along, and anybody could search | | Then they flooded the search results with ads, and now you can | search but hardly find. | | I bet the same will happen with software like ChatGPT. | dekhn wrote: | As a demo once, I trained an object detector on some vector art | (high-quality art, made by a UX designer) that looked like | various components of burgers. I also printed the art and mounted | it on magnets and used a magnetic dry-erase board; you could put | components of a burger on the board, and a real-time NN would | classify the various components. I did it mainly as a joke when | there was a cheeseburger emoji controversy (people prefer cheese | above patty, btw). | | But while I was watching, I realized you could probably combine | this with gesture and pose detection and build a little visual | language for communicating with computers. It would be wasteful | and probably not very efficient, but it was still striking how | much object detection enabled building things in the real world | and having them input to the computer easily. | yutreer wrote: | What you imagined sounds vaguely like Dynamicland from Bret | Victor. | | https://dynamicland.org/ | | The dots around the paper are encoded programs, and you can use | other shapes, objects, or sigils that communicate with the | computer vision system. | thih9 wrote: | What about voice assistants? These are not as impressive when | compared to LLMs, so perhaps wouldn't cause a UX shift on their | own. But in essence Siri, Alexa, etc. also seem to put the user's | intent first. | Xen9 wrote: | Marvin Minsky, a genius who saw the future. | james-bcn wrote: | That website has a surprisingly boring design.
I haven't looked | at it in years, and was expecting some impressively clean and | elegant design. But it looks like a WordPress site. | JimtheCoder wrote: | I'll be honest... I like it. Boring with easily readable content | is far better than most of the other junk that is put forward | nowadays... | JohnFen wrote: | It's clear, easy to read, and easy to navigate. I wish lots | more of the web were as "boring" as this site. | Gordonjcp wrote: | You should see his old site. | ttepasse wrote: | I do have a soft spot for the very reduced design that this | site and the sister site useit.com had in the early 2000s: | | https://web.archive.org/web/20010516012145/http://www.nngrou. | .. | | https://web.archive.org/web/20050401012658/http://www.useit.. | .. | | A redesign should not have been as brutalist, but should have | kept the same spirit and personality. | brayhite wrote: | What isn't "clean" about it? | | I've found it incredibly easy to navigate and digest its | content. What more are you looking for? | johnchristopher wrote: | Maybe you could do a CSS redesign of it? You could even hold a | contest on Twitter or on blogs to compare the redesigns people | come up with. | | That could be interesting. | happytoexplain wrote: | I read this comment before clicking, and wow, oh boy do I | disagree! The information design is impressively straightforward. | I can see every feature of the site right away with no | overload or distraction from the content. There's an intuitive | distinction categorizing every page element, and I know what | everything does and how to get everywhere without having to | experiment. The fonts, spacing, groupings, and colors are all | nice-looking, purposeful, and consistent. | | I'm not exactly sure how you're using the word "boring" in this | context. There are good kinds of boring and bad kinds of | boring, and I think this is the good kind.
| alphabet9000 wrote: | yeah the site is bad, but not because it is boring, but because | it should be even more simplified than it is now. almost all | of the CSS "finishing touches" have something wrong with them. | the content shifts on page load: | https://hiccupfx.telnet.asia/nielsen.gif bizarre dropdown | button behavior: https://hiccupfx.telnet.asia/what.gif and i | can go on and on. i don't feel this nitpicky whining is | unwarranted, considering the site purports to be a leader in | user experience. | aqme28 wrote: | This is not a new UI paradigm. Virtual assistants have been doing | exactly this for years. It's just gotten cheap and low-latency | enough to be practical. | NikkiA wrote: | Yep, although they were doing it 'badly'. I guess it not being | quite so terrible is the 'new paradigm', which is eyeroll-worthy | IMO. | Bjorkbat wrote: | I really wouldn't call GUIs a "command-based paradigm". It feels | much more like they're digital analogues of tools and objects. | Your mouse is a tool, and you use it to interface with objects | and things, and through special software it can become a more | specialized tool (word processors, spreadsheets, graphic design | software, etc.). You aren't issuing commands, you're manipulating | a digital environment with tools. | | Which is why the notion of conversational AI (or whatever dumb | name they came up with for the "third paradigm") seems kind of | alien to me. I mean, I definitely see its utility, but I find it | hard to imagine it being as dominant as some are arguing it could | be. Any task that involves browsing for information seems like | more of an object-manipulation task. Any task involving some kind | of visual design seems like a tool-manipulation task, unless you | aren't too picky about the final result. | | Ultimately I think conversational UI is best suited not for | tasks, but services. Granted, the line between the two can be | fuzzy at times.
If you're looking for a website, but you don't | personally know anything about making a website, then that task | morphs into a service that someone or something else does. | | Which I suppose is kind of the other reason why I find the idea | kind of alien. I almost never use the computer for services. I | use it to browse, to create, to work, all of which entail | something more intuitively suited to object or tool manipulation. | rzzzt wrote: | AutoCAD and Rhino 3D are two examples that I remember having a | command prompt sitting proudly somewhere at the bottom of the | UI. Your mouse clicks and keyboard shortcuts were all converted | into commands in text form. If you look at your command | history, it's a script - a bit boring since it is completely | linear, but add loops, conditionals and function/macro support | and you get a very capable scripting environment. | kaycebasques wrote: | > With the new AI systems, the user no longer tells the computer | what to do. Rather, the user tells the computer what outcome they | want. | | Maybe we can borrow programming-paradigm terms here and describe | this as Imperative UX versus Declarative UX. Makes me want to | dive into SQL or XSLT and try to find more parallels. | [deleted] | webnrrd2k wrote: | I was thinking of imperative vs. declarative, too. | | SQL is declarative, with a pre-defined syntax and grammar as an | interface, whereas the AI style of interaction has a natural | language interface. | echelon wrote: | SQL and XSLT are declarative, but the outputs are clean and | intuitive. The data model and data set are probably well | understood, as is the mapping to and from the query. | | AI is a very different type of declarative. It's messy, | difficult to intuit, has more dimensionality, and the outputs | can be signals rather than tabular data records. | | It rhymes, but it doesn't feel the same. | Hedepig wrote: | The recent additions OpenAI has made allow for tighter | control over the outputs.
I think that is a very useful | step forward. | 97-109-107 wrote: | Two recent events suggest to me that this type of analytical look | at interaction modes is commonly underappreciated in the | industry. I write this partially from the perspective of a | disillusioned student of interaction design. | | 1. Recent news of vehicle manufacturers moving away from | touchscreens | | 2. The chatbot gold rush of 2018, when most businesses were sold | chatbots under the guise of cost-saving | | (edit: formatting) | p_j_w wrote: | I'm not sure I understand point 1 here. Do you mean that | vehicle manufacturers moving away from touchscreens is bad, or | that they would never have moved to them in the first place if | they had properly investigated the idea? | 97-109-107 wrote: | The latter - had they given proper thought to the | consequences of moving to touchscreens, they would've never | gone there. Obviously I'm generalizing and discarding the | impact of novelty on sales and marketing. | EGreg wrote: | It seems everyone is in a rush to LLMify their interfaces, | same as the chatbot rush. Same as the blockchain-all-the-things | rush. And so on. | | I thought about interfaces a lot and realized that, for | most applications, a well-designed GUI and API is | essential. For composability, standards can be developed. | LLMs are good for generating instructions in a | language, which can be sort of finagled into API | instructions. They can then lower the bar of needing to be an | expert in a specific GUI or API, and might open up | more abilities for people. | | Well, and for artwork, LLMs can do a lot more. They can | give even experts a sort of superhuman access to models | that are "smooth" or "fuzzy" rather than rigid and angular. | They can write a lot of vapid bullshit text, for instance, | or make a pretty believable photo effect that works for | most people! | tobr wrote: | Well, what counts as a "paradigm"? I can't see any definition of | that.
If you asked 10 people to divide the history of UI into | some number of paradigms, you would for sure get 10 different | answers. But hey, why not pick the one that makes for a | hyperbolic headline. Made me click. | [deleted] | savolai wrote: | The division does not seem arbitrary to me at all. What about | the below is questionable to you? | | From sibling comment [1]: | | Nielsen is talking from the field of Human-Computer Interaction, | where he is a pioneer, which deals with the point of view of | human cognition. In terms of the logic of UI mechanics, what | about mobile is different? Sure, gestures and touch UI bring a | kind of difference. Still, from the standpoint of cognition, | desktop and mobile UIs have fundamentally the same cognitive | dynamics. Command-line UIs make you remember commands by heart; | GUIs make you select from a selection offered to you, but they | still do not understand your intention. AI changes the paradigm, | as it is ostensibly able to understand intent, so there is no | deterministic selection of available commands. Instead, the | interaction is closer to collaboration. | | 1: https://news.ycombinator.com/item?id=36396244 | jl6 wrote: | Is it a new paradigm, or an old paradigm that finally works? | | Users have been typing commands into computers for decades, | getting responses of varying sophistication with varying degrees | of natural language processing. Even the idea of an "AI" chatbot | that mimics human writing is decades old. | | The new thing is that the NLP now has some depth to it. | andrewstuart wrote: | I would have said ChatGPT's interface is a descendant of Infocom | adventure games, which are themselves descendants of Colossal | Cave. | | When using ChatGPT it certainly evokes the same feeling. | | Maybe this guy never played Adventure. | yencabulator wrote: | * * * | kenjackson wrote: | I grew up playing Infocom games, and ChatGPT is nothing like an | Infocom game. The only thing they share is that the UI is | based on text.
Infocom games were mostly about trying to | figure out what command the programmer wanted you to do next. | Infocom games were closer to Dragon's Lair than ChatGPT, | although ChatGPT "looks" more similar. | andrewstuart wrote: | Both Infocom adventures and ChatGPT have a text-based | interface in which you interact with the software as though | you were interacting with a person. You tell the software the | outcome you want using natural language, and it responds to | you in the first person. That is a common UI paradigm. | | Example: "get the cat then drop the dog then open the door, | go west and climb the ladder" - that is a natural language | interface, which is what ChatGPT has. In both the Infocom | and ChatGPT cases, the software will respond to you in the | first person as though you were interacting with someone. | | >> Infocom games were closer to Dragon's Lair than ChatGPT | | This is a puzzling comment. The UI for Zork has nothing at | all to do with Dragon's Lair. In fact, Dragon's Lair was | possibly the least interactive of almost all computer games - | it was essentially an interactive movie with only the most | trivial user interaction. | | >> Infocom games were mostly about trying to figure out what | command the programmer wanted you to do next. | | This was not my experience of Infocom adventures. | kenjackson wrote: | Does natural language simply mean using words? Is SQL natural | language? I think what makes it a natural language is that | it follows natural language rules, which Infocom games | surely did not. | | Furthermore, Infocom games used basically 100% precanned | responses. They would do rudimentary things like check if | a window was open, so if you looked at a wall it might say | the window on that wall was open or closed, but that's it. | I don't understand how that can make it a natural language | interface. | | > This is a puzzling comment. The UI for Zork has nothing | at all to do with Dragon's Lair.
| | In both games there's a set path you follow. If you follow | those commands you win; if not, you lose. There's no | semantically equivalent way to complete the game. | | I remember spending most of my time with Infocom games | doing things like "look around the field" and it telling me | "I don't know the word field" -- and I'm screaming because | it just told me I'm in an open field! The door is | blocked... blocked with what?! You can't answer me that?! | | There was a set of commands and objects it wanted you to | interact with. That's it. That's not natural language, any | more than SQL is. It's a structured language with commands | that look like English verbs. | abecedarius wrote: | I think you're mixing Infocom up with some of the much | cruder adventure games of the time. Or maybe remembering | an unrepresentative Infocom game or part of one. | | Not to say Infocom included AI. They just used a lot of | talent and playtesting to make games that _felt_ more | open-ended. | kenjackson wrote: | No. I actually went and played Zork again to be sure. | Hitchhiker's Guide to the Galaxy had me pulling my hair out | as a kid. It was definitely Infocom. | | I also, as a kid, wrote a lot of Infocom-style games, so | I can appreciate how good a job they did. But I've | also looked at their source code, since it has all been | released, and I wasn't too far behind them. | api wrote: | I'd argue that multi-touch gestural mobile phone and tablet | interfaces were different enough from mouse and keyboard to be | considered a new paradigm. | karaterobot wrote: | I'd have multi-touch be a sidebar in the textbook, but not a | new section. Gestural interaction is not fundamentally | different from a pointer device: it doesn't allow meaningful | new behavior. It is sometimes a more intuitive way to afford | the same behavior, though. I would agree that portable devices | amount to a new paradigm in _something_--maybe UX--but not UI | _per se_.
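The fixed-vocabulary point made above about Infocom-style games can be sketched in a few lines of Python. This is a toy illustration only, not Infocom's actual parser (which was far more elaborate, written in ZIL); the vocabulary here is invented. The closed-world behavior — any word outside the hand-authored list gets a canned rejection — is the point.

```python
# Toy adventure-game parser: a fixed vocabulary, canned rejections for
# everything else. Words outside VERBS/NOUNS are simply unknown to it.
VERBS = {"look", "open", "take", "go"}
NOUNS = {"door", "field", "mailbox", "west"}
FILLER = {"the", "a", "at"}  # articles the parser silently skips

def parse(command: str) -> str:
    for word in command.lower().split():
        if word in FILLER:
            continue
        if word not in VERBS and word not in NOUNS:
            return f'I don\'t know the word "{word}".'
    # A real game would dispatch the (verb, noun) pair to a handler here.
    return "OK."
```

So `parse("open door")` succeeds, while `parse("look around the field")` fails on "around" — even though the game knows "field" — which is exactly the frustration described in the thread.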
| zeroonetwothree wrote: | It allows manipulations that are impossible with single touch | (like a mouse). It's pretty big for things like 3D | manipulation. | dlivingston wrote: | You can do all of those multi-touch manipulations on a | Macintosh trackpad (zoom, pan, rotate, scale, etc.). | However, that trackpad would still be categorized as a form | of mouse -- correctly, in my opinion. | | All of these gestures can be (and _are_, given that 3D | modeling is historically done on desktop) handled with a | standard mouse using a combination of the scroll wheel and | modifier keys. | travisgriggs wrote: | GPT-based UIs are inspired by the idea that if you get the right | sequence of prompts, you'll get stochastically acceptable results. | | So now I'm imagining the horror predictions for Word where 90% of | the screen was button bars. But the twist is that you type in | some text and then click on "prompt" buttons repeatedly, hoping to | get the document formatting you wanted, probably settling for | something that was "close enough" with a shrug. | afavour wrote: | Weren't voice assistants a new UI paradigm? Also, tellingly, they | turned out to be nowhere near as useful as people hoped. | Sometimes new isn't a good thing. | golemotron wrote: | > Summary: AI is introducing the third user-interface paradigm in | computing history, shifting to a new interaction mechanism where | users tell the computer what they want, not how to do it -- thus | reversing the locus of control. | | Like every query language ever. | | I'm not sure the distinction between things we are searching for | and things we're actively making is as sharp as the author | thinks. | karaterobot wrote: | In your view, then, is AI best described as an incremental | improvement over (say) SQL in terms of the tasks it enables | users to complete? | golemotron wrote: | Incremental improvement over Google search.
And, it's not | about the tasks that it enables users to complete; it is | about the UI paradigm, as per the article. | karaterobot wrote: | Sorry for the confusion, I just view UI as being basically | synonymous with task completion: in the end, the user | interface is the set of tools the system gives users to | complete tasks. | | Since the Google search interface is meant to look like | you're talking to an AI, and probably has a lot of what | we'd call AI under the hood to turn natural language | prompts into a query, I'm not surprised you view it as an | incremental improvement at best. | Klathmon wrote: | But this is basically the absence of a query syntax: a way to | query via natural language, and not just get back a list of | results but have it almost synthesize an answer. | | To everyone who isn't a software developer, this is a new | paradigm with computers. Hell, even for me as a software dev | it's pretty different. | | Like, I'm not asking Google to find me info that I can then read | and grok; I'm asking something like ChatGPT for an answer | directly. | | It's the difference between querying for "documentation for | eslint" and instead asking "how do you configure eslint errors | to warnings" or even "convert eslint errors to warnings for me | in this file". | | It's a very different mental approach to many problems for me. | golemotron wrote: | For years I've just typed questions, in English, into browser | search bars. It works great. Maybe that is why it doesn't | seem like a new paradigm to me. | visarga wrote: | Search engines like Google + countless websites outshine | LLMs, and they've been around for a good 20 years. What's | the added value of an LLM that you can't get with Google | coupled with the internet? | | Oh, yes, websites like HN, Reddit & forums create spaces | where you can ask experts for targeted advice. People >> | GPT; we could already ask people for help before we met | GPT-4.
You can always find someone available to answer you | online, and it's free. | | It is interesting to notice that after 20 years of "better | than LLM" resources being available for free, there was no job | crash. | sp332 wrote: | Or constraint-based programming, where some specification is | given for the end result and the computer figures out how to | make it happen. But that's usually a programming thing, and UIs | with that kind of thing are rare. | | But I wouldn't say they were nonexistent for 60 years. | a1371 wrote: | I don't really get this. The paradigm has always been there; it | has been the technology limitations that have defined the UI so | far. Having robots and computers that humans talk to has been a | fixture of sci-fi movies. Perhaps the most notable example is | 2001: A Space Odyssey, which came out 55 years ago. | moffkalast wrote: | Sure, but it's sort of like how actually usable and economical | flying cars would be a paradigm change for transport. The idea | exists, but it's made-up fairy magic with capabilities and | limitations based on plot requirements. Once it's actually made | real, it hardly ever ends up being used the way it was imagined. | | Like, for example, the video call tech in 2001. They figured it | would be used like a payphone with a cathode-ray tube, lol. Just | as in reality nobody in their right mind would hand over complete | control of a trillion-dollar spaceship to a probabilistic LLM. The | end applications will be completely different and cannot be | imagined by those limited by the perspective of their time. | layoric wrote: | I built a proof of concept recently that tries to show a generic | hybrid of command and intent[0]. The UI generates form | representations of API calls the AI agent has decided on making | to complete the task (in this case booking a meeting). Some API | calls are restricted so only a human can make them, which they do | by being presented with a form waiting for them to submit to | continue.
| | If the user is vague, the bot will ask questions and try to | discover the information it needs. It's only a proof of concept, | but I think it's a pattern I will try to build on, as it can | provide a very flexible interface. | | [0] https://gptmeetings.netcore.io/ | danielvaughn wrote: | Interesting to bundle both CLI and GUI under the "command"-based | interaction paradigm. I've never heard it described that way, but | it does make sense intuitively. Is that a common perception? I | think of the development of the mouse/GUI as a very significant | event in the history of computing interfaces. | zgluck wrote: | When you zoom out on the time scale it makes more sense. I | think he's got a point. Both CLIs and GUIs are "command-based". | LLM prompts are more declarative. You describe what you want. | EGreg wrote: | Well, LLMs are also "command-based". They are called prompts. | In fact, they'd just continue the text, but were specifically | trained by RLHF to be command-following. | | Actually, we have been able to make autonomous agents and | agentic behavior without LLMs very well, for decades. And we | can program them with declarative instructions much more | precisely than with natural language. | | The thing LLMs seem to do is just give non-experts a lot of | the tools to get some basic things done that until now only | experts could do. This has to do with the LLM modeling the | domain space and reading what experts have said thus far, and | allowing a non-expert to kind of handwave and produce | results. | zgluck wrote: | (I added a bit to the comment above, sorry) | | I think there's a clear difference between a command and a | declaration. Prompts are declarative. | AnimalMuppet wrote: | I've been at a SQL command prompt a decade or several before | LLMs. | zgluck wrote: | That is not the point here. Did you at any point believe that | you were experiencing a mass-market user experience at | those times?
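The command-versus-declaration distinction being debated here is the classic imperative/declarative split from programming. A minimal sketch of the contrast, using Python's built-in sqlite3 (the table and data are invented for illustration):

```python
import sqlite3

rows = [("alice", 30), ("bob", 25), ("carol", 35)]

# Imperative: tell the computer *how* -- loop, test, append.
adults_imperative = []
for name, age in rows:
    if age >= 30:
        adults_imperative.append(name)

# Declarative: tell the computer *what* -- the engine plans the steps.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE people (name TEXT, age INTEGER)")
db.executemany("INSERT INTO people VALUES (?, ?)", rows)
adults_declarative = [n for (n,) in
                      db.execute("SELECT name FROM people"
                                 " WHERE age >= 30 ORDER BY name")]
```

Both produce the same names, but only the SQL version leaves the execution plan to the machine. An LLM prompt pushes further along the same axis: the "what" is stated in natural language, with no fixed grammar at all.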
| AnimalMuppet wrote: | I was experiencing something _declarative_ at that point. | | What's your actual position? Is "declarative" the | relevant piece, or is it "mass-market user experience"? | zgluck wrote: | My point here is that what the Nielsen Norman Group deals with | is "mass-market user experience". This has been very clear | for a very long time. | krm01 wrote: | The article fails to grasp the essence of what UI is actually | about. I agree that AI is adding a new layer to UI and UX design. | In our work [1] we have seen an increase in AI projects or | features over the last 12 months (for obvious reasons). | | However, the way that AI will contribute to better UI is by | removing parts of the interface, not simply giving it a new form. | | Let me explain: the ultimate UI is no UI. In a perfect scenario, | you think about something (want pizza) and you have it (eating | pizza) as instantly as you desire. | | Obviously this isn't possible, so the goal of interface design is | to find the least amount of things needed to get you from point A | to the desired destination as quickly as possible. | | Now, with AI, you can start to add a level of predictive | interfaces, where you can use AI to remove steps that would | normally require users to do something. | | If you want to design better products with AI, you have to | remember that product design is about subtracting things, not | adding them. AI is a technology that can help with that. | | [1] https://fairpixels.pro | andsoitis wrote: | > Let me explain: the ultimate UI is no UI. In a perfect | scenario, you think about something (want pizza) and you have | it (eating pizza) as instantly as you desire. | | That doesn't solve for discovery. For instance, order the pizza | from _where_? What _kinds_ of pizza are available? I'm kinda in | the mood for pizza, but not dead set on it, so curious about | other cuisines too. Etc.
| legendofbrando wrote: | The goal ought to be as little UI as possible, nothing more and | nothing else. | didgeoridoo wrote: | I hate to appeal to authority, but I am fairly sure that _Jakob | Nielsen_ grasps the essence of what UI is actually about. | JohnFen wrote: | > the goal of interface design is to find the least amount of | things needed to get you from point A to the desired | destination as quickly as possible. | | That shouldn't be the primary goal of user interfaces, in my | opinion. The primary goal should be to allow users to interface | with the machine in a way that allows maximal understanding | with minimal cognitive load. | | I understand a lot of UI design these days prioritizes the sort | of "efficiency" you're talking about, but I think that's one of | the reasons why modern UIs tend to be fairly bad. | | Efficiency is important, of course! But (depending on what tool | the UI is attached to) it shouldn't be the primary goal. | krm01 wrote: | > The primary goal should be to allow users to interface with | the machine in a way that allows maximal understanding with | minimal cognitive load. | | If you use your phone, is your primary goal to interface with | it in a way that allows maximal understanding with minimal | cognitive load? | | I'm pretty sure that's not the case. You go read the news, | send a message to a loved one, etc. There's a human need that | you're aiming to fulfill. Interfacing with tech is not the | underlying desire. It's what happens on the surface, as a | means. | JohnFen wrote: | > If you use your phone, is your primary goal to interface | with it in a way that allows maximal understanding with | minimal cognitive load? | | Yes, absolutely. That's what makes user interfaces | "disappear". | | > Interfacing with tech is not the underlying desire. | | Exactly. That's why it's more important that a UI present | minimal cognitive load than that it offer the fewest steps to | do a thing.
| savolai wrote: | It seems rather obvious to me that when Nielsen talks | about AI enabling users to express _intent_, that naturally | lends itself to being able to remove steps that were there only | due to the nature of the old UI paradigm. Not sure what new | essence you're proposing? "The best UI is no UI" is a well-known | truism in HCI/Human-Centered Design. | tin7in wrote: | I agree that chat UI is not the answer. It's a great start and a | very familiar UI, but I feel this will default to more traditional | UI that shows pre-defined actions and buttons depending on the | workflow. | Animats wrote: | This article isn't too helpful. | | There have been many "UI paradigms", but the fancier ones tended | to be special-purpose. The first one worthy of the name was for | train dispatching: General Railway Signal's NX (eNtry-eXit) | system.[1] Introduced in 1936, it is still in use in the New York | subways. With NX, the dispatcher routing an approaching train | selected the "entry" track on which the train was approaching. | The system would then light up all possible "exit" tracks from | the junction. This took into account conflicting routes already | set up and trains present in the junction. Only reachable exits | lit up. The dispatcher pushed the button for the desired exit. | The route setup was then automatic. Switches moved and locked | into position, then signals along the route went to clear. All | this was fully interlocked; the operator could not request | anything unsafe. | | There were control panels before this, but this was the first | system where the UI did more than just show status. It actively | advised and helped the operator. The operator set the goal; the | system worked out how to achieve it. | | Another one I encountered was an early computerized fire | department dispatching system. Big custom display boards and | keyboards. When an alarm came in, it was routed to a dispatcher.
| Based on location, the system picked the initial resources | (trucks, engines, chiefs, and special equipment) to be | dispatched. Each dispatcher had a custom keyboard, with one | button for each of those resources. The buttons lit up indicating | the selected equipment. The dispatcher could add additional | equipment with a single button push, if the situation being | called in required it. Then they pushed one big button, which set | off alarms in fire stations, printed a message on a printer near | the fire trucks, and even opened the doors at the fire house. | There was a big board at the front of the room which showed the | status of everything as colored squares. The fire department | people said this cut about 30 seconds off a dispatch, which, in | that business, is considered a big win. | | Both of those are systems which had to work right. Large language | models are not even close to being safe to use in such | applications. Until LLMs report "don't know" instead of | hallucinating, they're limited to very low-risk applications such | as advertising and search. | | Now, the promising feature of LLMs in this direction is the | ability to use the context of previous questions and answers. | It's still query/response, but with enough context that the user | can gradually make the system converge on a useful result. Such | systems are useful for "I don't know what I want but I'll know it | when I see it" problems. This allows using flaky LLMs with human | assistance to get a useful result. | | [1] https://online.anyflip.com/lbes/vczg/mobile/#p=1 | philovivero wrote: | > Both of those are systems which had to work right. Large | language models are not even close to being safe to use in such | applications. Until LLMs report "don't know" instead of | hallucinating, they're limited to very low-risk applications | such as advertising and search. | | Are humans limited to low-risk applications like that? 
| | Because humans, even some of the most humble, will still assert | things they THINK are true, but are patently untrue, based on | misunderstandings, faulty memories, confused reasoning, and a | plethora of others. | | I can't count the number of times I've had conversations with | extremely well-experienced, smart techies who just spout off the | most ignorant stuff. | | And I don't want to count the number of times I've personally | done that, but I'm sure it's >0. And I hate to tell you, but | I've spent the last 20 years in positions of authority that | could have caused massive amounts of damage not only to the | companies I've been employed by, but a large cross-section of | society as well. And those fools I referenced in the last | paragraph? Same. | | I think people are too hasty to discount LLMs, or LLM-backed | agents, or other LLM-based applications because of their | limitations. | | (Related: I think people are too hasty to discount the | catastrophic potential of self-modifying AGI as well) | cmiles74 wrote: | In the train example, the UI is in place to prevent a person | from making a dangerous route. I think the idea here is that | an LLM cannot take the place of such a UI, as LLMs are | inherently unreliable. | ra wrote: | I wholeheartedly agree with the main thrust of your comment. | Care to expand on your (related: potential catastrophe) | opinion? | ignoramous wrote: | Hallucinations will be tamed, I think. Only a matter of time | (~3 to 5 years [0]) given the amount of research going into it? | | With that in mind, ambient computing has always _threatened_ to | be the next frontier in Human-Computer Interaction. Siri, | Google Assistant, Alexa, and G Home predate today's LLM hype. | Dare I say, the hype is real. | | As a consumer, GPT4 has shown capabilities far beyond whatever | preceded it (with the exception of Google Translate). 
And from | what Sam has been saying in the interviews, newer multi-modal | GPTs are going to be _exponentially_ better: | https://youtube.com/watch?v=H1hdQdcM-H4s&t=380s | | [0] | https://twitter.com/mustafasuleymn/status/166948190798020608... | Animats wrote: | > Hallucinations will be tamed. | | I hope so. But so far, most of the proposals seem to involve | bolting something on the outside of the black box of the LLM | itself. | | If medium-sized language models can be made hallucination-free, | we'll see more applications. A base language model that | has most of the language but doesn't try to contain all human | knowledge, plus a special-purpose model for the task at hand, | would be very useful if reliable. That's what you need for | car controls, customer service, and similar interaction. | TeMPOraL wrote: | > _But so far, most of the proposals seem to involve | bolting something on the outside of the black box of the | LLM itself._ | | This might be the only way. I maintain that, if we're | making analogies to humans, then LLMs best fit as the | equivalent of one's inner voice - the thing sitting at the | border between the conscious and the (un/sub)conscious, | which surfaces thoughts in the form of language - the "stream | of consciousness". The instinctive, gut-feel responses | which... you typically don't voice, because they tend to | _sound_ right but usually aren't. Much like we do extra | processing, conscious or otherwise, to turn that stream of | consciousness into something reasonably correct, I feel the | future of LLMs is to be a component of a system, surrounded | by additional layers that process the LLM's output, or do a | back-and-forth with it, until something reasonably certain | and free of hallucinations is reached. | ra wrote: | Karpathy explained how LLMs can retrospectively assess | their own output and judge if they were wrong. 
| | Source: https://www.youtube.com/watch?v=bZQun8Y4L2A&t=1607s | swalling wrote: | I think the question is whether tamping down hallucinations | (and other massive problems, like how slow agents are) can | happen on a fast enough time scale to make general-purpose | generative systems like ChatGPT viable for real everyday use | beyond generating first-draft text blobs? | | It seems distinctly possible that this ends up like self-driving | cars, i.e. stalled out at level 3 type autonomy under | realistic driving circumstances, and the lack of forward | progress sucks the oxygen out of the investment cycle for a | long time. | | Unlike self-driving vehicles, there are commercially viable | use cases for the equivalent of level 3 type autonomy that | requires close human supervision (for instance, processing | large legal documents for review with risky clauses flagged | for a lawyer / expert analyst). | | Most people shifting to expect interacting with the digital | world primarily through AI as an interface is a much, much | higher bar though, and that's really what a UI paradigm shift | would look like, as opposed to applications specific to | very particular industries and tasks. | PheonixPharts wrote: | > Hallucinations will be tamed, I think. | | I don't think that's likely unless there was a latent space | of "Truth" which could be discovered through the right model. | | That would be a far more revolutionary discovery than anyone | can possibly imagine. For starters, the last 300+ years of | Western Philosophy would be essentially proven unequivocally | wrong. | | edit: If you're going to downvote this please elaborate. LLMs | currently operate by sampling from a latent semantic space | and then decoding that back into language. In order for | models to know the "truth", there would have to be a latent | space of "true statements" that was effectively directly | observable. 
All points along that surface would represent | "truth" statements, and that would be the most radical human | discovery in the history of the species. | Animats wrote: | > I don't think that's likely unless there was a latent | space of "Truth" which could be discovered through the | right model. | | For many medium-sized problems, there is. "Operate car | accessories" is a good example. So is "book travel". | TeMPOraL wrote: | There may not be a surface directly encoding the "truth" | value, but unless we assume that the training data LLMs are | trained on are entirely uncorrelated with the truth, there | should be a surface that's _close enough_. | | I don't think the assumption that LLM training data is | random with respect to truth value is reasonable - people | don't write random text for no reason at all. Even if the | current training corpus was too noisy for the "truth | surface" to become clear - e.g. because it's full of | shitposting and people exchanging their misconceptions | about things - a better-curated corpus should do the trick. | | Also, I don't see how this idea would invalidate the last | couple centuries of Western philosophy. The "truth | surface", should it exist, would not be following some | innate truth property of statements - it would only be | reflecting the fact that the statements used in training | were positively correlated with truth. | jart wrote: | When you say train dispatching and control panels, I think | you've illustrated how confused this whole discussion is. There | should be a separate term called "operator interface" that is | separate from "user interface" because UIs have never had any | locus of control, because they're for users, and operators are | the ones in control. Requesting that an LLM do something is | like pressing the button to close the doors of an elevator. Do | you feel in charge? 
| Animats wrote: | _UIs have never had any locus of control, because they're for | users, and operators are the ones in control._ | | Not really any more. The control systems for almost | everything complicated now look like ordinary desktop or | phone user interfaces. Train dispatching centers, police | dispatching centers, and power dispatching centers all look | rather similar today. | jart wrote: | That's because they're computer users. | TeMPOraL wrote: | Oh my. This is the first time I've seen this kind of | distinction between "users" and "operators" in the context of | a single system. I kind of always assumed that "operator" is | just a synonym for "user" in industries/contexts that are | dealing with tools instead of toys. | | But this absolutely makes sense, and it is a succinct | description of the complaints some of us frequently make | about modern UI trends: bad interfaces are the ones that make | us feel like "users", where we expect to be "operators". | jart wrote: | Oh snap, did I just pull back the curtain? | vsareto wrote: | >And if you're considering becoming a prompt engineer, don't | count on a long-lasting career. | | There's like this whole class of technical jobs that only follow | trends. If you were an en vogue blockchain developer, this is | your next target if you want to remain trendy. It's hard to care | about this happening as the technical debt incurred will be | written off -- the company/project isn't ingrained enough in | society to care about the long-term quality. | | So best of luck, ye prompt engineers. I hope you collect | multi-hundred-thousand-dollar salaries and retire early. | DebtDeflation wrote: | Not sure I would lump command line interfaces from circa 1964 | with GUIs from 1984 through to the present, all in a single | "paradigm". That seems like a stretch. | [deleted] | [deleted] | mritchie712 wrote: | Agreed. 
| | Also, Uber (and many other mobile apps) wouldn't work as a CLI | or desktop GUI, so leaving out mobile is another stretch. | savolai wrote: | That seems like a technology-centered view. Nielsen is | talking from the field of Human-Computer Interaction, where he | is a pioneer, which deals with the point of view of human | cognition. In terms of the logic of UI mechanics, what about | mobile is different? Sure, gestures and touch UIs bring a kind | of difference. Still, from the standpoint of cognition, | desktop and mobile UIs have fundamentally the same cognitive | dynamics. Command-line UIs make you remember commands by | heart; GUIs make you select from a selection offered to you, | but they still do not understand your intention. AI changes | the paradigm as it is ostensibly able to understand _intent_, | so there is no deterministic selection of available commands. | Instead, the interaction is closer to collaboration. | JohnFen wrote: | Why wouldn't apps like Uber work on the desktop? | wbobeirne wrote: | > With this new UI paradigm, represented by current generative | AI, the user tells the computer the desired result but does not | specify how this outcome should be accomplished. | | This doesn't seem like a whole new paradigm; we already do that. | When I hit the "add comment" button below, I'm not specifically | instructing the web server how I want my comment inserted into a | database (if it even is a database at all.) This is just another | abstraction on top of an already very tall layer of abstractions. | Whether it's AI under the hood, or a million monkeys with a | million typewriters, it doesn't change my interaction at all. | waboremo wrote: | Yeah, I would agree with this: the article really struggles to | classify the different paradigms, and due to this the | conclusion winds up not holding true. We're still relying on | "batch processing". 
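savolai's command-based vs. intent-based distinction can be made concrete with a short sketch. This is a toy illustration with hypothetical names, not anyone's actual design; a real intent-based system would use an LLM to infer the plan rather than a lookup table:

```python
# Toy contrast between the two paradigms discussed above.
# Command-based: the user issues each step; the system obeys literally.
# Intent-based: the user states the outcome; the system derives the steps.

def command_based(canvas, commands):
    """Apply explicit commands one at a time, exactly as given."""
    for cmd, arg in commands:
        if cmd == "add_layer":
            canvas.append(arg)
        elif cmd == "remove_layer":
            canvas.remove(arg)
    return canvas

def intent_based(intent):
    """Resolve a stated outcome into steps. A lookup table stands in
    for the inference a real AI system would perform."""
    plans = {"sunset over mountains": ["sky", "sun", "mountains"]}
    steps = plans.get(intent, [])
    return command_based([], [("add_layer", s) for s in steps])

# Same result either way; only the shape of the interaction differs.
explicit = command_based([], [("add_layer", "sky"),
                              ("add_layer", "sun"),
                              ("add_layer", "mountains")])
inferred = intent_based("sunset over mountains")
assert explicit == inferred == ["sky", "sun", "mountains"]
```

The user of the first function must know which commands exist and in what order to issue them; the user of the second states only the goal, which is the shift the article describes.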
| Timon3 wrote: | I think the important part from the article that establishes | the difference is this: | | > As I mentioned, in command-based interactions, the user | issues commands to the computer one at a time, gradually | producing the desired result (if the design has sufficient | usability to allow people to understand what commands to issue | at each step). The computer is fully obedient and does exactly | what it's told. The downside is that low usability often causes | users to issue commands that do something different than what | the users really want. | | Let's say you're creating a new picture from nothing in | Photoshop. You will have to build up your image layer by layer, | piece by piece, command by command. Generative AI does the same | in one stroke. | | Something similar holds for your comment: you had to navigate | your browser (or app) to the comment section of this article, | enter your comment, and click "add comment". With an AI system | with good usability you could presumably enter "write the | following comment under this article on HN: ...", and have your | comment be posted. | | The difference lies on the axis of "power of individual | commands". | andsoitis wrote: | > Generative AI does the same in one stroke. | | But it isn't creating what I had in mind, or envisioned, if | you will. | pavlov wrote: | With a proper AI system you don't even need to specify the | exact article and nature of the comment. | | For example here's the prompt I use to generate all my HN | comments: | | "The purpose of this task is to subtly promote my | professional brand and gain karma points on Hacker News. | Based on what you know about my personal history and my | obsessions and limitations, write comments on all HN front | page articles where you believe upvotes can be maximized. | Make sure to insert enough factual errors and awkward | personal details to maintain plausibility. Report back when | you've reached 50k karma." 
| | Working fine on GPT-5 so far. My... I mean, its 8M context | window surely helps to keep the comments consistent. | blowski wrote: | If I had a spectrum of purely imperative on one side and purely | declarative on the other, these new AIs are much closer to the | latter than anything that has come before them. | | SQL throws errors if you don't write in very specific language. | These new AIs will accept anything and give it their best shot. | roncesvalles wrote: | But that's just a change in valid input cardinality at the | cost of precision. | kaycebasques wrote: | There's something ironic to me about the fact that building AI | experiences still requires the first computing paradigm: batch | processing. At least, my experience building a retrieval-augmented | generation system requires a lot of batch processing. | | Well, I shouldn't say "requires". I'm sure you can build them | without batch processing. But batch processing definitely felt | like the most natural and straightforward way to do it in my | experience. | marysnovirgin wrote: | The usability of a system is mostly irrelevant. The measure of a | good UI is how much money it can get the user to spend, not how | intuitively it enables the user to achieve a task. | isoprophlex wrote: | "intent-based outcome specification"... so, a declarative | language such as SQL? | zgluck wrote: | While it was initially meant as a user interface layer of sorts, | I think it's not really something that the typical user can be | expected to know nowadays. ___________________________________________________________________ (page generated 2023-06-19 23:00 UTC)
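As an aside on kaycebasques's point above: the batch-processing step in a retrieval-augmented generation system typically means embedding the whole corpus offline, then answering queries by similarity lookup. The sketch below is hypothetical and self-contained; `embed` is a toy bag-of-letters function standing in for a real embedding model.

```python
import math

# Sketch of the offline batch step in a RAG pipeline: embed every
# document ahead of time, then answer queries by nearest-neighbor lookup.
# embed() is a toy stand-in; a real system would call an embedding model.

def embed(text):
    """Toy letter-frequency embedding, unit-normalized (illustration only)."""
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - ord("a")] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def batch_index(docs):
    # The batch-processing part: embed the whole corpus up front.
    return [(doc, embed(doc)) for doc in docs]

def retrieve(index, query):
    # Cosine similarity; vectors are already unit-normalized.
    q = embed(query)
    return max(index, key=lambda pair: sum(a * b for a, b in zip(pair[1], q)))[0]

index = batch_index(["confluence page about billing",
                     "source code for the auth module"])
best = retrieve(index, "how does billing work?")
```

A production system would swap `embed` for calls to an embedding model and keep the vectors in a vector database, but the batch shape of the indexing step stays the same.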