[HN Gopher] Demo of =GPT3() as a spreadsheet feature ___________________________________________________________________ Demo of =GPT3() as a spreadsheet feature Author : drewp Score : 253 points Date : 2022-10-31 19:45 UTC (2 days ago) (HTM) web link (twitter.com) (TXT) w3m dump (twitter.com) | ninefathom wrote: | Fascinating and terrifying all at once. | | Cue the Fry "I'm scare-roused" meme... | swyx wrote: | also previously from 2020 | https://twitter.com/pavtalk/status/1285410751092416513?s=20&... | orblivion wrote: | Could we hook GPT3 up to our dating apps? On both sides. That way | we can just let the computers do the small talk and if they hit | it off we can meet. | wesleyyue wrote: | Also check out https://usedouble.com (YC W23) if you're | interested in using something like this today. | | Note: I'm the founder :) Happy to answer any questions. | | Reply below with some sample data/problem and I'll reply with a | demo to see if we can solve it out of the box! | trialskid86 wrote: | Just signed up. How long is the wait list? | conductr wrote: | > Get this AI tool for FREE. Next waitlist closes in: | | > 0 day 7 hour 31 min 42 sec | | I've never seen rolling waitlists, it's kind of strange tbh | pbmango wrote: | There is huge potential for language models to get close to | messy text problems (many of which are in Excel and Sheets). I am | the founder of promptloop.com - the author of this tweet has been | an early user. | | The challenge in making something like this, or Copilot / | Ghostwriter, work well is about meeting users where they are. | Spreadsheet users don't want to deal with API keys or know what | temperature is - but anyone (like this tweet) can set up direct | API use with generic models in 10 minutes. This document has all | the code to do so ;). [1] | | For non-engineers - or folks who need a reliable and familiar | syntax to use at scale and across their org - promptloop [2] is | the best way to do that.
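For readers curious what that "direct API use in 10 minutes" looks like, here is a minimal Google Apps Script sketch of a =GPT3() custom function. It is a sketch only: the completions endpoint and model name follow OpenAI's public API, while the "OPENAI_KEY" script property and the helper name are illustrative assumptions, not code from the linked document.

```javascript
// Minimal Google Apps Script sketch of a =GPT3() custom function.
// Assumes the OpenAI completions endpoint and an API key stored in
// Script Properties under "OPENAI_KEY" (both names are illustrative).

// Pure helper: build the request payload for a given prompt.
function buildCompletionRequest(prompt) {
  return {
    model: "text-davinci-002", // swap for a cheaper model if it suffices
    prompt: prompt,
    max_tokens: 64,
    temperature: 0,            // low temperature: predictable output for cells
  };
}

// Custom function callable from a cell as =GPT3(A1).
// Only runs inside Apps Script, where UrlFetchApp is available.
function GPT3(prompt) {
  var key = PropertiesService.getScriptProperties().getProperty("OPENAI_KEY");
  var response = UrlFetchApp.fetch("https://api.openai.com/v1/completions", {
    method: "post",
    contentType: "application/json",
    headers: { Authorization: "Bearer " + key },
    payload: JSON.stringify(buildCompletionRequest(prompt)),
  });
  return JSON.parse(response.getContentText()).choices[0].text.trim();
}
```

Pasted into Extensions > Apps Script, this makes =GPT3(A1) callable from any cell; note that Sheets may recalculate the cell (and re-bill you) on later edits, so copying the results back in "as values" is prudent.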
All comments in here are great though. | We have been live with users since the summer - no waitlist. And | as a note - despite the name, "prompt engineering" has almost | nothing to do with making this work at scale. | | [1] | https://docs.google.com/spreadsheets/d/1lSpiz2dIswCXGIQfE69d... | [2] https://www.promptloop.com/ | elil17 wrote: | Any plans to bring this to Excel? I would love to recommend | this to folks at my company but we aren't allowed to use | G-Suite. | pbmango wrote: | Yes - not open access yet, but drop me an email! | tonmoy wrote: | The tasks on the first sheet are easily accomplished by flash fill | in MS Excel, and I suspect it's less prone to error. Not sure why flash | fill is not more popular | baxtr wrote: | What is flash fill? I have worked a lot on Excel sheets and | still haven't heard anything about it. | localhost wrote: | Flash fill is an implementation of program synthesis, a | technique invented by Sumit Gulwani formerly of Microsoft | Research. Here's a paper that explains more about how it | works [1]. It's not a very discoverable feature of Excel, | though [2] | | [1] https://arxiv.org/pdf/1703.03539.pdf | [2] https://support.microsoft.com/en-us/office/using-flash-fill-... | [deleted] | brassattax wrote: | ctrl+e | Havoc wrote: | Of all the places, a spreadsheet is probably the one place you don't | want AI-generated content. Half the time it's financial info, so sorta | correct simply isn't good enough | cdrini wrote: | Spreadsheets are used for _waaaay_ more than just finances. I | don't think it's anywhere near 50% finances. I can't recall | where, but I saw a study from I think the 90s saying most of | the spreadsheets they found were being used as todo lists. | | Maybe like 1 in my past 2y of many, many spreadsheets has been | finance related. I think you might be overgeneralizing to an | ungeneralizably large group -- the set of all human | spreadsheets.
| unnah wrote: | Most real-world spreadsheets contained significant errors in | this 2005 review: | https://www.researchgate.net/publication/228662532_What_We_K... | | Is there any reason to think the situation has substantially | improved since then? | mrguyorama wrote: | This is not a good excuse to _actively and knowingly make it | worse_ | anon25783 wrote: | I was going to say: This can only end _well_ for the economy... | /s | dylan604 wrote: | Let's see if we can tell what data was used to train the | model by watching where the money starts to be moved around | into offshore accounts and whatnot. Was the model trained on | the data dumps from those "offshore" banks that leaked | recently-ish? | contravariant wrote: | When I see something is in a spreadsheet I immediately assume | there are at least 3 things wrong with the data, 1 of which is | obvious. | Imnimo wrote: | The ability of language models to do zero-shot tasks like this is | cool and all, but there is no way you should actually be doing | something like this on data you care about. Like, think about how | much compute is going into trying to autofill a handful of zip | codes, and you're still getting a bunch of them wrong. | dylan604 wrote: | I've used/added the USPS API in a system and it took | practically no time at all to do it. I'm guessing that is | significantly less time than building an AI tool. What's worse | is that the thing that takes the least time to implement | actually provides good data. | Petersipoi wrote: | Obviously adding the USPS API to your tool would take less | time than building an AI tool. But the AI tool is infinitely | more powerful for almost anything other than dealing with | addresses. | | So the question isn't which one you can add to your tool | faster. The question is, if I already have this AI tool | set up, is it worth setting up the USPS API to go from 95% | accuracy to 99.9% accuracy. For countless applications, it | wouldn't be.
Obviously if you need to scale and need to | ensure accuracy, it's a different story. | dylan604 wrote: | > For countless applications, it wouldn't be. | | If something like the zip code isn't accurate, then | what's the point of having the zip code in your data? | People writing code to store zips/addresses are doing it to | be able to send/verify/etc. They are doing it so that | they can, but if the data is wrong then they can't. | | What countless applications ask for a zip code/mailing | address and don't _need_ it to be accurate? I would then | say that any you name would actually not need the data | in the first place. If you hoover it up just to sell it | later, wouldn't it be worth more if it were valid? So again, | I'm right back to: why do you need it? | mritchie712 wrote: | yeah, determining the zip for an address is a really bad | example. | | A better one would be "based on these three columns, generate a | cold outbound email for the person..." | | it would suck to be on the receiving end of those, but the use | case makes much more sense. | layer8 wrote: | Yeah, I think energy efficiency considerations will become | important at some point. Or at least they should. | ajmurmann wrote: | They only will if we price in the true cost of energy, | including negative externalities | pbmango wrote: | Microsoft and Google both have excellent formulas for dates - | and are getting there for addresses. Right now, the most | useful things you can accomplish in Sheets center around what | the underlying models are good at - general inference and | generation based on text. Anything needing exact outputs should | be a numeric formula or programmatic. | | Non-exact outputs are actually a feature and not a bug for | other use cases - but this takes a bit of use to really see. | ACV001 wrote: | This particular example is an inadequate application of AI. This | is static data which can be looked up in a table (at least the zip | code).
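ACV001's lookup-table point can be made concrete: the first three digits of a US ZIP code determine the state, so a static table answers this "AI" task exactly. Below is a sketch with a few illustrative prefix entries (a real table would have a few hundred); the table contents and function name are the only assumptions.

```javascript
// Sketch of the lookup-table alternative: ZIP-to-state is static data,
// so a plain table beats a language model here. The first three digits
// of a US ZIP determine the state; only a few illustrative entries shown.
const ZIP_PREFIX_TO_STATE = {
  "100": "NY", // Manhattan
  "606": "IL", // Chicago
  "902": "CA", // Beverly Hills, among others
};

function stateForZip(zip) {
  const match = /^\d{5}/.exec(String(zip).trim());
  if (!match) return null; // refuse to guess on malformed input
  return ZIP_PREFIX_TO_STATE[match[0].slice(0, 3)] || null;
}
```

Unlike the model in the demo, this returns null rather than a plausible-looking wrong answer when the input is malformed or the prefix is unknown.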
| a1371 wrote: | This is compared to inadequate application of humans. It is not | competing with people who know how to do regex and string | parsing. It is for the people who put an office assistant on | the task. It is better to inadequately apply AI here than to | inadequately apply a human who probably has more fun things | to do. | krossitalk wrote: | What about subtle formatting differences? (Country, Territory, | Postal code is the norm. Doesn't have to be.) What if we | applied this to handwritten addresses? (Adding an OCR | component.) | | I'm sure the USPS is already doing this and more, and if not, | there are probably some AI jobs lined up for it :) | lifthrasiir wrote: | Yes, USPS has Remote Encoding Centers (REC) to handle | handwritten addresses that can't be recognized with OCR. So AI | is already there, just that humans are for harder and tedious | jobs ;-) | appletrotter wrote: | > So AI is already there, just that humans are for harder | and tedious jobs ;-) | | Increasingly less so ;-) | breck wrote: | If you do a startup for this, please email me wire instructions. | chmod775 wrote: | This seems to be doing much worse than existing solutions: Google | Maps probably wouldn't have gotten quite as many wrong if you | just pasted those addresses into the search bar. However, it could | be interesting as a last resort if parsing the input failed every | other way. | | "I tried parsing your messy input. Here's what I came up with. | Please make sure it's correct, then proceed with the checkout." | mike256 wrote: | Do I understand that correctly? When I have to create a | spreadsheet like this, there are 2 options. Option 1: I write a | table mapping zip code to state and use this table to generate my column. | If I carefully check my table, my spreadsheet would be okay. | Option 2: I ask GPT3 to do my work, but then I have to check the whole | spreadsheet for errors. | giarc wrote: | I dealt with something similar.
I was creating a large | directory of childcare centres in Canada. I had thousands of | listings with a URL but no email address. I created a | Mechanical Turk job asking turkers to go to the website and find an | email address. Many came back with email addresses like | admin@<<actualURL>>.com. After checking a few, I realized that | the turkers were just guessing that admin@ would work and that I'd | approve their work. I ended up having to double-check all the | work. | xvector wrote: | I wonder if those workers can be reported and fired. | csunbird wrote: | I mean, depending on how the OP phrased the work to be | done, they probably did valid work. | [deleted] | Quarrelsome wrote: | "lack of dark mode" should be "features", not "usability"? | camtarn wrote: | Similarly, "Not sure what to do with the button" is clearly a | usability issue, not features. | | And for the second Kindle review, it summarized one point from | the actual review, then completely made up two additional | points! | | Really impressive Sheets extension, but you'd have to be so | careful what you applied this to. | adrianmonk wrote: | > _completely made up two additional points_ | | I wonder if this means the AI is dumb or that the AI is smart | enough to notice that humans just make shit up sometimes, | like when they're not reading carefully or when they need | filler. | jedberg wrote: | If anything it's the lack of a usability feature. Sounds like | both would be right. | [deleted] | renewiltord wrote: | This is terrific stuff, honestly. I could see an Airtable | integration being really quite useful. There were lots of times | when I would run some quick scraping, do some cleaning up via an | Upworker, and then join against something else. | | Here volume matters, and all misses are just lost data, which I'm | fine with. The general-purpose nature of the tool makes it | tremendous. There was a time when I would have easily paid $0.05 | / query for this.
The only problem with the spreadsheet setting | is that I don't want it to repeatedly execute and charge me, so | I'll be forced to use `=GPT3()` and then copy-paste "as values" | back into the same place, which is annoying. | A4ET8a8uTh0 wrote: | Was anyone able to test if the Airtable implementation works as | well as the twitter 'ad'? | scanr wrote: | That's amazing. It does rely on a level of comfort with a fuzzy | error budget. | vntok wrote: | About 20% of the generated postcodes are absurdly wrong. | gbro3n wrote: | The most sensible use for AI that I can see at this time is for | supporting humans in their work, but _only_ where the system is | set up so that the human has to do the work first, with the AI | system looking for possible errors. For example, the human drives | the car, and the AI brakes when it senses dangerous conditions | ahead; or the human screens test results for evidence of cancer, | and the AI flags where it disagrees so that the human might take | another look. The opposite scenario, with AI doing the work and | humans checking for errors as is the case here, will lead to | humans being over-reliant on less-than-perfect systems and | producing outcomes with high rates of error. As AI improves and | gains trust in a field, it can then replace the human. But this | trust has to come from evidence of AI superiority over the long | term, not from companies over-selling the reliability of their | AI. | armchairhacker wrote: | This won't work because humans are lazy and fundamentally wired | to expend the least amount of effort possible. Just the belief | that you have an AI that will correct your mistakes will make | people expend less effort (even subconsciously), until it | completely cancels out any additional error correction from the | AI. Plus, workers will hate the fact that an AI could | automatically do exactly what they are doing but they are doing | it manually for "error correction".
| | It only works the opposite way, where machines and AI handle | the trivial cases and humans handle the non-trivial ones. Many | people actually genuinely like to solve hard problems which | require thinking and skill; most people strongly dislike | mundane repetitive tasks. | andreilys wrote: | So humans should do the work of image classification, voice | transcription, text summarization, etc. before an AI gets | involved? | | Makes total sense to me. | uh_uh wrote: | Humans are also less-than-perfect systems. Especially if they | have to deal with monotonous tasks. A human might perform | better on 100 entries than an AI, but on 10 thousand? Of | course you can distribute the workload, but you will balloon | the costs (I'm talking about a future where GPT3 costs come | down). | | There must be a set of projects which are cost-prohibitive now | due to having to pay humans but will become feasible exactly | because of this tech. For a good portion of these, a higher-than-human | error rate will also be tolerable, or at least correctable | via a small degree of human intervention. | gbro3n wrote: | This is a good point. There is some work that just won't be | done unless it can be automated, and in that case work with a | higher rate of error is preferable to no work at all. | [deleted] | leereeves wrote: | I mostly agree. Happily, that's also what people will want as | long as human participation is necessary. We'd generally prefer | to write rather than correct an AI's writing, and prefer to | drive rather than carefully watch an AI drive. | | But when the AI is capable of something the person can't do | (like Stable Diffusion creating images, compared to me) the AI | should take first chair. | gbro3n wrote: | This is a good point. Where less-than-perfect AI is better | than the alternative, it is useful.
| greenie_beans wrote: | if you need that address parser, this is a bit more robust and | easier to use: | https://workspace.google.com/u/0/marketplace/app/parserator_... | armchairhacker wrote: | I said it before: we need Copilot flash fill. Infer what the user | wants the output to be from patterns and labels, so they can | enter a few examples and then "extend" to automatically do the | equivalent of a complex formula, e.g. Formal => Informal: | Lane, Thomas => Tommy Lane; Brooks, Sarah => Sarah Brooks; | Yun, Christopher => (blank); Doe, Kaitlyn => (blank); | Styles, Chris => (blank); ... | | Automating something like this is extremely hard with an | algorithm and extremely easy with ML. Even better, many people | who use spreadsheets aren't very familiar with coding and | software, so they do things manually even in cases where the | formula is simple. | dwringer wrote: | I posed this exact question to character.ai's "Ask Me Anything" | bot. It decided to redo the examples, too. The results: | | > Lane, Thomas => Thomas Layne | | > Brooks, Sarah => Sarah Brooksy | | > Yun, Christopher => Chris Yun | | > Doe, Kaitlyn => KD | | > Styles, Chris => Chris Spice, Chris Chasm | | I'm sure the bot overcomplicated an otherwise simple task, but | I think there's always gonna be some creative error if we rely | on things like that. It's funny though, because these results | are plausible for what a real person might come up with as | informal nicknames for their friends. | layer8 wrote: | From one of the replies: "This is awesome. I also love how 20% of | the zip codes are wrong. Messy AI future seems really fun and | chaotic." | moralestapia wrote: | >I also love how 20% of the zip codes are wrong. | | Only they aren't. Check the video again, they come out fine. | | Edit: Oh dang, you're all right, several of them have wrong | digits. :l | giarc wrote: | Row 15 includes the zip code 92105 in column A but the output | is 92101. Similar for Row 5. | hexomancer wrote: | What are you talking about?
Look at 27 seconds into the | video. Many of the zip codes are wrong. | [deleted] | forgotusername6 wrote: | You need an AI that can understand when not to answer, as | opposed to some best-effort guessing. Some of that input didn't | have numbers in the right format, so no zip code. | | The hilarious one is changing the zip code to 90210. The AI is | basically accusing you of a typo, because you obviously meant | that more famous zip code. | | General-purpose AIs in situations where more targeted, simpler | solutions are needed are going to be incredibly dangerous. Sure, | this AI can fly a plane 99.999% of the time, but every once in | a while it does a nose dive for reasons we cannot | possibly understand or debug. | ren_engineer wrote: | yeah, most of these GPT-3 demos that go viral are | cherry-picked at best | dylan604 wrote: | A human developer once told me that bad data is better than no | data. <facepalm> | | So of course a human developer made an AI that makes bad data. | a1369209993 wrote: | FWIW, the actual saying is that _for the purposes of | collection by enemies_ (like Facebook and Google, or the KGB and | NSA), the _only_ thing better than no data is bad data. | dylan604 wrote: | FWIW, I was told that well before Facebook was a thought in | the Zuck's head. Pre-socials, nobody shared data like is | done today, so the NSA had to actually work to get it. Kids | these days... /s | harrisonjackson wrote: | The author posted a follow-up using a more advanced (and | expensive) GPT3 model (davinci) which does a better job of | parsing out the zip codes. It generally does a better job at | everything, but if you can get away with one of the less | expensive models, then all the better. | gpderetta wrote: | I think that the function should be called DWIM instead. Amazing | feature otherwise; we really live in interesting times!
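forgotusername6's "know when not to answer" behavior can be approximated outside the model with a validation guard: accept a generated ZIP only if it is well-formed and literally present in the source cell, and otherwise emit nothing for a human (or the USPS API) to handle. The function below is a hypothetical sketch of that idea, not part of the demo:

```javascript
// Sketch of a "know when not to answer" guard for model output:
// accept the ZIP-code guess only if it is well-formed AND literally
// present in the source text. Otherwise return an empty string so the
// row is flagged for human review instead of silently corrupted.
function acceptZipGuess(sourceText, modelOutput) {
  const guess = String(modelOutput).trim();
  if (!/^\d{5}(-\d{4})?$/.test(guess)) return ""; // malformed: refuse
  if (!String(sourceText).includes(guess)) return ""; // invented: refuse
  return guess;
}
```

This would have blanked giarc's 92105-to-92101 row instead of passing it through; the cost is that correct inferences not literally present in the source text get rejected too.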
| chime wrote: | DWIM: Do What I Mean | | https://en.wikipedia.org/wiki/DWIM | cstross wrote: | Now wait for =deep_dream() or maybe =stable_diffusion() as a | graph-generating function! (Graphs plotted with this function | will of course zoom in infinitely, but the further you go the more | eyes and shiba dogs you'll notice in the corners ...) | bee_rider wrote: | Plots that devolve into madness if you zoom in too close? | Finally the programmers get to experience electronics class. | planetsprite wrote: | GPT3 charges for every token read/written. What may be more | useful is using GPT-3 not to run itself on every row, | but to take the task and generate a function that | fulfills it. | miohtama wrote: | Like with Stable Diffusion, maybe there will be an open model | for the language prompts with fewer or no restrictions in the | near future. | CrypticShift wrote: | Amen. | skrebbel wrote: | The amount of 90% sensible, 10% ridiculously wrong computer | generated crap we're about to send into real humans' brains makes | my head spin. There's truly an awful AI Winter ahead, and it | consists of spending a substantial amount of your best brain | cycles on figuring out whether a real person wrote that thing to | you (and it's worth figuring out what they meant in case of some | weird wording) or it was a computer-generated fucking thank you | note. | [deleted] | roody15 wrote: | Think the winter is here | dzink wrote: | You got it! After seeing a few tweet storms and articles that | turn out to be GPT3 gibberish, I end up coming to HN more for | my news, because usually someone flags the waste of time in the | comments. | | The software would save people 80% of the work, and most are | lazy enough to release it as is, instead of fixing the | remaining 20%.
That laziness will end up forcing legislation to | flag and eventually ban or deprioritize all GPT content, which | will result in a war of adversarial behaviors trying to hide | generated stuff among the real. Can't have nice things! | andreilys wrote: | How would you go about classifying something as GPT | generated? | | Let alone flagging/deprioritizing it via some draconian | legislation? | bondarchuk wrote: | By the fact that it was generated using GPT. Same way you | would go about classifying something as e.g. not made with | slave labour, or made with a production process that follows | environmental pollution rules. That you can't easily detect | it from the end product is not necessarily an obstruction | to legislation. | mhh__ wrote: | Probably closer to a Butlerian Jihad than an AI winter per | se, assuming something dramatic does happen | PontifexMinimus wrote: | > The amount of 90% sensible, 10% ridiculously wrong computer | generated crap we're about to send | | Agreed. Sooner or later a company is going to do this with its | customers, in ways that are fine 95% of the time but cause | outrage or even harm on outliers. | | And if that company is anything like Google, it'll be almost | impossible for the customers to speak to a human to rectify | things. | szundi wrote: | And the funniest part is that actual people may be worse, but | still, it freaks me out to be moderated by an AI. | | Also, once this is normal and ubiquitous, along come people who | game it, and the AI will be just too dumb to recognise them; the | real humans all fired, game over, stuck with shitty systems, and | everyone goes crazy. | jedberg wrote: | It depends on how people use the tools. For example, the thank | you note one -- if someone just prints off the output of this | and sends it, yeah, that's bad. | | But if someone uses this to do 90% of the work and then just | edits it to make it personal and sound like themselves, then | it's just a great time saving tool.
| | I mean, in this exact example, 70 years ago you'd have to | address each thank you card by hand from scratch. 10 years ago | you could use a spreadsheet just like this to automatically | print off mailing labels from your address list. It didn't make | things worse, just different. | | This is just the next step in automation. | agf wrote: | > But if someone uses this to do 90% of the work and then | just edits it to make it personal and sound like themselves, | then it's just a great time saving tool. | | This is still way too optimistic. Reading through something | that's "almost right", seeing the errors when you already | basically know what it says / what it's meant to say, and | fixing them, is hard. People won't do it well, and so even in | this scenario we often end up with something much worse than | if it was just written directly. | | There is a lot of evidence for this, from the generally low | quality of lightly-edited speech-to-text material, to how | hard it is to look at a bunch of code and find all of the | bugs without any extra computer-generated information, to how | hard editing text for readability can be without serious | restructuring. | jonas21 wrote: | Gmail's autocomplete already works great for this, and it | will only get better over time. The key is to have a human | in the loop to decide whether to accept/edit on a phrase-by-phrase | or sentence-by-sentence basis. | mwigdahl wrote: | Just train another AI model to do it then! I'm not joking | -- Stable Diffusion generates some pretty grotesque and low- | quality faces, but there are add-on models that can | identify and greatly improve the faces as part of the | processing pipeline. | | Doesn't seem like a stretch to have similar mini-models | improve known deficiencies in larger general models in the | textual space. | rdiddly wrote: | I would classify that act of editing as "completing the | remaining 10% of the work."
Somebody has to do it, whether | you're doing it from the writing side, as in your example, or | making the reader do it from their side, as in my grandparent | comment's example. But it's usually the last 10% of anything | that's the hardest, so if someone abdicates that to a machine | and signs their name to it (claiming they said it, and taking | responsibility for it) they're kind of an asshole, in both | the schlemiel and the schlemozel senses of the word. | | I could extrapolate in my extremely judgmental way that the | person who does that probably has a grandiose sense of how | valuable their own time is, first of all, and secondly an | impractical and sheepishly obedient devotion to big weddings | with guest lists longer than the list of people they actually | give a shit about. Increase efficiency in your life further | upstream, by inviting fewer people! (Yeah right, might as | well tell them to save money by shopping less and taking | fewer trips. _Like that would ever work!_) | | But I digress, and anyway don't take any of that too | seriously, as 20 years ago I was saying the same kinds of | things about mobile phones... like "Who do you think you are, | a _surgeon_, with that phone?" Notice it's inherently a | scarcity-based viewpoint, based on the previous _however-many_ | years when mobile phones really were the province only | of doctors and the like. Now they're everywhere... So, | bottom line, I think the thank-you notes are a lousy use of | the tech, but just like the trivial discretionary | conversations I hear people having on their mobile phones now | that they're ubiquitous, this WILL be used for thank-you | notes! | Domenic_S wrote: | Mail merge has existed since the 1980s. | https://en.wikipedia.org/wiki/Mail_merge | brookst wrote: | Maybe? Is it really going to be all that different from the | past thousand years, where we've had 90% sensible, 10% | ridiculously wrong[0] human-generated crap?
| | [0] https://ncse.ngo/americans-scientific-knowledge-and-beliefs-... | whiddershins wrote: | What happens if you try to ask GPT-3 whether something was | written by GPT-3? | zikduruqe wrote: | Just listen here - https://infiniteconversation.com/ | johnfn wrote: | Idea: Use GPT-3 to identify GPT-3-generated snippets. | Eavolution wrote: | Is that not sort of what a GAN is? | roflyear wrote: | Then we need to train a model that gives us an idea of how | accurate each prediction is | ASalazarMX wrote: | This, but unironically. It could be used to further improve | the snippets that weren't identified. | pfarrell wrote: | > Use GPT-3 to identify GPT-3-generated snippets. | | I just lost the game. | lucasmullens wrote: | In some cases it would be impossible, since sometimes it can | output exactly what was written by a human, or something that | sounds 100% like what someone would write. | | But if you allow some false negatives, such as trying to | detect if a bot is a bot, I _think_ that could work? But I | feel like the technology to write fake text is inevitably | going to outpace the ability to detect it. | thatguymike wrote: | 51s -- "We are truly grateful", says the heartfelt thank-you card | that was written by an algorithm in a spreadsheet. | semi-extrinsic wrote: | Oh, this is nothing new. | | I remember in like 2007 or something, in the early days of | Facebook, someone made a CLI interface to the FB API. And I | wrote a random-timed daily cron job that ran a Bash script that | checked "which of my FB friends have their birthday today", | went through that list, selected a random greeting from like 15 | different ones I'd put into an array, and posted this to the | wall of person $i. Complete with a "blacklist" with names of | close friends and family, where the script instead sent me an | email reminder to write a manual, genuine post. | | I used to have a golfed version of that script as my Slashdot | signature.
| [deleted] | bee_rider wrote: | If people want to put this sort of language in a thank-you | note, I guess... I dunno, it always comes off as inauthentic to | me, so I don't really care if I got mass produced or artisanal | hand-crafted lies. | CobrastanJorji wrote: | The problem's worse than you know. I've heard that sometimes | real humans who tell a lot of people "thank you" aren't even | that thankful. ___________________________________________________________________ (page generated 2022-11-02 23:00 UTC)