[HN Gopher] The Sell ∀ ∃ as ∃ ∀ Scam
       ___________________________________________________________________
        
       The Sell ∀ ∃ as ∃ ∀ Scam
        
       Author : jmount
       Score  : 113 points
       Date   : 2023-04-22 21:02 UTC (1 hour ago)
        
 (HTM) web link (win-vector.com)
 (TXT) w3m dump (win-vector.com)
        
       | TZubiri wrote:
       | Are compilers a scam as well?
       | 
        | There exists a program for every problem you have; you just
        | have to find the code.
        
         | tialaramex wrote:
          | Crucially there is _not_ a program for every problem. Many
          | (presumably "almost all" in the mathematical sense) problems
          | are undecidable, and so no program can solve them.
        
           | alextheparrot wrote:
           | Why are we not OK with the program producing the
           | undecidability result?
        
       | bflesch wrote:
       | This is a clever argument
        
         | simondotau wrote:
         | It is a superficially clever argument. It's not actually a
         | clever argument because it elides the existence of "but easier"
         | or "but faster" as mechanisms for valid business models.
        
         | smitty1e wrote:
         | It would be improved by a bit of additional editing in The
         | Famous Article.
        
       | rgbrenner wrote:
       | Wait, how does this scam work for OpenAI? The product is free to
       | use.
       | 
       | Also they haven't claimed GPT is an AGI or that it can solve all
       | your problems.
        
       | 1023bytes wrote:
       | So as long as it doesn't automagically solve any given problem
       | it's a scam?
        
       | LastTrain wrote:
        | And convince the user to validate the solutions to other
        | users' problems while they're at it.
        
       | adamnemecek wrote:
       | This argument is disingenuous. Hyperparameter optimization is not
       | in the same category as prompt engineering, like at all.
       | 
        | Also, no one claims like half of the things the article says
        | people claim.
        
         | version_five wrote:
          | He, imo correctly, puts them both in the category of extra
          | degrees of freedom that allow the user to overfit and get
          | results that appear more impressive than the underlying
          | reality of how well the model generalizes.
        
           | adamnemecek wrote:
           | All systems are like that.
        
             | quickthrower2 wrote:
             | That is quite an insight. A "programmer" is really an
             | "overfitter".
        
               | version_five wrote:
               | That's not an insight, it's a misunderstanding.
               | Overfitting is only applicable relative to claims of
               | statistical performance.
               | 
               | And in any event, there are lots of systems with fewer
               | degrees of freedom (or in the case of deep learning, more
               | generalization potential) than the training data, that
               | are not at particular risk of being overfit, and there
               | are measures and tests to mitigate the risk of
               | overfitting. It's not some inherent characteristic of
               | "systems".
        
           | jameshart wrote:
           | But generating new prompts for GPT is incredibly cheap! You
           | just rephrase and ask again.
           | 
           | That there are prompts which generate impressive results with
           | GPT is the _point_. Because anyone can generate prompts - and
           | get impressive results.
           | 
           | Whereas hyper parameter tuning is _expensive_. A system that
           | generates good results with the right tuning doesn't tell you
           | much about your ability to use it to generate good results,
           | because it will be hard to try lots of different tuning
           | approaches to discover if you can get a useful result.
           | 
           | These things seem not remotely comparable.
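            |
            | As a minimal sketch (assuming the openai Python package in
            | its early-2023 form; the prompts and the looks_good check
            | are purely illustrative):
            |
            |     import openai  # 0.27-era client
            |
            |     openai.api_key = "sk-..."  # your key here
            |
            |     rephrasings = [
            |         "Write a regex that matches ISO 8601 dates.",
            |         "Give me a regex for dates like 2023-04-22.",
            |         "Regular expression for YYYY-MM-DD dates?",
            |     ]
            |
            |     def looks_good(answer):
            |         # stand-in check; in practice a human judges this
            |         return "\\d{4}" in answer or "[0-9]{4}" in answer
            |
            |     for prompt in rephrasings:
            |         resp = openai.ChatCompletion.create(
            |             model="gpt-3.5-turbo",
            |             messages=[{"role": "user", "content": prompt}],
            |         )
            |         answer = resp["choices"][0]["message"]["content"]
            |         if looks_good(answer):
            |             print(prompt, "->", answer)
            |             break
            |
            | Each retry is nearly free, which is the "cheap" part; the
            | hidden cost is that someone still has to judge whether the
            | answer is actually good.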
        
             | akkartik wrote:
              | Think about when you use GPT to generate prompts, something
              | that seems to be growing more common. Once you have a
              | pipeline like that, changing the (meta)prompt can be
              | expensive. You designed the pipeline thinking you'd only
              | have to come up with the meta-prompt once and it would
              | work from then on.
             | But you find you have to keep tuning your
             | Kubernetes/upgrading your dependencies/tweaking your meta
             | prompt.
             | 
             | "Scam" is overblown, but I think OP is right to warn of a
             | possible future issue. It's an issue endemic to all
             | software, so not something worth calling out recent AI
             | advances in particular for. But it seems something we
             | should all be trying to get better about. Monitor the
             | ongoing costs of the systems we create. Do they really pay
             | for themselves or are we waiting for a Godot that never
             | arrives?
        
           | ChainOfFools wrote:
           | Prompt engineering rather resembles the mentalists' trick of
           | cold reading their subject, turned inside out.
        
       | sebzim4500 wrote:
       | >To conclude: one must have different standards for developing
       | systems than for testing, deploying, or using systems. Or:
       | testing on your training data is a common way to cheat, but so is
       | training on your test data.
       | 
       | Isn't this already a solved problem? Every reasonable paper on ML
       | separates their test data from their validation data already.
        
         | version_five wrote:
          | That in no way prevents overfitting through hyperparameter
          | optimization / graduate student descent. All the common
          | benchmarks, by definition of being common benchmarks, are
          | susceptible to overfitting.
        
           | sebzim4500 wrote:
           | The idea is that you do all your hyperparameter optimization
           | with the test data and then only run through the validation
           | data once before you submit your paper.
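            |
            | Roughly, in scikit-learn terms (a sketch on a made-up
            | dataset; naming conventions for the splits vary; the point
            | is that the final held-out split is touched exactly once):
            |
            |     from sklearn.datasets import make_classification
            |     from sklearn.ensemble import RandomForestClassifier
            |     from sklearn.model_selection import train_test_split
            |
            |     X, y = make_classification(n_samples=2000,
            |                                random_state=0)
            |     # one split to fit, one to tune on, one held out
            |     X_fit, X_rest, y_fit, y_rest = train_test_split(
            |         X, y, test_size=0.4, random_state=0)
            |     X_tune, X_hold, y_tune, y_hold = train_test_split(
            |         X_rest, y_rest, test_size=0.5, random_state=0)
            |
            |     best_score, best_depth = -1.0, None
            |     for depth in (2, 4, 8, 16):   # tune freely here
            |         m = RandomForestClassifier(max_depth=depth,
            |                                    random_state=0)
            |         m.fit(X_fit, y_fit)
            |         s = m.score(X_tune, y_tune)
            |         if s > best_score:
            |             best_score, best_depth = s, depth
            |
            |     final = RandomForestClassifier(max_depth=best_depth,
            |                                    random_state=0)
            |     final.fit(X_fit, y_fit)
            |     # the held-out split is evaluated once, at the end
            |     print(final.score(X_hold, y_hold))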
        
       | version_five wrote:
       | It's a good point, I'd consider it (overfitting) a pitfall or
       | common mistake in ML rather than the only mode. I'd agree that
       | most ML models and almost all state-of-the-art are over-fit to
       | the point of being useless, but that's not an inevitability.
        
       | simondotau wrote:
        | > _Convince the user that it is their job to find an
        | instantiation or setting of this control to make the system
        | work for their tasks._
       | 
       | As opposed to convincing the user that it is their job to brief a
       | suitably qualified contractor or employee to make a company
       | perform the required work?
       | 
        | ∴ humans are a scam
        
       | precompute wrote:
       | It is difficult to conclude this is true, and it will be even
       | more difficult in the future because tech like "AI" has the
       | ability to almost completely saturate the amount of data any
       | human can ingest. The relationship will soon be symbiotic, with
       | everything showing the extent of our progress... in the same
       | manner we can date movies by the kinds of phones they use. Many
       | plots will become stale, many worldviews will be condensed to
       | something that supports "AI" and the offending branches will be
       | snipped. The only way to really forget this limitation is to be
       | myopic enough to disregard everything else. With the way the
       | internet is going, I'm sure one day these LLMs will be heralded
       | as "free" and "open" media, their fuzzy recollections the only
       | records we will have of the past, and extensive use will
       | essentially morph civilization in their own image.
        
       | bitL wrote:
       | This has always been present in subfields of AI. For example, in
       | classical computer vision, one had to figure out specific
       | parameters working just for a single image or video scene by
       | hand. Machine learning can in theory at least make these
       | parameters learnable at the cost of complexity.
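        |
        | Canny edge detection is the classic example: the two thresholds
        | below are tuned by eye for one image and usually need re-tuning
        | for a different scene (a sketch assuming OpenCV and a
        | hypothetical input file):
        |
        |     import cv2
        |
        |     img = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE)
        |     # 50 and 150 were picked by hand for this one image;
        |     # another scene generally wants different thresholds
        |     edges = cv2.Canny(img, 50, 150)
        |     cv2.imwrite("edges.png", edges)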
        
       | dvt wrote:
       | I can walk and chew bubble gum at the same time: on one hand,
        | yes, there's certainly a lot of Kool-Aid being drunk by the AI
        | folks. Even on HN, I constantly argue with people who genuinely
        | think LLMs are some kind of magical black box that contains
        | "knowledge" or "intelligence" or "meaning" when in reality, it's
       | just a very fancy Markov chain. And on the other hand, I think
       | that language interfaces are probably the next big leap in how we
       | interact with our computers, but more to the point of the
       | article:
       | 
       | > To conclude: one must have different standards for developing
       | systems than for testing, deploying, or using systems.
       | 
       | In my opinion, you unfortunately will never (and, in fact _could_
       | never) have reliable development and testing standards when
       | designing purely stochastic systems like, e.g., large language
       | models. Intuitively, the fact that these are stochastic systems
       | is _why_ we need things like hyper-parameters, fiddling with
       | seeds, and prompt engineering.
        
         | sebzim4500 wrote:
         | >it's just a very fancy Markov chain
         | 
         | Could you provide an argument for why an LLM is a fancy markov
         | chain that does not apply equally well to a human?
        
           | dvt wrote:
           | Well, for one, humans are obviously at least _more_ than a
           | fancy Markov chain because we have genetically hard-wired
           | instincts, so we are, in some sense,  "hard-coded" if you
           | forgive my programming metaphor. Hard-coded to breed,
           | multiply, care for our young, seek shelter, among many other
           | things.
        
         | charcircuit wrote:
         | >contain "knowledge" or "intelligence" or "meaning" when in
         | reality, it's just a very fancy Markov chain
         | 
         | These are not mutually exclusive. If you have a Markov chain
         | that 100% of the time outputs "A cat is an animal", then it has
         | knowledge that a cat is an animal.
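          |
          | For concreteness, a maximally degenerate Markov chain that
          | always emits that sentence (a toy sketch; whether emitting it
          | counts as knowledge is exactly the disagreement here):
          |
          |     import random
          |
          |     # each token maps to its possible successors
          |     chain = {"<s>": ["A"], "A": ["cat"], "cat": ["is"],
          |              "is": ["an"], "an": ["animal"],
          |              "animal": ["</s>"]}
          |
          |     tok, out = "<s>", []
          |     while True:
          |         tok = random.choice(chain[tok])
          |         if tok == "</s>":
          |             break
          |         out.append(tok)
          |     print(" ".join(out))  # always "A cat is an animal"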
        
           | version_five wrote:
           | $ yes "a cat is an animal"
           | 
           | Does `yes` also have "knowledge"?
        
             | zacgarby wrote:
             | no, but `$ yes "a cat is an animal"` does (if you know how
             | to interpret it)
        
           | tifik wrote:
            | I think you are operating with a different definition of
            | "knowledge" than the parent does.
        
             | stametseater wrote:
             | Knowledge is awareness of information. "Awareness" is a
             | quagmire because a lot of people believe that 'true'
             | awareness requires possessing a sort of soul which machines
             | can't possess.
             | 
             | I think the important part is information, the matter of
             | 'awareness' can simply be ignored as a
             | philosophical/religious disagreement which will never be
             | resolved. What's important is: Does the system contain
             | information? Can it reliably convey that information? In
             | which ways can it manipulate that information?
        
             | archgoon wrote:
             | [dead]
        
             | WastingMyTime89 wrote:
              | The heart of the issue is always the same: is a perfect
              | simulation actually the same thing as what it simulates? I
              | would argue yes, especially when the definition of
              | _knowledge_ or _intelligence_ is already so fuzzy, but some
              | people will probably always disagree.
        
           | dsr_ wrote:
           | It does not. It has a language output that you find pleasing.
        
         | cjbprime wrote:
         | The Microsoft Research "Sparks of AGI" paper spends 154 pages
         | describing behaviors of GPT-4 that are inconsistent with the
         | understanding of it being a "fancy Markov chain":
         | https://arxiv.org/abs/2303.12712
         | 
         | I expect that the reason people are constantly arguing with you
         | is that your analysis does not explain some easily testable
         | experiences, such as why GPT-4 has the ability to explain what
         | some non-trivial and unique Python programs would output if
         | they were run, despite GPT-4 not having access to a Python
         | interpreter itself.
        
           | dvt wrote:
           | > trivial and unique Python programs would output if they
           | were run, despite GPT-4 not having access to a Python
           | interpreter itself
           | 
           | Trivially explained as "even a broken clock is right twice a
           | day." I skimmed the paper, as it was linked here on HN iirc.
           | First, it was published by Microsoft, a company that
           | absolutely has a horse in this race (what were they supposed
           | to say? "The AI bot our search engine uses is dumb?"). Second
           | of all, I was very interested in their methodology, so I
           | fully read the first section, which is woefully hand-wavy, a
           | fact with which even the authors would agree:
           | 
           | > We acknowledge that this approach is somewhat subjective
           | and informal, and that it may not satisfy the rigorous
           | standards of scientific evaluation.
           | 
           | The paper, for instance, is amazed that GPT knows how to draw
           | a unicorn in TikZ, but we already know it was trained on the
           | Pile, which includes all Stack Exchange websites, which
           | happens to include answers like this one[1]. So to make the
           | argument that it's being creative, when the answer (or, more
           | charitably, something _extremely_ close to it) is literally
           | in the training set, is just disingenuous.
           | 
           | [1] https://tex.stackexchange.com/questions/312199/i-need-a-
           | tex-...
        
       | badloginagain wrote:
        | If I understand correctly, the meat of the argument is "that
        | is, a system where for every (∀) task, there exists (∃) a
        | setting that gives the correct answer for that one task."
       | 
       | My understanding of this (correct me if I'm wrong) is that the
       | scam is convincing users that GPT-X can do anything with say, the
       | correct prompts.
       | 
       | This argument misses the mark for me. It's not that it solves all
        | the problems, it's that the problems it does solve are
       | economically impactful. Significantly economically impactful in
       | some cases- obvious examples of call centers and first-line
       | customer support.
        
         | debaserab2 wrote:
         | > Significantly economically impactful in some cases- obvious
         | examples of call centers and first-line customer support.
         | 
         | Is it that obvious?
         | 
         | Yesterday I had a trivial but uncommon issue with my pharmacy.
         | I reached out to them online - their chatbot was the only
         | channel available. I tried, over the course of 20 minutes and 3
         | restarted sessions, to communicate an issue that a human would
         | have been able to respond to in 30 seconds. Eventually I just
         | gave up and got the prescription filled elsewhere.
         | 
         | No doubt this pharmacy saved money by cutting support staff. I
         | just think it's easy to see these solutions and cost savings
         | without bothering to look at how much of a frustrating
         | experience it can be for a customer.
        
           | [deleted]
        
           | tanseydavid wrote:
            | Do you have any reason to believe that the chatbot was
            | GPT-3.5 or GPT-4 based?
        
             | hartator wrote:
             | How are you gonna answer things like pricing? Issue with
             | pharmacies is the super complicated and super secretive
             | pricing structure. A good UI can solve this if they want to
             | drop the secrecy.
        
             | Waterluvian wrote:
             | I'd be surprised if something like a pharmacy managed to
             | adopt a tech that quickly. In my experience non-tech
             | industries often take quite a while to adopt.
        
           | bombcar wrote:
           | It is very easy to measure costs associated with a customer.
           | 
           | It is nearly impossible to measure the customers lost.
           | 
           | And you may never return, which they'll never know.
        
         | mitchellh wrote:
         | Since my blog post is linked, I wanted to clarify something.
         | While this appears to be the broad message, I don't think the
         | author intended to imply this about my post specifically, but I
          | still feel the need to point out the following in my prompt
         | eng blog post (linked by the OP)[1]:
         | 
         | > To start, you must have a problem you're trying to build a
         | solution for. The problem can be used to assess whether
         | prompting is the best solution or if alternative approaches
         | exist that may be better suited as a solution. Engineering
         | starts with not using a method for the method's sake, but
         | driven by the belief that it is the right method.
         | 
         | [1]: https://mitchellh.com/writing/prompt-engineering-vs-blind-
         | pr...
        
         | cglong wrote:
         | I ended up, ironically, asking ChatGPT for a summary of the
         | article. That's the argument it derived too.
        
         | jameshart wrote:
         | Right
         | 
          | #1 it's not clear that this ∃ ∀ construction is a fair
         | representation of what is being 'sold' by GPT-x
         | 
         | #2 it's also not clear what this proposed _inverted_
          | formulation (∀ ∃) that describes what the author thinks GPT
         | actually _is_ even means. For every setting there exists a task
         | that it answers? Does that even make sense?
        
           | justeleblanc wrote:
           | Pretty sure you should read "for every task there exists a
           | setting".
        
             | jameshart wrote:
             | But what's the inverse?
        
       | ummonk wrote:
        | The author is saying ! ∃ ∀ and selling it as ! ∀ ∃. He's
       | the one scamming.
       | 
       | AI researchers are rather upfront about the need to use fine-
       | tuning and prompt engineering for current AI models.
       | 
       | As to random forest models, wasn't the whole point of
       | hyperparameter sweeps to remove the need for manual
       | hyperparameter selection?
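        |
        | For reference, such a sweep is only a few lines in scikit-learn
        | (a sketch on a made-up dataset; the grid values are arbitrary):
        |
        |     from sklearn.datasets import make_classification
        |     from sklearn.ensemble import RandomForestClassifier
        |     from sklearn.model_selection import GridSearchCV
        |
        |     X, y = make_classification(n_samples=1000, random_state=0)
        |     grid = {"n_estimators": [100, 300],
        |             "max_depth": [4, 8, None]}
        |     # cross-validation picks the settings automatically
        |     search = GridSearchCV(
        |         RandomForestClassifier(random_state=0), grid, cv=5)
        |     search.fit(X, y)
        |     print(search.best_params_, search.best_score_)
        |
        | Of course, the sweep itself is another place where extra
        | degrees of freedom can quietly leak in.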
        
       | scrame wrote:
        | > Also, as they [leave] the hyper-parameter selection fully to
        | the users, they become not falsifiable. If they didn't work, it
        | is because you didn't pick the right hyper-parameters or
        | training procedures.
       | 
       | See Also: Scrum certification.
        
       | [deleted]
        
       | _Microft wrote:
       | Since there are certainly people unfamiliar with the symbols,
       | here is the Wikipedia article on them:
       | 
       | https://en.wikipedia.org/wiki/Quantifier_(logic)
        
         | nirvdrum wrote:
         | The article provides abbreviated definitions that might be less
         | confusing to follow than the comprehensive Wikipedia page. But,
         | I agree that calling them out a bit more clearly in the article
         | would have been helpful.
        
         | LegitShady wrote:
         | I just want to know what they're called if someone were to say
         | them out loud. Upside down A and backwards E probably aren't
         | accurate.
        
           | _Microft wrote:
            | I understand this as a question of how these symbols are
            | pronounced (the names are in the article if you are
            | curious): "for all/every <...>" and "there is/exists (at
            | least one) <...>".
           | 
            | Example: ∀x∈R\{0} ∃y∈R : x*y=2
           | 
           | "For all values x from the real numbers excluding zero, there
           | is a value y from the real numbers so that x*y equals 2."
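            |
            | Applied to the article's title (on one reading of it), the
            | two statements being contrasted are:
            |
            |     ∀ task ∃ setting : system(setting) solves task
            |       "every task has some setting that works for it"
            |
            |     ∃ setting ∀ task : system(setting) solves task
            |       "one setting works for every task"
            |
            | The first can be true in a nearly vacuous way, since the
            | setting is allowed to depend on (and even encode) the
            | answer; the second is the claim buyers tend to hear.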
        
       | alanbernstein wrote:
       | My biggest takeaway from the article is my new favorite word:
       | cryptomorphic, meaning equivalent, but not obviously so.
        
         | hanoz wrote:
         | I'm no linguist but it seems to me that that word doesn't
         | really work for that definition. It sounds like it should
         | pertain to hidden form, not hidden _similar_ form, like say,
         | _cryptoisomorphic_.
        
         | _royadar wrote:
         | Thanks for pointing that out, I'll add it to the list of words
         | I'm supposed to use to sound smart.
         | 
          | However, I did find the main argument compelling. Through my
          | own waste of time (yes, I learned it the hard way) I've come
          | to acknowledge that the tradeoff of prompting GPT in order to
          | get a valuable answer is just not worth it.
          |
          | However, it does seem that most people are intrigued by the
          | "it can answer anything with the right prompt" promise, and
          | they devote a lot of time trying to fulfill it.
          |
          | You cannot trust GPT, you cannot rely on it, and it can't
          | replace anyone until it learns to prompt itself.
          |
          | I think what this article really implies is that we humans
          | are doing the quality assurance for GPT's answers; if we take
          | that out of the equation and don't give it quality prompts,
          | it's worth nothing.
        
       | mmaunder wrote:
       | "Or: testing on your training data is a common way to cheat, but
       | so is training on your test data"
       | 
       | The need to hold back training data for testing, and issues
       | around testing using variants of training data versus real world
       | data are well known.
        
       | primax wrote:
       | [flagged]
        
         | version_five wrote:
         | This isn't reddit, "butthurt" and similar low brow shit turns
         | this into the same kind of cesspool. Make an adult argument or
         | don't make one.
        
           | LastTrain wrote:
           | ...and it didn't even make sense in the context he was using
           | it anyway.
        
           | savanaly wrote:
           | He is making a substantive point, but you are rejecting him
           | out of hand due to terminology which you feel signals he's
           | outside a clique. Who is the one without an adult argument?
        
       | afro88 wrote:
        | The author says that the builders of these systems are also
        | the ones who run the scam, and that they deliberately build
        | them in a fashion that enables the scam. Seriously?
       | 
       | This is a gripe with sales and marketing. A tale as old as time.
        
       | coding123 wrote:
       | This is like a horrible way to describe something.
        
       | analog31 wrote:
       | >>> Build a system that solves problems, but with an important
       | user-facing control. ...
       | 
        | >>> Convince the user that it is their job to find an
        | instantiation or setting of this control to make the system work
       | for their tasks.
       | 
       | By golly, you just described playing the cello.
        
       | PaulHoule wrote:
        | ChatGPT itself is _in on the scam_. The way I think about it
        | is that ChatGPT is already superhuman at bullshitting, and many
        | people want to give it credit for being more capable than it
        | really is.
        | 
        | It is interesting to postulate whether it is the "most likely
        | word" heuristic that leads to this behavior (e.g. it never says
        | anything that startles people) or RLHF training systematically
        | teaching it to say what people want to hear.
        
         | PopePompus wrote:
         | GPT-4 is very useful to me right now, for small programming
         | projects. Though it can't write an entire program for a
         | nontrivial project, it is good at the things I hate doing, like
         | figuring out regular expressions and SQL commands. I smile
         | broadly at the fact that I may never have to write either of
         | those things again. And GPT-4 knows of the existence of
         | countless software libraries and modules that I've never heard
         | of. It doesn't always use them correctly, but just alerting me
         | to their existence is tremendously helpful. It can usually
         | answer questions about APIs correctly. I have no idea what
         | impact LLMs will have on the world as a whole, but they will
         | clearly revolutionize coding.
        
       ___________________________________________________________________
       (page generated 2023-04-22 23:00 UTC)