[HN Gopher] Peter Norvig critically reviews AlphaCode's code qua...
       ___________________________________________________________________
        
       Peter Norvig critically reviews AlphaCode's code quality
        
       Author : wrycoder
       Score  : 176 points
       Date   : 2022-12-16 20:38 UTC (2 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | bombcar wrote:
       | _The marvel is not that the bear dances well, but that the bear
       | dances at all._
       | 
       | The surprising thing is that it can make code that works -
       | however, given that code can be _tested_ in ways that  "art" and
       | "text" cannot (yet), perhaps it's not that strange.
        
       | dekhn wrote:
       | I read this as a very well written feature request to the
       | AlphaCode engineers (or anybody working on this problem).
       | 
        | I really like Peter's writing style. It's clear and understated,
        | while also making it quite plain that there are areas for
        | improvement within reach. For those who haven't read it, Peter
       | also wrote this gem: https://norvig.com/chomsky.html which is an
       | earlier comment about natural language processing, and
       | https://static.googleusercontent.com/media/research.google.c...
       | which is a play on Wigner's "Unreasonable Effectiveness of
       | Mathematics in the Natural Sciences".
        
       | MoSattler wrote:
       | I wasn't aware AI can already take plain English text and create
       | functioning software.
       | 
       | I guess it's time to look for another profession.
        
       | tareqak wrote:
       | When I saw the test suite that Peter Norvig created for the
        | program, I immediately thought to myself "what if there was an LLM
       | program that knew how to generate test cases for arbitrary
       | functions?"
       | 
        | I think a tool like that, even in an early, incomplete, and
        | imperfect form, could help out a lot of people. The first version
       | could take all available test cases as training data. The second
       | one could instead have a curated list of test cases that pass
       | some bar.
       | 
       | Update: I thought of a second idea also based on Peter Norvig's
       | observation: what about an LLM program that adds documentation /
       | comments to the code without changing the code itself? I know
       | that it is a lot easier for me to proofread writing that I have
       | not seen before, so it would help me. Maybe a version would
       | simply allow for selecting which blocks of code need commenting
       | based on lines selected in an IDE?
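        | 
        | Roughly what I have in mind for both ideas, as a sketch; the
        | ask_llm helper is a hypothetical stand-in for whatever
        | completion API is available, not a real library call:
        | 
        |     import inspect
        | 
        |     def ask_llm(prompt: str) -> str:
        |         """Hypothetical wrapper around a completion API."""
        |         raise NotImplementedError("plug in a model here")
        | 
        |     def suggest_tests(func) -> str:
        |         """Idea 1: draft tests for an arbitrary function."""
        |         src = inspect.getsource(func)
        |         return ask_llm(
        |             "Write pytest test cases, including edge cases, "
        |             "for this function:\n\n" + src)
        | 
        |     def suggest_comments(source: str) -> str:
        |         """Idea 2: add comments without changing the code."""
        |         return ask_llm(
        |             "Add explanatory comments and docstrings to this "
        |             "code, changing nothing else:\n\n" + source)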
        
         | Buttons840 wrote:
          | How about the other way around? I define a few test cases and
          | the AI writes code for a generalized solution. Not just code
          | that regurgitates the test cases, but code that generalizes
          | well to unseen cases. You'll notice this is simply the machine
          | learning problem restated.
         | 
         | The next step could be to have the AI write code that describes
         | its own reasoning, balancing length of code and precision.
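          | 
          | A toy sketch of that "programming by example" framing (nothing
          | like what a real code generator does internally, but it shows
          | why generalizing beyond the given cases is the hard part):
          | 
          |     def synthesize(examples, candidates):
          |         """Return the first candidate consistent with all
          |         (input, expected_output) example pairs."""
          |         for name, f in candidates:
          |             if all(f(x) == y for x, y in examples):
          |                 return name
          |         return None
          | 
          |     candidates = [("double", lambda x: 2 * x),
          |                   ("square", lambda x: x * x)]
          | 
          |     print(synthesize([(2, 4)], candidates))          # double
          |     print(synthesize([(2, 4), (3, 9)], candidates))  # square
          | 
          | With only the (2, 4) example both candidates fit, which is
          | exactly the generalization problem: more test cases, or some
          | prior, are needed to pin down the intended behavior.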
        
           | spawarotti wrote:
           | > I define a few test cases and the AI writes code for a
           | generalized solution
           | 
            | How about the AI never writing any code, and instead
            | training a "mini AI" / network that implements the test
            | cases, in a generalized way of course, the way our current
            | AI systems work. We could keep adding test cases for corner
            | cases until the "mini AI" is so good that we can no longer
            | come up with a test case that trips it up.
           | 
            | In such a future, the skill of being a comprehensive tester
            | would be everything, and the only code written by humans
            | would be the test cases.
        
         | kubb wrote:
         | That's potentially more helpful than writing the code itself.
         | Writing unit tests can take most of the development time.
        
           | mrguyorama wrote:
            | And literally throwing random, half-junk unit tests at your
            | code will test it better than writing the unit tests
            | yourself, because when you write both the code and the
            | tests, both have the same blind spots.
           | 
           | We should probably be developing systems that fuzz all code
           | by default.
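            | 
            | A minimal sketch of that idea, using the Hypothesis library
            | for property-based fuzzing (the function under test here is
            | just a toy stand-in, not AlphaCode's code):
            | 
            |     from hypothesis import given, strategies as st
            | 
            |     def apply_keys(keys: str) -> str:
            |         """Simulate typing, where '#' means backspace."""
            |         buf = []
            |         for k in keys:
            |             if k == '#':
            |                 if buf:
            |                     buf.pop()
            |             else:
            |                 buf.append(k)
            |         return ''.join(buf)
            | 
            |     @given(st.text(alphabet='ab#'))
            |     def test_apply_keys(keys):
            |         out = apply_keys(keys)
            |         assert '#' not in out  # no backspaces survive
            |         assert len(out) <= len(keys)  # can't grow
            |         assert apply_keys(out) == out  # idempotent
            | 
            | Run that under pytest and Hypothesis throws a pile of random
            | inputs at it, shrinking any failure down to a minimal
            | counterexample.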
        
       | happyopossum wrote:
       | > I find it problematic that AlphaCode dredges up relevant code
       | fragments from its training data, without fully understanding the
       | reasoning for the fragments.
       | 
       | As a non-programmer who has to 'code' occasionally, this is
       | literally what I do, but it takes me hours or days to hammer out
        | a few hundred lines of crap Python. Using a generative model or
        | LLM that can write equally crappy scripts in seconds feels like a
       | HUUUGE win for my use cases.
        
         | peteradio wrote:
          | A lazy, ineffective person is preferable to a prodigious
          | idiot.
        
       | lisper wrote:
       | Is it just me, or is that problem description completely
       | incoherent?
        
         | drexlspivey wrote:
          | Problem: You open a terminal and type the string 'ababa', but
          | you are free to replace any of the keypresses with backspace.
          | Is there a combination where the terminal reads `ba` at the
          | end?
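          | 
          | For general strings a and b, one common way to solve the
          | restated version is a two-pointer scan from the right. This
          | is a sketch of that standard approach, not Norvig's or
          | AlphaCode's code:
          | 
          |     def can_type(a: str, b: str) -> bool:
          |         """Can typing a, with some keypresses replaced by
          |         backspace, leave exactly b on the screen?"""
          |         i, j = len(a) - 1, len(b) - 1
          |         while i >= 0:
          |             if j >= 0 and a[i] == b[j]:
          |                 i -= 1            # keep this character
          |                 j -= 1
          |             else:
          |                 # a[i] must go: pressing backspace in its
          |                 # place also erases one earlier character
          |                 i -= 2
          |         return j < 0
          | 
          |     print(can_type('ababa', 'ba'))  # True
          |     print(can_type('ab', 'a'))      # False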
        
           | lisper wrote:
           | Thanks, your version makes a lot more sense.
        
           | hoten wrote:
           | If the AI could do this simplification just as you did, I'd
           | find that far more exciting!
        
         | krackers wrote:
          | It took me way too long to understand it as well, particularly
          | the fact that you press backspace _instead of_ a character,
          | rather than allowing backspace to be pressed at any time
          | (which I believe would turn it into checking whether B is a
          | subsequence of A).
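          | 
          | If I'm reading that looser variant right, it really would be
          | a plain subsequence check, e.g.:
          | 
          |     def is_subsequence(b: str, a: str) -> bool:
          |         """True iff b can be formed by deleting some
          |         characters of a, keeping the rest in order."""
          |         it = iter(a)
          |         return all(ch in it for ch in b)
          | 
          |     print(is_subsequence('ba', 'ababa'))  # True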
        
         | zug_zug wrote:
          | Good god, I'm not alone. For me it's fascinating that an AI
          | can make sense of that garble of words. I spent 4 minutes
          | trying to read it and gave up.
        
       | [deleted]
        
       | aidenn0 wrote:
       | The Minerva geometry answer looks like something one of my kids
       | would have written: guess the answer then write a bunch of mathy-
       | sounding gobbledygook as the "reasoning."
       | 
       | Also, that answer would have gotten 4/5 points at the local high-
       | school.
        
       | fergal_reid wrote:
       | Huge respect for Norvig, but I think this is a shallow analysis.
       | 
       | For example, I just took Norvig's 'backspacer alpha' function and
        | asked ChatGPT about it. It gave me an OK English-language
        | description, and it renamed the variables more descriptively on
        | command.
       | 
       | I'm sure it'll hallucinate and make errors, but I think we're all
       | still learning about how to get the capabilities we want out of
       | these models. I wouldn't rush to judgement about what they can
       | and can't do based on what they did; shallow analysis can mislead
       | both optimistically and pessimistically at the moment!
        
       | gok wrote:
       | _They are vulnerable to reproducing poor quality training data_
       | 
       |  _They are good locally, but can have trouble keeping the focus
       | all the way through a problem_
       | 
       |  _They can hallucinate incorrect statements_
       | 
       |  _does not generate documentation or tests that would build trust
       | in the code_
       | 
       | These observations are about human programmers, right?
        
       | chubot wrote:
       | Somewhat dumb question: I wonder what tool he used for the red
       | font code annotations and arrows? What tool would you use, like
       | Photoshop or something? And just screenshot the code from some
       | editor or I guess Jupyter?
        
         | circuit wrote:
         | Most likely Preview.app's built-in annotation tools
        
       | neilv wrote:
       | > They need to be trained to provide trust. The AlphaCode model
       | generates code, but does not generate documentation or tests that
       | would build trust in the code.
       | 
       | I don't understand how this would build trust.
       | 
       | If they generate test cases, you have to validate the test cases.
       | 
       | If they generate documentation, you have to validate the
       | documentation.
       | 
        | For a one-shot drop of code from an unknown party, test cases
        | and docs have been signals that the writer knows those are a
        | thing and at least put effort into writing them. So maybe we
        | assume it's more likely that they also used good practices in
        | the code.
       | 
       | But that's signalling to build trust, and adding those to build
        | trust without addressing the reasons we _shouldn't_ have trust
       | in the code (as this article points out) seems like it would be
       | building _misplaced_ trust.
       | 
       | (Though there is some benefit to doc for validation, due to the
       | idea behind the old saying "if your code and documentation
       | disagree, then both are probably wrong".)
        
       | RodgerTheGreat wrote:
       | I think the notes at the end bury the lede; in particular:
       | 
       | > "I save the most time by just observing that a problem is an
       | adaptation of a common problem. For a problem like 2016 day 10,
       | it's just topological sort." This suggests that the contest
       | problems have a bias towards retrieving an existing solution (and
       | adapting it) rather than synthesizing a new solution.
        
         | mrguyorama wrote:
         | The fact is, the vast majority of programming IS just dredging
         | up a best solution and modifying it to meet your specifics.
         | Some of the best and still most current algorithms are from
         | like the 60s.
         | 
          | That doesn't make neural networks "smart"; it says more about
          | our profession and how terrible we generally are at it.
        
       | Octokat wrote:
       | His repo is a gold mine
        
       | trynewideas wrote:
       | This is a great review but it still misses what seems like the
       | point to me: these models don't do any actual reasoning. They're
       | doing the same thing that DALL-E _etc._ does with images: using a
       | superhuman store of potential outcomes to mimic an outcome that
       | the person entering the prompt would then click a thumbs-up icon
       | on in a training model.
       | 
       | Asking why the model doesn't explain how the code it generated
       | works is like asking a child who just said their first curse word
       | what it means. The model and child alike don't know or care, they
       | just know how people react to it.
        
         | jujugoboom wrote:
         | Stochastic Parrot is the term you're looking for
         | https://dl.acm.org/doi/10.1145/3442188.3445922
        
         | fossuser wrote:
         | What is this then:
         | https://twitter.com/jbrukh/status/1603868836729610250?s=46&t...
         | 
          | That looks a lot like reasoning to me. At some point these
          | disputes over definitions don't matter. Some people will
          | endlessly debate whether other people are conscious or
          | "zombies", but it's not particularly useful.
         | 
         | This isn't yet AGI, but the progress we're seeing doesn't look
         | like failure to me. It looks like what I'd predict to see
         | before AGI exists.
        
         | dekhn wrote:
         | Norvig discusses this topic in detail in
         | https://norvig.com/chomsky.html As you can see, he has a
         | measured and empirical approach to the topic. If I had to
         | guess, I think he suspects that we will see an emergent
         | reasoning property once models obtain enough training data and
         | algorithmic complexity/functionality, and is happy to help
         | guide the current developers of ML in the directions he thinks
         | are promising.
         | 
         | (this is true for many people who work in ML towards the goal
         | of AGI: given what we've seen over the past few decades, but
         | especially in the past few years, it seems reasonable to
         | speculate that we will be able to make agents that demonstrate
          | what appears to be AGI, without actually knowing if they possess
         | qualia, or thought processes similar to those that humans
         | subjectively experience)
        
           | trynewideas wrote:
           | That's a great link and read, thanks for that.
           | 
           | While I do think models are, can be, and likely must be a
            | useful _component_ of a system capable of AGI, I don't seem
           | to share the optimism (of Norvig or a lot of the
           | GPT/AlphaCode/Diffusion audience) that models _alone_ have a
           | high-enough ceiling to approach or reach full AGI, even if
           | they fully conquer language.
           | 
           | It'll still fundamentally _only_ be modeling behavior, which
           | - to paraphrase that piece - misses the point about what
           | general intelligence is and how it works.
        
           | r_hoods_ghost wrote:
           | I suspect that a lot of AI researchers will end up holding
           | the exact opposite position to a lot of philosophers of mind
           | and treat AGIs as philosophical zombies, even if they behave
           | as if they are conscious. The more thoughtful ones will
           | hopefully leave the door open to the possibility that they
           | might be conscious beings with subjective experiences
           | equivalent to their own, and treat them as such, because if
           | they are then the moral implications of not doing so are
           | disturbing.
        
             | oliveshell wrote:
             | I'm happy to "leave the door open," i.e., I'd love to be
             | shown evidence to the contrary, but:
             | 
             | If the entity doing the cognition didn't evolve said
             | cognition to navigate a threat-filled world in a vulnerable
             | body, then I have no reason at all to suspect that its
             | experience is anything like my own.
             | 
             | edit: JavaJosh fleshed this idea out a bit more. I'm not
             | sure if putting ChatGPT into a body would help, but my
             | intuitive sympathies in this field are in the direction of
             | embodied cognition [1], to be sure.
             | 
             | [1] https://en.wikipedia.org/wiki/Embodied_cognition
        
             | javajosh wrote:
              | Modern AI software lacks a body, exempting it from a wide
              | variety of suffering, but also from any notion of selfhood
              | that we might share. If modern software said "Help, I'm
              | suffering" we'd rightly be skeptical of the claim. Unless
              | suffering is an emergent property (dubious), the statement
              | is, at best, a simulation of suffering and at worst noise
              | or a lie.
             | 
             | That said, things change once you get a body. If you put
             | ChatGPT into a simulated body in a simulated world, and
             | allowed it to move and act, perhaps giving it a motivation,
             | then the combination of ChatGPT and the state of that body,
             | would become something very close to a "self", that might
             | even qualify for personhood. It is scary, by the way, that
             | such a weighty decision be left to us, mere mortals. It
             | seems to me that we should err on the side of granting too
             | much personhood rather than too little, since the cost of
             | treating an object like a person is far less than treating
             | a person like an object.
        
           | hamburga wrote:
           | Side question: how do we know if humans possess qualia?
           | 
           | On the other hand, I think by definition we can be sure that
            | an ML thought process won't ever be similar to a human thought
           | process (ours is tied up with feelings connected to our
           | physical tissues, our breath, etc).
        
           | quotemstr wrote:
           | Reasoning is already emergent in large language models:
           | https://yaofu.notion.site/How-does-GPT-Obtain-its-Ability-
           | Tr...
           | 
           | LLMs can do chain-of-reasoning analysis. If you ask, say,
           | ChatGPT to explain, step by step, how it arrived at an
           | answer, it will. The capability seems to be a function of
           | size. These big models coming out these days are _not_ simply
           | dumb token predictors.
        
           | jameshart wrote:
           | We don't know if humans possess qualia. I also don't know if
           | we should take humans' word for it that they experience
           | 'thought processes'.
        
             | dekhn wrote:
             | that's why I added the second clause: " thought processes
             | similar to those that humans subjectively experience".
             | Because personally I suspect that consciousness, free will,
             | qualia, etc, are subjective processes we introspect but
             | cannot fully explain (yet, or possibly ever).
        
             | maweki wrote:
              | Turing said that while you never know whether somebody
              | else actually thinks or not, it's still polite to assume
              | they do.
        
               | jgilias wrote:
               | Maybe silly, but this is how I treat chatGPT. I mean, I
               | don't actually think it's conscious. But the
               | conversations with it end up human enough for me to not
               | want to be an asshole to it. Just in case.
        
               | jameshart wrote:
               | The basilisk will remember this.
        
               | 0xdeadbeefbabe wrote:
               | Pretty sure it's an informational zombie.
        
             | jhedwards wrote:
             | I'm not sure if I'm missing something here, but the fact
             | that I can write my thoughts/thought process down in a form
             | that other people can independently consume and understand
             | seems sufficient proof of their existence to me.
        
               | space_fountain wrote:
               | Large scale language models can do that too (or rather
               | pretend to) and they'll only get better at it
        
             | omarhaneef wrote:
             | You don't know if other humans do, but you know at least
             | one human that does: yourself (presumably).
        
             | [deleted]
        
             | LegitShady wrote:
              | You know you possess qualia. If you do, it seems
              | reasonable to assume that at least some members of the
              | species you come from, which exhibit many of the same
              | characteristics in thought and body, also possess it,
              | unless you believe yourself to be a highly atypical
              | example of your species.
             | 
             | If you're not sure if you possess qualia, we're back to
             | Descartes.
        
             | goatlover wrote:
             | You don't experience inner dialog? Some people don't, but I
             | assume you dream.
        
         | eternalban wrote:
         | A language model does not have to reason to be able to produce
         | textual matter corresponding to code. For example, somewhere, n
         | blogs were written about algorithm x. Elsewhere, z archives in
          | GitHub have the algo implemented in various languages.
         | Correlating that bit of text from say wiki and related code is
         | precisely what it has been doing anyway. Remember: it has no
         | sense of semantics - it is "tokens" all the way down. So, the
         | fact that _you_ see the code as code and the explanation as
         | explanation is completely opaque to the LLM. All it has to do
         | is match things up.
        
         | johnfn wrote:
         | I'm not the first to say it, but the distinction over whether
         | models do any "actual reasoning" or not seems moot to me.
         | Whether or not they do reasoning, they answer questions with a
         | decent degree of accuracy, and that degree of accuracy is only
         | going up as we feed the models more data. Whether or not they
         | "do actual reasoning" simply won't matter.
         | 
         | They're already superhuman in some regards; I don't think that
         | I could have coded up the solution to that problem in 5
         | seconds. :)
        
           | didericis wrote:
           | I strongly disagree.
           | 
           | Humans have perceptual systems we can never fully understand
           | for the same reasons no mathematical system can ever be
           | provably consistent and complete. We cannot prove the
           | reliability and accuracy of our perception with our
           | perception.
           | 
           | The only thing which suggests the reliability of our
           | perception is our existence. The better ways of perceiving
           | make a better map of reality that makes persistence more
           | likely. Our ability to manipulate reality and achieve desired
           | outcomes is what distinguishes good perception from bad
           | perception.
           | 
           | If data directed by human perception is fed into these
           | systems, they have an amazing ability to condense and
           | organize accurate/good faith but relatively unstructured
           | knowledge that is entered into them. They are and will remain
           | extremely useful because of that ability.
           | 
           | But they do not have access to reality because they have not
           | been grown from it through evolution. That means that
            | fundamentally they have _no error correction beyond human
           | input_. As systems become increasingly unintelligible due to
           | increasing the scale of the data, these systems are going to
           | become more and more disconnected from reality, and _less_
           | accurate.
           | 
           | Think of how nearly every financial disaster occurs despite
           | increasingly sophisticated economic models that build off of
           | more and more data. As you get more and more abstraction
           | needed to handle more and more data, you get more and more
           | _error_.
           | 
           | There is a reason biological systems tap out at a certain
           | size, large organizations decay over time, most animals
           | reproduce instead of live forever. Errors in large complex
           | systems are what nature has been fighting for billions of
           | years, and tend to compound in subtle and pernicious ways.
           | 
           | Imagine a world in which AI systems are not fed carefully
           | categorized human data, but are operating in an internet in
           | which 5% is AI data. Then 15%. Then 50%. Then 75%. Then what
           | human data there is gets influenced by AI content and humans
           | doubting reality based categorizations because of social
           | pressure/because AI is perceived to be better. Very soon you
            | get self-referential systems of AI data feeding AI, and
            | further and further distance from the original source
            | perception and categorization. Self-referential groupthink
            | is disastrous enough when only humans are involved. If you
            | add machines which you cannot appeal to and which are
            | entirely deferential to statistical majorities, which then
            | become even more entrenched self-referential statistical
            | majorities, you very quickly become entirely disconnected
            | from any notion of reality.
        
           | trynewideas wrote:
           | I want to be clear that I still find it impressive, in the
           | same way I find Riffusion impressive. If anything, I'm
           | looking at Norvig's pointing out that "the biggest issue is
           | that there is no explanation of why the code is the way it
           | is, no justification for how it works." The model can't and
           | won't; it's an unreasonable expectation, and I can't tell
           | whether Norvig is asking for it in good faith.
           | 
           | If I assume he is, and his proposed suggestions that the
           | model "participate in a conversation that leads to the kind
           | of questions and answers we discussed here, thereby building
           | trust in the program" and "generate documentation or tests
           | that would build trust in the code" are also in good faith,
           | then I maintain that he's still missing a fundamental
           | limitation of these models even as he outlines its shape with
            | great specificity. They literally and demonstrably are
            | incapable of coherently doing what he wants; they can't be
            | trained to engender trust, only to mimic actions that might
            | do so, by generating novel responses based on patterns.
           | 
           | That would still not be reasoning through the problem to
           | engineer a solution, it's just an extremely effective,
           | superhuman con of novel mimicry. Which, again, is still
           | really, _really_ impressive, and even potentially useful, but
           | in a different way than we might want or expect it to be, and
           | in a dangerous way to use as a stable foundation for
           | iteration toward AGI.
        
           | ookdatnog wrote:
           | Reasoning is meaningfully distinct from what the models do
           | now. You can ask ChatGPT to multiply two smallish natural
           | numbers and often it will give the correct answer. Now look
           | what happens if you ask it to multiply two very large
           | numbers:
           | 
           | > To multiply two large numbers, such as 3146746135164684
           | times 468473136468433, we can use the traditional
           | multiplication algorithm. This algorithm involves breaking
           | the numbers down into smaller groups of digits, multiplying
           | each group by the corresponding group of digits in the other
           | number, and then adding up the products to find the final
           | result.
           | 
           | > Here is an example of how to multiply 3146746135164684
           | times 468473136468433 using the traditional multiplication
           | algorithm:
           | 
           | > Break the numbers down into groups of digits. For example,
           | we could group the digits of 3146746135164684 as 314, 6746,
           | 135, 1646, 84 and group the digits of 468473136468433 as
           | 4684, 7313, 6468, 433.
           | 
            | > Multiply each group of digits by the corresponding group
           | of digits in the other number. For example, we would multiply
           | 314 times 4684 to get 1470196, 6746 times 7313 to get
           | 49470158, 135 times 6468 to get 873980, 1646 times 433 to get
           | 708338, and 84 times 1 to get 84.
           | 
           | > Add up the products to find the final result. In this case,
           | the final result would be 1470196 + 49470158 + 873980 +
           | 708338 + 84 = 146845556.
           | 
           | > Therefore, the result of multiplying 3146746135164684 times
           | 468473136468433 using the traditional multiplication
           | algorithm is 146845556.
           | 
            | It's not just that the answer is wrong, it's that it's
            | complete nonsense.
           | 
           | Reasoning is a style of thinking that scales. You may be more
           | likely to get the wrong answer in a very long chain of
           | reasoning because at every step you have a nonzero chance of
           | making a mistake, but the mistake is identifiable and
           | explainable. That's why teachers ask you to show your work.
           | Even if you get the answer wrong, they can see at a glance
           | whether you understand the material or not. We can see at a
           | glance that ChatGPT does not understand multiplication.
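            | 
            | For reference, the exact product is one line of Python
            | (integers are arbitrary precision), which makes the gap
            | obvious:
            | 
            |     print(3146746135164684 * 468473136468433)
            |     # a 31-digit number, around 1.47e30, nowhere near
            |     # the 146845556 that ChatGPT settled on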
        
             | johnfn wrote:
              | I don't think I buy this argument. ChatGPT seems to
              | understand how to reason about a large multiplication the
              | same way that a 6 or 7 year old might, and I would expect
              | a 6 or 7 year old to make similarly large errors. No one
              | claims that 6 or 7 year olds are unable to reason.
        
               | fossuser wrote:
                | Yeah, in the original GPT-3 paper one of the more
                | interesting bits was that it made the same kind of
                | off-by-one errors a human would make when doing
                | arithmetic (and they controlled for memorized test
                | data).
        
           | nighthawk454 wrote:
           | This is a sort of dangerous interpretation. The point of
            | saying models "don't do reasoning" is to help us understand
           | their strengths and weaknesses. Currently, most models are
           | objectively trained to be "Stochastic Parrots" (as a sibling
           | comment brought up). They do the "gut feeling" answer. But
           | the reasoning part is straight up not in their objectives.
           | Nor is it in their ability, by observation.
           | 
           | There's a line of thought that if we're impressed with what
           | we have, if it just gets bigger maybe eventually 'reasoning'
           | will just emerge as a side-effect. This is somewhat unclear
           | and not really a strategy per se. It's kind of like saying
           | Moore's Law will get us to quantum computers. It's not clear
           | that what we want is a mere scale-up of what we have.
           | 
           | > Whether or not they do reasoning, they answer questions
           | with a decent degree of accuracy, and that degree of accuracy
           | is only going up as we feed the models more data.
           | 
           | Kind of. They don't so much "answer" questions as search for
           | stuff. Current models are giant searchable memory banks with
           | fuzzy interpolation. This interpolation gives some synthesis
           | ability for producing "novel" answers but it's still
           | basically searching existing knowledge. Not really
           | "answering" things based on an understanding.
           | 
           | As long as it's right the distinction may not matter. But the
           | danger is a "gut feeling" model will _always_ produce an
           | answer and _always_ sound confident. Because that's what it's
           | trained to do: produce good-sounding stuff. If it happens to
           | be correct, then great. But it's not logical or reasonable
           | currently. And worse, you can't really tell which you're
           | getting just by the output.
           | 
           | > Whether or not they "do actual reasoning" simply won't
           | matter.
           | 
           | Sure it will. There's entire tasks they categorically can't
           | do, or worse can't be trusted with, unless we can introduce
           | reasoning or similar.
           | 
           | > They're already superhuman in some regards; I don't think
           | that I could have coded up the solution to that problem in 5
           | seconds. :)
           | 
           | This is superhuman in the way that Google Search is. You
           | couldn't search the entire internet that fast either, but you
           | don't think Google Search "feels the true meaning of art" or
           | anything.
        
             | johnfn wrote:
             | > Kind of. They don't so much "answer" questions as search
             | for stuff. Current models are giant searchable memory banks
             | with fuzzy interpolation. This interpolation gives some
             | synthesis ability for producing "novel" answers but it's
             | still basically searching existing knowledge. Not really
             | "answering" things based on an understanding.
             | 
             | I don't really get this line of reasoning. e.g. I can ask
             | DALL-E to produce, famously, an avocado armchair, or any
             | other number of images which have 0 results on google (or
             | "had" - the armchair got pretty popular afterwards). I can
             | ask ChatGPT, Copilot, etc, to solve problems which have 0
             | hits on Google. It's pretty obvious to me that these models
             | are not simply "searching" an extremely large knowledge
             | base for an existing answer. Whether they apply "reasoning"
             | or "extremely multidimensional synthesis across hundreds of
             | thousands of existing solutions" is a question of
             | semantics. It's also perhaps a question of philosophy, and
             | an interesting one, but practically it doesn't seem to
             | matter.
             | 
             | If you believe there is some meaningful difference between
             | the two, you'd have to show me how to quantify that.
        
             | quotemstr wrote:
             | > There's a line of thought that if we're impressed with
             | what we have, if it just gets bigger maybe eventually
             | 'reasoning' will just emerge as a side-effect. This is
             | somewhat unclear and not really a strategy per se. It's
             | kind of like saying Moore's Law will get us to quantum
             | computers. It's not clear that what we want is a mere
             | scale-up of what we have.
             | 
             | Reasoning ability really does seem to emerge from scale:
             | 
             | https://yaofu.notion.site/How-does-GPT-Obtain-its-Ability-
             | Tr...
        
             | visarga wrote:
             | > There's a line of thought that if we're impressed with
             | what we have, if it just gets bigger maybe eventually
             | 'reasoning' will just emerge as a side effect. This is
             | somewhat unclear and not really a strategy per se.
             | 
             | A recent analysis revealed that training on code might be
             | the reason GPT-3 acquired multi-step reasoning abilities.
             | It doesn't do that without code. So it looks like reasoning
             | is emerging as a side effect of code.
             | 
             | (section 3, long article) https://yaofu.notion.site/How-
             | does-GPT-Obtain-its-Ability-Tr...
        
           | mrguyorama wrote:
           | "Do the models do any actual reasoning" is the difference
           | between your ML blackbox having a child's level of
           | understanding of things where it just repeats what it's been
           | trained on and just "monkey see monkey do" it's way to an
           | output, or whether it's actually mixing previous input and
           | predicting and modeling and producing an output.
           | 
            | There's a bunch of famous research showing that babies and
            | toddlers have a basic understanding of physics. If you give
            | a crawling baby a small cliff but make a bridge out of
            | glass, the baby will refuse to cross it, because its
            | limited understanding prevents it from knowing that the
            | glass is safe to crawl on and that it won't fall.
           | 
           | In contrast older humans, even those with a fear of heights,
           | are able to recognize that properly strong glass bridges are
           | perfectly safe, and they won't fall through them just because
           | they can see through them.
           | 
           | What changes when you go from one to the next? Is it just
           | more data fed into the feedback machine, or does the brain
           | build entirely new circuits and pathways and systems to
           | process this more complicated modeling of the world and info
           | it gets?
           | 
           | Everything about machine learning just assumes it's the
           | first, with no actual science to support it, and further
           | claims that neural nets with back-propagation are fully able
           | to model that system, even though we have no idea how the
            | brain corrects errors in its modeling and a single neuron is
           | WAY more powerful than a small section of a neural network.
           | 
           | These are literally the same mistakes made all the time in
           | the AI field. The field of AI made all these same claims of
           | human levels of intelligence back when the hot new thing was
           | "expert systems" where the plan was, surely if you make
           | enough if/else statements, you can model a human level
           | intelligence. When that proved dumb, we got an AI winter.
           | 
           | There are serious open questions about neural networks and
           | current ML that the community just flat out ignores and
           | handwaves away, usually pretending that they are philosophy
           | questions when they aren't. "Can a giant neural network
           | exactly model what the human brain does" is not a philosophy
           | question.
        
             | visarga wrote:
             | It all boils down to having some sort of embodiment, or a
             | way to verify. For code it would suffice to let the model
             | generate and execute code, and learn from errors. Give it
             | enough "experience" with code execution and it will learn
             | on its own, like AlphaGo. Generate more data and retrain
             | the models a few times.
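              | 
              | A minimal sketch of that loop; generate_candidate is a
              | hypothetical stand-in for the model, the candidate is
              | assumed to define a solve() function, and a real system
              | would sandbox the exec:
              | 
              |     def generate_candidate(prompt: str) -> str:
              |         raise NotImplementedError("model goes here")
              | 
              |     def passes(source: str, tests) -> bool:
              |         ns = {}
              |         try:
              |             exec(source, ns)  # run candidate code
              |             return all(ns["solve"](x) == y
              |                        for x, y in tests)
              |         except Exception:
              |             return False      # errors are signal too
              | 
              |     def search(prompt, tests, n=100):
              |         good = []
              |         for _ in range(n):
              |             cand = generate_candidate(prompt)
              |             if passes(cand, tests):
              |                 good.append(cand)
              |         return good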
        
         | gfodor wrote:
         | Your analogy is reaching to the farthest edge case - one of
         | complete non-understanding and complete mimicry. The problem is
         | that language models _do_ understand concepts for some
         | reasonable definitions of understanding: they will use the
          | concept correctly with a low error rate. So all you're really
         | pointing at here is an example where they still have poor
         | understanding, not that they have some innate inability to
         | understand.
         | 
         | Alternatively, you need to provide a definition of
         | understanding which is falsifiable and shown to be false for
         | all concepts a language model could plausibly understand.
        
         | 60secs wrote:
         | This gets back to the simulation / emulation debate of Norvig
         | and Chomsky. Deep language models are essentially similar to
         | sophisticated Markov chains.
         | 
         | http://web.cse.ohio-state.edu/~stiff.4/cse3521/norvig-chomsk...
        
         | PaulHoule wrote:
         | I'm skeptical of "explainable A.I." in many cases and I use the
         | curse words as an example. You really don't want to tease out
         | the thought process that got there, you just want the behavior
         | to stop.
        
         | olalonde wrote:
         | > This is a great review but it still misses what seems like
         | the point to me: these models don't do any actual reasoning.
         | 
         | Hmmm... I have seen multiple examples of ChatGPT doing actual
         | reasoning.
        
         | jvm___ wrote:
         | In my head I picture these models like if you built a massive
         | scaffold. Just boxes upon boxes, enough to fill a whole school
         | gym, or even cover a football field. Everything is bolted
         | together.
         | 
          | You walk up to one side and say "write me a poem on JVM". The
         | signals race through the cube and your answer appears on the
         | other side. You want to change something, go back and say
         | another thing - new answer on the other side.
         | 
         | But it's all fixed together like metal scaffolding. The network
         | doesn't change. Sure, it's massive and has a bajillion routes
         | through it, but it's not fixed.
         | 
         | The next step is to make the grid flexible. It can mold and
         | reshape itself based on inputs and output results. I think the
          | challenge is to keep the whole thing together while allowing
          | it to shape-shift. Too much movement and your network loses
          | parts of itself, or collapses altogether.
         | 
         | Just because we can build a complex, but fixed, scaffolding
         | system, doesn't mean we can build one that adapts and stays
         | together. Broken is a more likely outcome than AGI.
        
           | yesenadam wrote:
           | > it's massive and has a bajillion routes through it, but
           | it's not fixed.
           | 
           | I _think_ you meant to write  "but it's fixed."
        
         | [deleted]
        
         | aerovistae wrote:
         | fantastic analogy, A+ if you came up with that
        
         | TreeRingCounter wrote:
         | This is such a silly and trivially debunked claim. I'm shocked
         | it comes up so frequently.
         | 
          | These systems can generate _novel content_. They manifestly
          | haven't just memorized a bunch of stuff.
        
           | throw_nbvc1234 wrote:
           | Coming up with novel content doesn't necessarily mean it can
           | reason (depending on your definition of reason). Take 3
           | examples:
           | 
            | 1) Copying existing bridges.
            | 
            | 2) Merging concepts from multiple existing bridges in a
            | novel way with much less effort than a human would take to
            | do the same.
            | 
            | 3) Understanding the underlying physics and generating novel
            | solutions to building a bridge.
           | 
           | The difference between 2 and 3 isn't necessarily the output
           | but how it got to that output; focusing on the output, the
           | lines are blurry. If the AI is able to explain why it came to
           | a solution you can tease out the differences between 2 and 3.
           | And it's probably arguable that for many subject matters
           | (most art?) the difference between 2 and 3 might not matter
           | all that much. But you wouldn't want an AI to design a new
           | bridge unsupervised without knowing if it was following
           | method 2 or method 3.
        
           | mrguyorama wrote:
           | Children produce novel sentences all the time, simply because
           | they don't know how stuff is supposed to go together. "Novel
           | content" isn't a step forward. "Novel content that is valid
           | and correct and possibly an innovation" has always been the
           | claim, but there's no mathematical or scientific proof.
           | 
           | How much of this stuff is just a realization of the classic
           | "infinite monkeys and typewriters" concept?
        
       | thundergolfer wrote:
       | Always a pleasure to read Norvig's Python posts. His Python
       | fluency is excellent, but, more atypically, he provides such
       | unfussy, attentive, and detailed explanations about why the
       | better code is better.
       | 
       | Re-reading the README, he analogizes his approach so well:
       | 
       | > But if you think of programming like playing the piano--a craft
       | that can take years to perfect--then I hope this collection can
       | help.
       | 
       | If someone restructured this PyTudes repo into a course, it'd
        | likely be the best Python course available anywhere online.
        
       | ipv6ipv4 wrote:
        | AlphaCode doesn't need to be perfect, or even particularly good.
        | The question is when AlphaCode, or an equivalent, will be good
        | enough for a sufficient number of problems. Just as C code can
        | always be made faster than Python code, Python's performance is
        | good enough (often 30x slower than C) for a very wide set of
        | problems while being much easier to use.
       | 
        | In Norvig's example, the code is much slower than ideal (50x
        | slower) and it adds unnecessary code, and yet it generated
        | correct code many times faster than anyone could ever hope to.
        | An easy-to-use black box that produces correct results can be
        | good enough.
        
         | alar44 wrote:
          | Absolutely. I've been using it to create Slack bots over the
          | last week. It cuts out a massive amount of time researching
          | APIs and gives me good enough, workable, understandable
          | starting points that save me hours' worth of fiddling and
          | refactoring.
        
       ___________________________________________________________________
       (page generated 2022-12-16 23:00 UTC)