[HN Gopher] Gato - A Generalist Agent
       ___________________________________________________________________
        
       Gato - A Generalist Agent
        
       Author : deltree7
       Score  : 61 points
       Date   : 2022-05-17 19:45 UTC (3 hours ago)
        
 (HTM) web link (arxiv.org)
 (TXT) w3m dump (arxiv.org)
        
       | izzygonzalez wrote:
        | I made some concept maps of the first parts of the paper. They
        | might help clarify some of it.
       | 
       | https://twitter.com/izzyz/status/1525099159925116928
        
       | efitz wrote:
       | Roko's Basilisk indicates that we all ought to support this
       | project as much as possible.
        
       | tootyskooty wrote:
       | One important thing to note here is that this model was trained
       | purely in a supervised fashion. It would be interesting to see a
       | paper at a similar scale that's based on reinforcement learning.
       | The reinforcement learning context (specifically the exploring
       | part) gives a lot more opportunities to see the effects of
       | positive/negative transfer. That approach would of course be much
       | more expensive, though.
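        | 
        | A minimal sketch of what "purely supervised" looks like here
        | (illustrative only, not the paper's code; the function and
        | argument names are made up):
        | 
        |     # Behaviour-cloning-style objective: next-token prediction
        |     # on logged trajectories. No environment interaction, no
        |     # reward, no exploration -- unlike an RL training loop.
        |     import torch
        |     import torch.nn.functional as F
        | 
        |     def supervised_step(model, optimizer, tokens):
        |         # tokens: (batch, seq_len) ids covering observations
        |         # and actions, already serialized into one stream.
        |         logits = model(tokens[:, :-1])      # predict token t+1
        |         loss = F.cross_entropy(
        |             logits.reshape(-1, logits.size(-1)),
        |             tokens[:, 1:].reshape(-1))
        |         optimizer.zero_grad()
        |         loss.backward()
        |         optimizer.step()
        |         return loss.item()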
        
       | ruuda wrote:
       | This paper caused quite a big shift in the Metaculus predictions
       | on when "AGI" will be achieved,
       | https://www.metaculus.com/questions/3479/date-weakly-general...
       | and https://www.metaculus.com/questions/5121/date-of-general-ai/.
        
       | hans1729 wrote:
        | This, again, sparks the "is this general AI?" question, which
        | often results in low-quality, borderline-flaming content... My
        | take:
       | 
        | The point of this paper isn't "here, we solved general
        | intelligence". It's "look, multi-modal token prediction is a
        | sound iteration". Look at the scale of the model in comparison
        | to, say, GPT-3: this is a PoC, they didn't bother scaling it,
        | because we've already seen where scaling these mechanisms leads.
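        | 
        | Roughly what "multi-modal token prediction" means in practice (a
        | made-up sketch; the vocabulary sizes, offsets and names below
        | are invented, this is not Gato's actual tokenizer):
        | 
        |     # Illustrative only: flatten different modalities into one
        |     # token stream so a single autoregressive model can be
        |     # trained on all of them at once.
        |     TEXT_VOCAB    = 32_000             # subword ids 0..31_999
        |     IMAGE_OFFSET  = TEXT_VOCAB         # discretized image patches
        |     ACTION_OFFSET = TEXT_VOCAB + 1024  # discretized actions
        | 
        |     def serialize(text_ids, image_patch_ids, action_bins):
        |         tokens = list(text_ids)
        |         tokens += [IMAGE_OFFSET + p for p in image_patch_ids]
        |         tokens += [ACTION_OFFSET + a for a in action_bins]
        |         # one flat stream; the model just predicts the next id
        |         return tokens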
       | 
        | What _I_ would love to know is what kind of architectures
        | DeepMind et al. are playing with in-house. Token prediction is a
        | promising avenue, but it's more of a language that an
        | intelligent agent may operate in, as opposed to the
        | self-sufficient structure of the intelligent agent itself -- the
        | _symbolic system_ that implements algorithms like Gato. If that
        | symbolic system is the result of a generator function, that
        | generator function won't be token prediction by trade. I mean,
        | maybe somewhere in the deep depths of a multi-modal model,
        | intelligent structure may emerge, but that would be a very weird
        | byproduct.
        
         | sva_ wrote:
         | > because we've already seen where scaling these mechanisms
         | leads.
         | 
         | In the case of GPT-3, scaling seemed to continuously improve
         | results, they just kinda ran out of data. Are you implying this
         | must be the same for this model? Or were you intending to say
         | something different that I didn't see?
        
         | Barrin92 wrote:
          | >but it's more of a language that an intelligent agent may
          | operate in, as opposed to the self-sufficient structure
         | 
          | Yes, this kind of functional intelligence seems distinct from
          | an actual living entity, which is the thing that uses
          | subordinate functions to pursue goals and has some interior
          | state, motivations and some sort of architecture. Reducing
          | intelligence to tokens predicting more tokens is kind of like
          | saying f(x), just solve for intelligence, when prediction
          | itself is only part of what intelligent systems are about.
          | 
          | Agent is a very important word because it's accurate ("_a
          | means or instrument by which a guiding intelligence achieves a
          | result_"), and it's the latter, I think, that we ought to be
          | after when talking about 'general AI'.
        
           | jawarner wrote:
           | It's possible that in serving the function of prediction, the
           | model forms a complex internal representation akin even to
           | goals, motivations, etc. It is true that DL architectures are
           | not explicitly designed to do this, not yet anyway. But my
           | point is that the task of prediction can give rise to such
           | architectural patterns. According to Karl Friston's Free
           | Energy Principle, biological brains serve the purpose of
           | predicting the value of different actions available.
        
       | version_five wrote:
       | Discussed a lot five days ago:
       | https://news.ycombinator.com/item?id=31355657
        
       | zackees wrote:
       | This is essentially the birth of AI.
       | 
       | The lack of fanfare on this achievement is baffling.
        
         | natly wrote:
         | You're hanging out in the wrong (or right) circles if that's
         | your perception.
        
           | standardly wrote:
           | So.. Any circles?
        
         | mrtranscendence wrote:
         | I disagree. It's not even clear from the paper exactly how much
         | learning transfer is actually happening. I think it's fair not
         | to be rolling out the red carpet and showering the authors with
         | awards.
        
         | joshcryer wrote:
         | This result is unsurprising. "Give a model a bunch of unique
         | datasets and it can do a bunch of unique things." There's
         | nothing showing any sort of generalized learning or capability
         | here.
        
         | megaman821 wrote:
          | What is the achievement? It seems that the authors have shown
          | that this path is fruitful, but transfer learning is nowhere
          | near being solved.
        
         | jjoonathan wrote:
         | Lack of fanfare? Every techie news outlet is plastered with it,
         | and I'd expect it to diffuse from there.
        
       | deltree7 wrote:
       | https://www.deepmind.com/publications/a-generalist-agent
        
       | gallerdude wrote:
        | There's a breakthrough I've been waiting for that I haven't
        | heard anything about: when will an AI agent (probably a language
        | model) discover something scientific that humans had not
        | discovered at the time it was trained? What if a math proof, a
        | physics interaction, ... emerged from the model's approximation
        | of our world?
        | 
        | Right now, the state-of-the-art AlphaZero models can destroy
        | humans at Go. But what if machine learning models could teach us
        | things about how Go works that humans have not yet discovered?
        
         | SemanticStrengh wrote:
          | Narrow deep learning AI is generally not suited for this.
          | However, automated theorem provers are a thing and have proven
          | major conjectures/theorems that weren't solved by humans
          | before, e.g. the four color problem IIRC, although the best
          | results are generally obtained with semi-automated theorem
          | provers.
          | 
          | But still, this is not cleverness; it just shows that raw
          | brute force + a few tricks can solve a few problems, by
          | generating proofs that run to multiple terabytes (yes, this is
          | absurd scaling). The asymmetry between compute power and the
          | computer's lack of intelligence is remarkable.
         | 
         | https://en.m.wikipedia.org/wiki/Automated_theorem_proving
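          | 
          | A toy illustration of proof by computation rather than by
          | insight (Lean 4; just to show the flavour of tactic
          | automation, nothing to do with the four color proof itself):
          | 
          |     -- `decide` proves a decidable proposition by running its
          |     -- decision procedure and letting the kernel check the
          |     -- result: brute-force evaluation, no cleverness needed.
          |     example : 2 ^ 16 = 65536 := by decide
          |     example : [3, 1, 2].length = 3 := by decide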
        
         | hans1729 wrote:
          | It very likely already has, specifically in Go. The problem is
          | that humans would still be required to comprehend what they
          | are seeing :-) Letting agents develop strategies in an
          | unsupervised manner has already yielded strategies we hadn't
          | figured out ourselves. Other examples that come to mind are
          | video compression (see twominutepapers) and protein folding!
          | 
          | Think about it like this: if the domain of a problem we want
          | AI to solve is so complex that we can barely formulate the
          | question, how could we be confident that we can understand
          | 100% of the answer we get? "Here, GPU, make sense of this
          | 20-dimensional problem my brain can't even approximately
          | visualize!"
        
         | axg11 wrote:
         | You are describing most successful machine learning models.
          | Take AlphaFold: it has surely discovered relationships that
          | govern protein folding better than any human had previously
          | understood them.
        
       ___________________________________________________________________
       (page generated 2022-05-17 23:00 UTC)