[HN Gopher] Transformers seem to mimic parts of the brain
       ___________________________________________________________________
        
       Transformers seem to mimic parts of the brain
        
       Author : theafh
       Score  : 104 points
       Date   : 2022-09-12 14:26 UTC (8 hours ago)
        
 (HTM) web link (www.quantamagazine.org)
 (TXT) w3m dump (www.quantamagazine.org)
        
       | ericbarrett wrote:
       | The first thing I thought of after reading this was the "mind
       | palace" technique used since antiquity for memorizing things:
       | https://en.m.wikipedia.org/wiki/Method_of_loci
        
       | data_maan wrote:
       | This whole area of research is more hype than substance.
       | 
       | First note that what they tout as being "the brain" is actually
        | just a very, very simplified model of the brain. If you really
        | want to model the brain, there's a hierarchy of ever more
       | complicated spiking neural nets. Even simulating a single synapse
       | in full detail can be challenging.
       | 
        | Having said that, the fact that some models used in practice have
        | been found to be equivalent to a neuroscientific model is not
        | really that impressive, since it explains neither the inner
        | workings of the brain nor modern ML models. Unfortunately, Quanta
        | Magazine's editors are riding the hype wave too hard to notice
        | that.
       | 
        | Note also that Whittington's other work on predictive coding
        | networks is not really solid. It was a pretty irritating
        | experience to read some of his work. That makes me skeptical of
        | how rigorous his claims are in this case.
        
         | westoncb wrote:
         | > Even simulating a single synapse in full detail can be
         | challenging
         | 
          | I wonder how common it is for serious commentators on the NN /
          | brain relationship to hold this supposition that, in order for
          | the two to be called... let's say "functionally
          | similar/equivalent", there would have to be some kind of
          | _structural equivalence_ in their most basic parts.
         | 
         | Neurons (and their synaptic connections etc.) developed in a
         | biochemical substrate which is going to bring a certain amount
         | of its own representational baggage with it, i.e. elements
         | which are loosely incidental to "what really matters" in
         | creating the magic of the brain--and we should not expect those
         | features to reappear in artificial NNs (as they are by
         | definition incidental): bringing them up could only establish a
         | trivial non-equivalence imo.
         | 
          | I'd like to see more discussions of the NN / brain relationship
          | specify which level/kind of equivalence they're refuting or
          | confirming.
        
       | origin_path wrote:
       | I do wonder about the ethical issues that people are going to
       | start facing in, I don't know, maybe 10-20 years? Maybe even
       | sooner.
       | 
        | These new DNNs create _very_ human-like outputs, and their
        | structure is rather similar to the brain's. Now we are learning
        | that maybe they're even more brainlike than we thought. At what
        | point do we cross a threshold here and encounter a non-trivial
        | number of people who argue that a sufficiently large model
        | actually _is_ a brain, and therefore deserves rights? Blake
        | Lemoine went there already, but I wonder if that's going to be an
        | isolated incident or a harbinger of things to come.
       | 
       | It feels weird. On one hand, I know that these things are just
       | programs. On the other hand, I also know that our brains are just
       | molecules and cells. At the same time I feel that intelligent
       | creatures deserve rights and the exact way the underlying brain
       | works doesn't seem especially important. Especially if people
       | start getting brain augmentations at scale a la Neuralink. At
       | what point do you say the line is crossed?
        
         | heyitsguay wrote:
         | There are no actual ethical issues with contemporary ML
         | architectures (including transformers) being too conscious or
          | brainlike; it's just laypeople and demagogue chatter. It could
         | be an issue in the future but only with computational systems
         | that are utterly unlike the ones being created and used today.
         | 
         | Actual practitioners are tired of the debate because getting to
         | an informed answer requires grokking some undergraduate-level
         | math and statistics, and nobody seems particularly willing to
          | do that.
        
           | data_maan wrote:
           | > nobody seems particularly willing to do that
           | 
           | This. The amount of mathematical illiteracy is staggering in
           | ML.
        
             | heyitsguay wrote:
             | The professionals I've met actually working in ML R&D have
             | basically all been very technically competent, including in
             | mathematics. It's more the people who talk a lot about AI
             | in grandiose and anthropomorphized terms that I was
             | referring to.
        
               | data_maan wrote:
                | I work at one of the best unis in the world, in a big ML
                | research group, and that has not been my experience.
                | Unfortunately.
                | 
                | I even know researchers with 10k+ citations in ML who
                | talk about "continuous maths" and "discrete maths". This
                | pretty much sums up their level of mathematical
                | sophistication and ability.
        
               | heyitsguay wrote:
               | What do you mean? That's an incredibly important
               | distinction in understanding mathematics for ML in the
               | neural net age. Perhaps a bit of a sensitive spot for me
               | personally, coming from a harmonic analysis group for my
               | math PhD, but the short version basically goes like: Up
               | until the 2010s or so, a huge portion of applied math was
               | driven by results from "continuous math": functions
               | mostly take values in some unbounded continuous vector
                | space; they're infinite-dimensional objects supported on
               | some continuous subset of R^n or C^n or whatever, and we
               | reason about signal processing by proving results about
               | existence, uniqueness, and optimality of solutions to
               | certain optimization problems. The infinite-dimensional
               | function spaces provide intellectual substance and
               | challenge to the approach, while also being limited in
               | applicability to circumstances amenable to the many
               | assumptions one must typically make about a signal or
               | sensing mechanism in order for the math model to apply.
               | 
               | This is all well and good, but it's a heavy price to pay
                | for what is, essentially, an abstraction. There are no
                | infinities; computed (not computable) functions are
                | really just finite-dimensional vectors taking values in
                | a bounded range; and any relationships between domain
                | elements are discrete (see the toy sketch at the end of
                | this comment).
               | 
               | In this circumstance, most of the traditional heavy-duty
               | mathematical machinery for signal processing is
               | irrelevant -- equation solutions trivially exist (or
               | don't) and the actual meat of the problem is efficiently
               | computing solutions (or approximate solutions). It's
               | still quite technical and relies on advanced math, but a
               | different sort from what is classically the "aesthetic"
               | higher math approach. Famously, it also means far fewer
                | proofs! At least as applied to real-world applications. The
               | parameter spaces are so large, the optimization
               | landscapes so complicated, that traditional methods don't
               | offer much insight, though people continue to work on it.
               | So now we're just concerned with entirely different
               | problems requiring different tools.
               | 
               | Without any further context, that's what I would assume
               | your colleague was referring to, as this is a well-
               | understood mathematical dichotomy.
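                | 
                | As a toy illustration of the "computed functions are
                | just finite-dimensional vectors" point above (my own
                | example, nothing ML-specific): the "function" a
                | computer actually sees is a finite, bounded array,
                | and transforming it is a finite linear map, so
                | existence is trivial and efficient computation is
                | what's left to study.
                | 
                |     import numpy as np
                | 
                |     # 1000 float32 samples on a grid: the whole
                |     # "function", finite-dimensional and bounded.
                |     x = np.linspace(0, 2 * np.pi, 1000,
                |                     dtype=np.float32)
                |     f = np.sin(3 * x)
                | 
                |     # Its Fourier transform is a finite linear
                |     # map, computed efficiently by the FFT.
                |     F = np.fft.rfft(f)
                |     print(F.shape)  # (501,)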
        
               | Jensson wrote:
               | > The professionals I've met actually working in ML R&D
               | have basically all been very technically competent,
               | including in mathematics.
               | 
               | Competent at math can mean many different things. Have
               | they taken higher level courses in statistics,
                | probability, optimization, and control theory? If not, I'd
               | say that they aren't technically competent at math that
               | is relevant to their field, and in my experience most
               | don't know those things.
        
         | petra wrote:
          | At its heart, ethics isn't a cognitive process; it's based
         | on an emotional one.
         | 
          | We care about domesticated animals because of oxytocin (OT), the
         | love/connection hormone:
         | 
         | " Recent reports have indicated the possible contribution of OT
         | to the formation of a social bond between domesticated mammals
         | (dog, sheep, cattle) and humans."[1]
         | 
          | And sure, at some point we'll probably create an artificial
          | creature that triggers oxytocin release in humans. And it's an
          | interesting question how to design such a machine.
          | 
          | But most AIs? Most likely, we won't feel any strong
          | connection with them.
          | 
          | [1] https://link.springer.com/article/10.1134/S2079059717030042
        
           | hooverd wrote:
           | They're called vtubers.
        
         | quonn wrote:
         | Feeling creatures deserve rights, since they can suffer.
         | Intelligence is at best indirectly related, if at all.
        
         | ok_dad wrote:
         | I personally think we'll have the opposite problem. I think
         | making machines too much like humans will result in less
         | ethical consideration of actual humans, not extra rights for
         | machines.
         | 
         | Once a model becomes sufficiently like a human brain to perform
         | many of the jobs we have in our society (driving vehicles,
         | collecting trash, monitoring cameras and data streams, etc.)
         | then those who have the ultimate power in our world will start
         | to see humans as unimportant meat sacks that are inefficient
         | compared to the machines. They'll stop pretending they care
         | about the welfare of humanity even a little bit, and will start
         | to push for policies that would reduce our numbers.
         | 
         | Eventually, I believe the end goal will be for a small number
         | of humans to do those jobs that simply cannot be automated, and
         | to serve an even smaller number of masters who control the
         | machines, which will then be purposed as the police for the
         | other humans.
        
           | kevinventullo wrote:
           | You write in the future tense, but I think that this is
            | already happening. Fewer millennials are having children than
            | previous generations did, and those who do have fewer children.
            | 
            | I think the main driver is that most millennials simply can't
           | afford children, and this is a direct result of policies
           | pushed by "those who have the ultimate power".
        
             | Noumenon72 wrote:
             | Or maybe the millennials themselves see humans as
              | inefficient meat sacks who might as well just pass the time
             | with video games rather than striving.
             | 
             | Or maybe people actually think every child is precious, and
             | those who have the ultimate power benevolently used their
             | power to require every child get the resources it deserves,
             | even though we can't afford it.
        
           | doliveira wrote:
           | Tech bros are already shamelessly talking about "artificial
           | wombs", so... The Overton window is already open to that
            | point.
        
           | behnamoh wrote:
           | That's like an interesting movie/show plot that I'd love to
           | watch!
        
         | aaaaaaaaaaab wrote:
         | >At what point do we cross a threshold here and encounter a
         | non-trivial number of people who argue that a sufficiently
         | large model actually is a brain, and therefore deserves rights?
         | 
         | There are two ways of resolving this conundrum. You either give
         | rights to computer programs, or take away the rights of people.
         | Which option sounds more likely to you?
        
       | vanderZwan wrote:
       | Sincere question inspired purely by the headline: how many
       | important ML architectures aren't in some way based on some
       | proposed model of how something works in the brain?
       | 
       | (Not intended as a flippant remark, I know Quanta Magazine
        | articles can generally safely be assumed to be quality content,
       | and that this is about how a language model unexpectedly seems to
       | have relevance for understanding spatial awareness)
        
         | edgyquant wrote:
         | It is my opinion that pretty much all architectures already
         | exist in the brain for some use or another. Otherwise we
          | wouldn't be able to reason about them.
        
           | dunefox wrote:
           | > Otherwise we wouldn't be able to reason about them
           | 
           | That's a bold claim.
        
           | coldtea wrote:
           | We can reason about all kinds of things that don't exist in
           | the brain...
        
           | tomrod wrote:
            | I disagree. No one built cars in 5000 BC, but the confluence
            | of ideas then has led to cars now.
        
           | hackernewds wrote:
            | I agree, but not for the "otherwise" reasoning.
            | 
            | I think it speaks to the complexity of the brain, almost like
            | how every combination of digits exists in pi.
        
         | mtlmtlmtlmtl wrote:
         | AlphaGo is a pretty good example here. It uses a neural net for
          | evaluation, and that's vaguely inspired by the brain, sure.
          | But it employs a Monte Carlo-based game tree search, which is
         | probably very different from how humans think.
         | 
          | In addition, it learns by iterated amplification and
          | distillation: it plays games against itself, where one player
          | gets more time and hence plays stronger (amplification). The
          | weaker player then uses this strength differential as a
          | fitness function to learn (distillation). Rinse and repeat.
         | That's really nothing like how humans learn these games. While
         | playing stronger players and evaluating is a huge part of
         | becoming stronger, there's also a lot of targeted exercises,
         | opening/endgame theory, etc. Humans can't really do that type
         | of training at all.
        
           | kadoban wrote:
           | > But it employs a Monte Carlo based game tree search which
           | is probably very different from how humans think.
           | 
           | It's not _that_ different from how humans play. We have
           | pattern matching that points out likely places, we read out
           | what happens and try to evaluate the result. Humans are just
           | less methodical at it really.
        
             | mtlmtlmtlmtl wrote:
             | I mean I guess you could argue some calculation kind of
             | looks like a type of random walk (with intuited moves)
             | based search. But that's kind of all AlphaGo does, and it
             | does it so efficiently that's all it really needs to do.
             | 
              | I'm not a go player, but at least in chess, which is game-
              | theoretically very similar modulo branching factor, human
              | thinking is much more of a mishmash of different search
              | methods, different ways of picking moves, and strategic
              | ideas (which I like to think of as employing something
              | more akin to A* or Dijkstra).
             | 
              | I.e. there's a rough algorithm like this happening:
              | 
              | 1. Assess the opponent's last move, using some sort of
              | abductive reasoning to figure out what the intent was and
              | whether there's a concrete threat. If so, try to refute the
              | threat (this can sometimes be a method-of-elimination
              | search (best node search is a similar algorithm) if the
              | candidate moves are few enough, or a more general one if
              | not), find counterplay, find the lesser evil, or resign.
             | 
             | 2. If not, do you want to stop their plan or is it just a
             | bad plan?
             | 
             | 3. If you do, how?
             | 
              | 4. If not, do you have any tactical ideas? Search all the
              | forcing moves in some intuitive order of plausibility and
              | play the strongest one you find.
             | 
             | 5. If not, what is _your_ plan? If you had a plan before,
             | does it still make sense?
             | 
             | 6. If not, find a new plan
             | 
             | 7. Once you have a plan, how do you accomplish it? Break it
             | into subgoals like "I want to get a knight to e5"
             | 
              | 8. Find the shortest route for a knight to get to e5
              | (pathfinding while ignoring the opponent).
              | 
              | 9. Is there a tactical issue with that route?
              | 
              | 10. Rinse and repeat until you find the shortest route that
              | works tactically.
             | 
             | I could probably elaborate this list for hours, getting
             | longer and longer. But you probably get the idea at this
             | point.
        
               | kadoban wrote:
               | You are definitely right that computer players are
                | missing some kind of narrative-based reasoning for their
                | own moves and for their opponents' moves. In go it doesn't
               | feel that extreme though. We're taught not to hold too
               | hard to our plans anyway, and most good moves from the
               | opponent will have more than one intention. So you can't
               | get that far relying just on reading what their goal is.
               | 
               | How computers think isn't exactly how we do, for go, but
               | it's close enough to rhyme pretty heavily imo.
        
             | svnt wrote:
             | Parent said "different from how humans think", not play,
             | which seems key. Your description is very broad.
             | 
              | These machines don't seem to carry narratives or plans yet
              | (whether they would benefit from them or be encumbered by
              | them seems to be an open question).
             | 
             | Watching the machines play they have zero inertia. If the
             | next opportunity means a completely inverted play strategy
             | has a marginally better chance of winning, they will switch
             | their entire approach.
             | 
              | Humans don't typically do this, although, having learned
              | from machines that it can produce better outcomes, perhaps
              | we will start moving away from this local maximum.
        
               | kadoban wrote:
               | > Watching the machines play they have zero inertia. If
               | the next opportunity means a completely inverted play
               | strategy has a marginally better chance of winning, they
               | will switch their entire approach.
               | 
               | In Go, especially at the high level, this isn't that far
               | outside of the norm. In particular, you see players play
               | in other areas (tenuki) at what to a weaker player would
               | look like pretty random times, depending on what's most
               | urgent or biggest.
               | 
               | Computer go players aren't too chaotic. They're just
               | _very_ good at some things that are already high-level-
                | player traits. A computer will just give you what you
                | want, and then suddenly what you wanted turns out not to
                | be that good. It feels like Bruce Lee's flow/adaptation-
                | based fighting style applied to a go board.
        
         | abeppu wrote:
         | I think the response to this has two prongs:
         | 
         | - Some families of ML techniques (SVMs, random forests,
          | Gaussian processes) got their inspiration elsewhere and never
         | claimed to be really related to how brains do stuff.
         | 
         | - Among NNs, even if an idea takes loose inspiration from
         | neuroscience (e.g. the visual system does have a bunch of
         | layers, and the first ones really are pulling out 'simple'
         | features like an edge near an area), I think it's relatively
         | uncommon to go back and compare specifically what's happening
         | in the brain with a given ML architecture. And a lot of the
         | inspiration isn't about human-specific cognitive abilities
         | (like language), but is really a generic description of neurons
         | which is equally true of much less intelligent animals.
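          | 
          | As a toy illustration of the "pulling out simple features
          | like an edge" point above (my own example, not anyone's
          | actual model): a first-layer feature detector can be as
          | simple as an oriented edge filter.
          | 
          |     import numpy as np
          | 
          |     patch = np.zeros((5, 5))
          |     patch[:, 3:] = 1.0  # a vertical edge in the patch
          | 
          |     # Classic horizontal-gradient (Sobel) filter.
          |     sobel_x = np.array([[-1, 0, 1],
          |                         [-2, 0, 2],
          |                         [-1, 0, 1]], dtype=float)
          | 
          |     # Response at the patch centre: large magnitude
          |     # means "edge here".
          |     response = np.sum(sobel_x * patch[1:4, 1:4])
          |     print(response)  # positive for this dark-to-light edge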
        
           | soraki_soladead wrote:
           | > I think it's relatively uncommon to go back and compare
           | specifically what's happening in the brain with a given ML
           | architecture.
           | 
            | Less common but not unheard of. Here's one example, primarily
            | focused on vision: http://www.brain-score.org/
           | 
           | DeepMind has also published works comparing RL architectures
           | like IQN to dopaminergic neurons.
           | 
            | The challenge is that it's very cross-disciplinary, and
            | most DL labs don't have a reason to explore the neuroscience
            | side, while most neuro labs don't have the expertise in DL.
        
         | macrolocal wrote:
          | Transformer networks have deeper connections to dense
          | associative memory (modern Hopfield networks). For example, the
          | update rule that minimizes the energy functional of these
          | networks converges in a single iteration and coincides with the
          | attention mechanism [1].
         | 
         | [1] https://arxiv.org/abs/1702.01929
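          | 
          | A minimal numpy sketch of that correspondence (my own toy
          | setup and variable names, not taken from the paper): a single
          | update step of the modern/continuous Hopfield energy has
          | exactly the form of attention, with the query being the
          | current state and the keys and values both being the stored
          | patterns.
          | 
          |     import numpy as np
          | 
          |     def softmax(z):
          |         e = np.exp(z - z.max())
          |         return e / e.sum()
          | 
          |     rng = np.random.default_rng(0)
          |     X = rng.standard_normal((16, 64))  # 16 stored patterns
          |     xi = X[3] + 0.1 * rng.standard_normal(64)  # noisy probe
          | 
          |     beta = 8.0  # inverse temperature
          |     # One retrieval step: xi_new = X^T softmax(beta * X xi),
          |     # i.e. attention with query=xi, keys=X, values=X.
          |     xi_new = X.T @ softmax(beta * (X @ xi))
          | 
          |     # For well-separated patterns, one step lands on X[3].
          |     print(np.allclose(xi_new, X[3], atol=1e-3))  # True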
        
         | fzliu wrote:
         | Saying that neural networks are "similar to" or that they
         | "mimic" the human brain can be misleading. Today's
         | architectures are the byproduct of years of research and
         | countless GPU-hours dedicated to training and testing
         | architecture variants. Many neuroscience-based architectures
         | that mimic the brain better than transformers end up performing
         | much worse.
         | 
         | The Quanta article is overall pretty reasonable, but I've
         | unfortunately seen other news outlets regurgitate this kind of
         | blanket statement for the better part of a decade. The very
         | first models were perhaps 100% inspired by the brain, but
          | today's ML research more or less follows a "whatever works best"
         | principle.
        
         | rdedev wrote:
          | I think at some point similarities will naturally emerge: smart
          | moves in design space. That being said, these similar designs
          | will probably be minuscule compared to the overall architecture.
        
       | hnplj wrote:
        
       | adamnemecek wrote:
        | The underlying idea is that of fixed points (aka spectra,
        | diagonalizations, embeddings, invariants). By fixed point I mean
        | something like Lawvere's fixed point theorem.
        | 
        | https://ncatlab.org/nlab/show/Lawvere%27s+fixed+point+theore...
        | 
        | Karl Pribram's holonomic brain theory (building on Dennis
        | Gabor's holography) also points at something like this:
        | https://en.wikipedia.org/wiki/Holonomic_brain_theory
       | 
       | I have a linkdump on this https://github.com/adamnemecek/adjoint
       | 
       | I also have a discord https://discord.gg/mr9TAhpyBW
        
         | data_maan wrote:
          | Why do you invoke a category-theoretic theorem when the theory
          | discussed in Quanta has a manifestly non-category-theoretic
          | flavor?...
        
       | intrasight wrote:
        | The next headline, of course, will be that our brains are now
        | mimicking parts of Transformers.
       | 
       | "This is your brain on Stable Diffusion"
        
       ___________________________________________________________________
       (page generated 2022-09-12 23:00 UTC)