[HN Gopher] Transformers seem to mimic parts of the brain ___________________________________________________________________ Transformers seem to mimic parts of the brain Author : theafh Score : 104 points Date : 2022-09-12 14:26 UTC (8 hours ago) (HTM) web link (www.quantamagazine.org) (TXT) w3m dump (www.quantamagazine.org)
| ericbarrett wrote: | The first thing I thought of after reading this was the "mind | palace" technique used since antiquity for memorizing things: | https://en.m.wikipedia.org/wiki/Method_of_loci
| data_maan wrote: | This whole area of research is more hype than substance. | | First note that what they tout as being "the brain" is actually | just a very, very simplified model of the brain. If you really | want to model the brain, there's a hierarchy of ever more | complicated spiking neural nets. Even simulating a single synapse | in full detail can be challenging. | | Having said that, the fact that some models used in practice have | been found to be equivalent to a neuroscientific model is not really | that impressive, since it explains neither the inner workings | of the brain nor modern ML models. Unfortunately, Quanta | magazine editors are riding too much on the hype wave to notice | that. | | Note also that Whittington's other work, on predictive coding | networks, is not really solid. It was a pretty irritating | experience to read some of his work. That makes me skeptical of | how rigorous his claims are in this case.
| westoncb wrote: | > Even simulating a single synapse in full detail can be | challenging | | I wonder how common it is for serious commentators on the NN / | brain relationship to hold this supposition that in order for | the two to be called... let's say "functionally | similar/equivalent", there would have to be some kind of | _structural equivalence_ in their most basic parts. | | Neurons (and their synaptic connections etc.) developed in a | biochemical substrate which is going to bring a certain amount | of its own representational baggage with it, i.e. elements | which are loosely incidental to "what really matters" in | creating the magic of the brain--and we should not expect those | features to reappear in artificial NNs (as they are by | definition incidental): bringing them up could only establish a | trivial non-equivalence imo. | | I'd like to see more discussions about the NN / brain relationship | mentioning which level/kind of equivalence they're | refuting or confirming.
| origin_path wrote: | I do wonder about the ethical issues that people are going to | start facing in, I don't know, maybe 10-20 years? Maybe even | sooner. | | These new DNNs create _very_ human-like outputs. And their | structure is rather similar to the brain. Now we are learning | that maybe they're even more brainlike than we thought. At what | point do we cross a threshold here and encounter a non-trivial | number of people who argue that a sufficiently large model | actually _is_ a brain, and therefore deserves rights? Blake | Lemoine went there already, but I wonder if that's going to be an | isolated incident or a harbinger of things to come. | | It feels weird. On one hand, I know that these things are just | programs. On the other hand, I also know that our brains are just | molecules and cells. At the same time I feel that intelligent | creatures deserve rights, and the exact way the underlying brain | works doesn't seem especially important. Especially if people | start getting brain augmentations at scale a la Neuralink.
At | what point do you say the line is crossed?
| heyitsguay wrote: | There are no actual ethical issues with contemporary ML | architectures (including transformers) being too conscious or | brainlike; it's just laypeople and demagogue chatter. It could | be an issue in the future, but only with computational systems | that are utterly unlike the ones being created and used today. | | Actual practitioners are tired of the debate because getting to | an informed answer requires grokking some undergraduate-level | math and statistics, and nobody seems particularly willing to | do that.
| data_maan wrote: | > nobody seems particularly willing to do that | | This. The amount of mathematical illiteracy is staggering in | ML.
| heyitsguay wrote: | The professionals I've met actually working in ML R&D have | basically all been very technically competent, including in | mathematics. It's more the people who talk a lot about AI | in grandiose and anthropomorphized terms that I was | referring to.
| data_maan wrote: | I work at one of the best unis in the world in a big ML | research group and I have not. Unfortunately. | | I even know researchers with 10k+ citations in ML who | talk about "continuous maths" and "discrete maths". | This pretty much sums up their level of mathematical | sophistication and ability.
| heyitsguay wrote: | What do you mean? That's an incredibly important | distinction in understanding mathematics for ML in the | neural net age. Perhaps a bit of a sensitive spot for me | personally, coming from a harmonic analysis group for my | math PhD, but the short version basically goes like this: up | until the 2010s or so, a huge portion of applied math was | driven by results from "continuous math": functions | mostly take values in some unbounded continuous vector | space; they're infinite-dimensional objects supported on | some continuous subset of R^n or C^n or whatever, and we | reason about signal processing by proving results about | existence, uniqueness, and optimality of solutions to | certain optimization problems. The infinite-dimensional | function spaces provide intellectual substance and | challenge to the approach, while also being limited in | applicability to circumstances amenable to the many | assumptions one must typically make about a signal or | sensing mechanism in order for the math model to apply. | | This is all well and good, but it's a heavy price to pay | for what is, essentially, an abstraction. There are no | infinities: computed (not computable) functions are | really just finite-dimensional vectors taking values in a | bounded range, and any relationships between domain elements | are discrete. | | In this circumstance, most of the traditional heavy-duty | mathematical machinery for signal processing is | irrelevant -- equation solutions trivially exist (or | don't) and the actual meat of the problem is efficiently | computing solutions (or approximate solutions). It's | still quite technical and relies on advanced math, but a | different sort from what is classically the "aesthetic" | higher math approach. Famously, it also means far fewer | proofs! At least as applied to real-world applications. The | parameter spaces are so large, the optimization | landscapes so complicated, that traditional methods don't | offer much insight, though people continue to work on it. | So now we're just concerned with entirely different | problems requiring different tools.
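| | To make the dichotomy concrete, here's a toy sketch (my own
| illustration, not from any particular paper): the "function" is a
| finite vector of samples, and "solving" a smoothing problem means
| running an iterative minimizer until it looks converged, not
| proving a minimizer exists in some function space.
|
|     import numpy as np
|
|     # A "function" on [0,1] as it exists in a computer: 200 samples.
|     x = np.linspace(0.0, 1.0, 200)
|     noisy = np.sin(2 * np.pi * x) + 0.3 * np.random.randn(x.size)
|
|     # Continuous story: minimize ||u - f||^2 + lam * ||u'||^2 over a
|     # function space, prove a unique minimizer exists. Discrete
|     # story: u is a vector, u' a finite difference, and we iterate.
|     lam, step = 0.05, 0.1
|     u = noisy.copy()
|     for _ in range(500):
|         du = np.diff(u)              # discrete "derivative"
|         grad = 2.0 * (u - noisy)     # gradient of the data term
|         grad[:-1] -= 2.0 * lam * du  # gradient of the smoothness term
|         grad[1:] += 2.0 * lam * du
|         u -= step * grad             # plain gradient descent
|
|     print(round(float(np.std(noisy - u)), 3))  # noise smoothed away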
| | Without any further context, that's what I would assume | your colleague was referring to, as this is a well-understood | mathematical dichotomy.
| Jensson wrote: | > The professionals I've met actually working in ML R&D | have basically all been very technically competent, | including in mathematics. | | Competent at math can mean many different things. Have | they taken higher-level courses in statistics, | probability, optimization, and control theory? If not, I'd | say that they aren't technically competent at math that | is relevant to their field, and in my experience most | don't know those things.
| petra wrote: | At its heart, ethics isn't a cognitive process; it's based | on an emotional one. | | We care about domesticated animals because of Oxytocin (OT), the | love/connection hormone: | | "Recent reports have indicated the possible contribution of OT | to the formation of a social bond between domesticated mammals | (dog, sheep, cattle) and humans."[1] | | And sure, we'll probably create at some point an artificial | creature that will release oxytocin in humans. And it's an | interesting question how to design such a machine. | | But most AIs? Most likely, we won't feel any strong | connection with them. | | [1] https://link.springer.com/article/10.1134/S2079059717030042
| hooverd wrote: | They're called vtubers.
| quonn wrote: | Feeling creatures deserve rights, since they can suffer. | Intelligence is at best indirectly related, if at all.
| ok_dad wrote: | I personally think we'll have the opposite problem. I think | making machines too much like humans will result in less | ethical consideration of actual humans, not extra rights for | machines. | | Once a model becomes sufficiently like a human brain to perform | many of the jobs we have in our society (driving vehicles, | collecting trash, monitoring cameras and data streams, etc.), | then those who have the ultimate power in our world will start | to see humans as unimportant meat sacks that are inefficient | compared to the machines. They'll stop pretending they care | about the welfare of humanity even a little bit, and will start | to push for policies that would reduce our numbers. | | Eventually, I believe the end goal will be for a small number | of humans to do those jobs that simply cannot be automated, and | to serve an even smaller number of masters who control the | machines, which will then be purposed as the police for the | other humans.
| kevinventullo wrote: | You write in the future tense, but I think that this is | already happening. Fewer millennials are having children than | previous generations, and those that are have fewer children. | | I think the main driver is that most millennials simply can't | afford children, and this is a direct result of policies | pushed by "those who have the ultimate power".
| Noumenon72 wrote: | Or maybe the millennials themselves see humans as | inefficient meat sacks who might as well just pass the time | with video games rather than striving. | | Or maybe people actually think every child is precious, and | those who have the ultimate power benevolently used their | power to require every child get the resources it deserves, | even though we can't afford it.
| doliveira wrote: | Tech bros are already shamelessly talking about "artificial | wombs", so... the Overton window is already open to that | point.
| behnamoh wrote: | That sounds like an interesting movie/show plot that I'd love to | watch!
| aaaaaaaaaaab wrote: | >At what point do we cross a threshold here and encounter a | non-trivial number of people who argue that a sufficiently | large model actually is a brain, and therefore deserves rights? | | There are two ways of resolving this conundrum. You either give | rights to computer programs, or take away the rights of people. | Which option sounds more likely to you?
| vanderZwan wrote: | Sincere question inspired purely by the headline: how many | important ML architectures aren't in some way based on some | proposed model of how something works in the brain? | | (Not intended as a flippant remark, I know Quanta Magazine | articles can generally safely be assumed to be quality content, | and that this is about how a language model unexpectedly seems to | have relevance for understanding spatial awareness)
| edgyquant wrote: | It is my opinion that pretty much all architectures already | exist in the brain for some use or another. Otherwise we | wouldn't be able to reason about them.
| dunefox wrote: | > Otherwise we wouldn't be able to reason about them | | That's a bold claim.
| coldtea wrote: | We can reason about all kinds of things that don't exist in | the brain...
| tomrod wrote: | I disagree. No one built cars in 5000 BC, but the confluence | of ideas then has led to cars now.
| hackernewds wrote: | I agree, but not for the "otherwise" reasoning. | | I think it speaks to the complexity of the brain, almost like | every combination of numbers exists in pi.
| mtlmtlmtlmtl wrote: | AlphaGo is a pretty good example here. It uses a neural net for | evaluation, and that's vaguely inspired by the brain, sure. | But it employs a Monte Carlo based game tree search which is | probably very different from how humans think. | | In addition it learns by iterated amplification and | distillation: it plays games against itself, where one player | gets more time, hence will be a stronger player (amplification). | The weaker player then uses this strength differential as a | fitness function to learn (distillation). Rinse and repeat. | That's really nothing like how humans learn these games. While | playing stronger players and evaluating is a huge part of | becoming stronger, there's also a lot of targeted exercises, | opening/endgame theory, etc. Humans can't really do that type | of training at all.
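| | For the curious, here's roughly what a bare-bones "Monte Carlo
| based game tree search" looks like (a toy sketch of plain MCTS,
| my own code, not AlphaGo's; AlphaGo replaces the random rollout
| below with a learned value network and biases selection with a
| policy network):
|
|     import math, random
|
|     # Toy game so the sketch runs: one-pile Nim, take 1-3 stones,
|     # whoever takes the last stone wins.
|     def legal_moves(state):
|         pile, player = state
|         return [m for m in (1, 2, 3) if m <= pile]
|
|     def play(state, move):
|         pile, player = state
|         return (pile - move, -player)
|
|     def winner(state):
|         pile, player = state
|         return -player if pile == 0 else None
|
|     class Node:
|         def __init__(self, state, parent=None):
|             self.state, self.parent = state, parent
|             self.children = {}              # move -> Node
|             self.visits, self.wins = 0, 0.0
|
|     def ucb(parent, child, c=1.4):          # UCB1: exploit + explore
|         return child.wins / child.visits + c * math.sqrt(
|             math.log(parent.visits) / child.visits)
|
|     def mcts(root_state, iters=3000):
|         root = Node(root_state)
|         for _ in range(iters):
|             node = root
|             # 1. Selection: descend fully expanded nodes via UCB.
|             while node.children and len(node.children) == len(
|                     legal_moves(node.state)):
|                 node = max(node.children.values(),
|                            key=lambda ch: ucb(node, ch))
|             # 2. Expansion: add one untried child unless game over.
|             if winner(node.state) is None:
|                 move = random.choice([m for m in legal_moves(node.state)
|                                       if m not in node.children])
|                 node.children[move] = Node(play(node.state, move), node)
|                 node = node.children[move]
|             # 3. Simulation: random playout to the end of the game.
|             state = node.state
|             while winner(state) is None:
|                 state = play(state, random.choice(legal_moves(state)))
|             w = winner(state)
|             # 4. Backpropagation: credit the result up the path.
|             while node is not None:
|                 node.visits += 1
|                 if node.parent and w == node.parent.state[1]:
|                     node.wins += 1.0   # win for the player who moved here
|                 node = node.parent
|         return max(root.children.items(), key=lambda kv: kv[1].visits)[0]
|
|     print(mcts((10, 1)))  # should usually print 2: leave a multiple of 4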
| kadoban wrote: | > But it employs a Monte Carlo based game tree search which | is probably very different from how humans think. | | It's not _that_ different from how humans play. We have | pattern matching that points out likely places, we read out | what happens and try to evaluate the result. Humans are just | less methodical at it, really.
| mtlmtlmtlmtl wrote: | I mean, I guess you could argue some calculation kind of | looks like a type of random walk (with intuited moves) | based search. But that's kind of all AlphaGo does, and it | does it so efficiently that's all it really needs to do. | | I'm not a go player, but at least in chess, which is game- | theoretically very similar modulo branching factor, human | thinking is much more of a mishmash of different search | methods, different ways of picking moves, and strategic | ideas (which I like to think of as employing | something more akin to A* or Dijkstra). | | I.e. there's a rough algorithm like this happening: | | 1. Assess the opponent's last move, using some sort of | abductive reasoning to figure out what the intent was and | whether there's a concrete threat. If so, try to refute the | threat (this can sometimes be a method-of-elimination | search (best node search is a similar algorithm) if the | candidate moves are few enough, or a more general one if | not), find counterplay, find the lesser evil, or resign. | | 2. If not, do you want to stop their plan, or is it just a | bad plan? | | 3. If you do, how? | | 4. If not, do you have any tactical ideas? Search all the | forcing moves in some intuitive order of plausibility and | play the strongest one you find. | | 5. If not, what is _your_ plan? If you had a plan before, | does it still make sense? | | 6. If not, find a new plan. | | 7. Once you have a plan, how do you accomplish it? Break it | into subgoals like "I want to get a knight to e5". | | 8. Find the shortest route for a knight to get to | e5 (pathfinding while ignoring the opponent). | | 9. Is there a tactical issue with that route? | | 10. Rinse and repeat until you find the shortest route that | works tactically. | | I could probably elaborate this list for hours, getting | longer and longer. But you probably get the idea at this | point.
| kadoban wrote: | You are definitely right that computer players are | missing some kind of narrative-based reasoning for their | own moves and for their opponents' moves. In go it doesn't | feel that extreme, though. We're taught not to hold too | hard to our plans anyway, and most good moves from the | opponent will have more than one intention. So you can't | get that far relying just on reading what their goal is. | | How computers think isn't exactly how we do, for go, but | it's close enough to rhyme pretty heavily imo.
| svnt wrote: | Parent said "different from how humans think", not play, | which seems key. Your description is very broad. | | These machines don't seem to carry narratives or plans yet | (whether they would benefit from them or be encumbered by them | seems to be an open question). | | Watching the machines play, they have zero inertia. If the | next opportunity means a completely inverted play strategy | has a marginally better chance of winning, they will switch | their entire approach. | | Humans don't typically do this, although having learned | from machines that it can produce better outcomes, perhaps | we will start moving away from this local maximum.
| kadoban wrote: | > Watching the machines play, they have zero inertia. If | the next opportunity means a completely inverted play | strategy has a marginally better chance of winning, they | will switch their entire approach. | | In go, especially at the high level, this isn't that far | outside of the norm. In particular, you see players play | in other areas (tenuki) at what to a weaker player would | look like pretty random times, depending on what's most | urgent or biggest. | | Computer go players aren't too chaotic. They're just | _very_ good at some things that are already high-level- | player traits. A computer will just give you what you | want, but suddenly it's just not actually that good. It | feels like Bruce Lee's flow/adaptation-based fighting | style applied to a go board.
| abeppu wrote: | I think the response to this has two prongs: | | - Some families of ML techniques (SVMs, random forests, | Gaussian processes) got their inspiration elsewhere and never | claimed to be really related to how brains do stuff. | | - Among NNs, even if an idea takes loose inspiration from | neuroscience (e.g. the visual system does have a bunch of | layers, and the first ones really are pulling out 'simple' | features like an edge near an area), I think it's relatively | uncommon to go back and compare specifically what's happening | in the brain with a given ML architecture. And a lot of the | inspiration isn't about human-specific cognitive abilities | (like language), but is really a generic description of neurons | which is equally true of much less intelligent animals.
| soraki_soladead wrote: | > I think it's relatively uncommon to go back and compare | specifically what's happening in the brain with a given ML | architecture. | | Less common, but not unheard of. Here's one example, primarily | focused on vision: http://www.brain-score.org/ | | DeepMind has also published work comparing RL architectures | like IQN to dopaminergic neurons. | | The challenge is that it's very cross-disciplinary, and most DL | labs don't have a reason to explore the neuroscience side | while most neuro labs don't have the expertise in DL.
| macrolocal wrote: | Transformer networks have deep connections to dense | associative memory. For example, the update rule that minimizes | the energy functional of these Hopfield networks converges in a | single iteration and coincides with the attention mechanism | [1]. | | [1] https://arxiv.org/abs/1702.01929
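| | In code, that retrieval update is literally one attention call.
| A toy NumPy sketch of the idea (my own illustration, not code
| from [1]; beta and the sizes are made up):
|
|     import numpy as np
|
|     rng = np.random.default_rng(0)
|     X = rng.standard_normal((5, 16))           # rows: 5 stored patterns
|     xi = X[2] + 0.4 * rng.standard_normal(16)  # noisy probe of pattern 2
|     beta = 4.0                                 # inverse temperature
|
|     def softmax(z):
|         e = np.exp(z - z.max())
|         return e / e.sum()
|
|     # One step of xi_new = X^T softmax(beta * X xi): exactly an
|     # attention lookup with query xi and keys = values = patterns.
|     xi_new = X.T @ softmax(beta * (X @ xi))
|
|     print(np.linalg.norm(xi - X[2]))      # distance before the update
|     print(np.linalg.norm(xi_new - X[2]))  # typically near zero after one step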
| fzliu wrote: | Saying that neural networks are "similar to" or that they | "mimic" the human brain can be misleading. Today's | architectures are the byproduct of years of research and | countless GPU-hours dedicated to training and testing | architecture variants. Many neuroscience-based architectures | that mimic the brain better than transformers end up performing | much worse. | | The Quanta article is overall pretty reasonable, but I've | unfortunately seen other news outlets regurgitate this kind of | blanket statement for the better part of a decade. The very | first models were perhaps 100% inspired by the brain, but | today's ML research more or less follows a "whatever works best" | principle.
| rdedev wrote: | I think at some point similarities will naturally emerge. Smart | moves in design space. That being said, these similar designs | will probably be minuscule compared to the overall architecture.
| hnplj wrote: |
| adamnemecek wrote: | The underlying idea is the idea of fixed points (aka spectra, | diagonalizations, embeddings, invariants). By fixed point I mean | something like Lawvere's fixed point theorem. | | https://ncatlab.org/nlab/show/Lawvere%27s+fixed+point+theore... | | Karl Pribram's holonomic brain theory (built on Dennis Gabor's | holography) also points in a similar direction: | https://en.wikipedia.org/wiki/Holonomic_brain_theory | | I have a linkdump on this: https://github.com/adamnemecek/adjoint | | I also have a discord: https://discord.gg/mr9TAhpyBW
| data_maan wrote: | Why do you invoke a categorical theorem when the theory | discussed in Quanta has a manifestly non-category-theoretic | flavor?...
| intrasight wrote: | The next headline, of course, will be that our brains are now | mimicking parts of Transformers. | | "This is your brain on Stable Diffusion" ___________________________________________________________________ (page generated 2022-09-12 23:00 UTC)