[HN Gopher] Predictive coding has been unified with backpropagation
       ___________________________________________________________________
        
       Predictive coding has been unified with backpropagation
        
       Author : cabalamat
       Score  : 241 points
       Date   : 2021-04-05 12:02 UTC (10 hours ago)
        
 (HTM) web link (www.lesswrong.com)
 (TXT) w3m dump (www.lesswrong.com)
        
       | xzvf wrote:
       | At scale, Evolutionary Strategies (ES) are a very good
       | approximation of the gradient as well. I wouldn't jump to
       | conclusions and unifications just yet.
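       | 
       | (For concreteness, a minimal sketch of the kind of ES gradient
       | estimate meant here; the toy loss and all hyperparameters are
       | placeholders, not anything from the article:)
       | 
       |   import numpy as np
       | 
       |   def loss(theta):                        # toy objective
       |       return np.sum((theta - 3.0) ** 2)
       | 
       |   def es_gradient(theta, sigma=0.1, n=2000, seed=0):
       |       # ES smoothed-gradient estimate, averaged over n
       |       # Gaussian perturbations, with a mean baseline to
       |       # reduce variance:  E[loss(theta+sigma*eps)*eps]/sigma
       |       rng = np.random.default_rng(seed)
       |       eps = rng.standard_normal((n, theta.size))
       |       scores = np.array([loss(theta + sigma * e) for e in eps])
       |       scores -= scores.mean()
       |       return eps.T @ scores / (n * sigma)
       | 
       |   theta = np.zeros(4)
       |   print(es_gradient(theta))               # ES estimate
       |   print(2 * (theta - 3.0))                # analytic gradient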
        
         | jnwatson wrote:
         | The author's point is that predictive coding is a plausible
         | mechanism by which biological neurons work. ES are not.
         | 
         | ANNs have deviated widely from their biological inspiration,
         | most notably in the way that information flows, since
         | backpropagation requires two-way flow and biological axons are
         | one-directional.
         | 
         | If predictive coding and backpropagation are shown to have
         | similar power, then there's a rough idea that the way that ANNs
         | work isn't too far from how brains work (with lots and lots of
         | caveats).
        
           | whimsicalism wrote:
           | > If predictive coding and backpropagation are shown to have
           | similar power, then there's a rough idea that the way that
           | ANNs work isn't too far from how brains work (with lots and
           | lots of caveats).
           | 
           | So many caveats that I don't even really think that is a true
           | statement.
        
       | blueyes wrote:
       | I'm glad people are talking about this, and the similarity
       | between predictive coding and the action of biological neurons is
       | interesting. But we shouldn't fetishize predictive coding.
       | There's a wider discussion going on, and several theories as to
       | how backpropagation might work in the brain.
       | 
       | https://www.cell.com/trends/cognitive-sciences/fulltext/S136...
       | 
       | https://www.nature.com/articles/s41583-020-0277-3
        
         | andyxor wrote:
         | there is no evidence of back-propagation in the brain.
         | 
         | See Professor Edmund T. Rolls books on biologically plausible
         | neural networks:
         | 
         | "Brain Computations: What and How" (2020)
         | https://www.amazon.com/gp/product/0198871104
         | 
         | "Cerebral Cortex: Principles of Operation" (2018)
         | https://www.oxcns.org/b12text.html
         | 
         | "Neural Networks and Brain Function" (1997)
         | https://www.oxcns.org/b3_text.html
        
           | ShamelessC wrote:
           | "There is just one problem: [biological neural networks] are
           | physically incapable of running the backpropagation
           | algorithm."
           | 
           | From the linked article.
        
       | 0lmer wrote:
       | But is predictive coding perceived as a valid theory of how
       | cortical neurons function? There was a paper from 2017 drawing
       | similar conclusions about backprop approximation with Spike-
       | Timing-Dependent Plasticity: https://arxiv.org/abs/1711.04214
       | It looks more grounded in current models of neuronal
       | functioning. Nevertheless, it has changed nothing in the field
       | of deep learning since then.
        
         | jwmullally wrote:
         | Some general background on STDP for the thread:
         | 
         | Biological neurons don't just emit constant 0...1 float
         | values; they communicate using time-sensitive bursts of
         | voltage known as "spike trains". Spiking Neural Networks
         | (SNNs) are a closer approximation of natural networks than
         | typical ML ANNs. [0] gives a quick overview.
         | 
         | Spike-Timing-Dependent Plasticity is a local learning rule
         | experimentally observed in biological neurons. It's a form of
         | Hebbian learning, aka "Neurons that fire together wire
         | together."
         | 
         | Summary from [1]. The top graph gives a clear picture of how
         | the rule works.
         | 
         | > _With STDP, repeated presynaptic spike arrival a few
         | milliseconds before postsynaptic action potentials leads in
         | many synapse types to Long-Term Potentiation (LTP) of the
         | synapses, whereas repeated spike arrival after postsynaptic
         | spikes leads to Long-Term Depression (LTD) of the same
         | synapse._
         | 
         | ---
         | 
         | [0]: https://towardsdatascience.com/deep-learning-versus-
         | biologic...
         | 
         | [1]: http://www.scholarpedia.org/article/Spike-
         | timing_dependent_p...
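         | 
         | (A rough sketch of the standard pair-based STDP window along
         | the lines of [1]; the amplitudes and time constants here are
         | illustrative defaults, not values from any measurement:)
         | 
         |   import numpy as np
         | 
         |   def stdp_dw(dt_ms, a_plus=0.01, a_minus=0.012,
         |               tau_plus=20.0, tau_minus=20.0):
         |       # dt_ms = t_post - t_pre.
         |       # Pre fires before post (dt > 0) -> LTP (strengthen).
         |       # Pre fires after post  (dt < 0) -> LTD (weaken).
         |       if dt_ms > 0:
         |           return a_plus * np.exp(-dt_ms / tau_plus)
         |       return -a_minus * np.exp(dt_ms / tau_minus)
         | 
         |   for dt in (-40, -10, -1, 1, 10, 40):
         |       print(dt, round(stdp_dw(dt), 5))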
        
         | andyxor wrote:
         | As long as the model requires a delta rule, or 'teacher
         | signal'-based error correction, it is not biologically
         | plausible.
        
       | adamnemecek wrote:
       | I think that this sort of forward/backward thing is a very
       | general idea. There's a one-to-many relationship called the
       | adjoint, and a many-to-one relationship called the norm.
       | 
       | I wrote something about this here
       | https://github.com/adamnemecek/adjoint
        
         | tsmithe wrote:
         | In fact, the compositional structure underlying that of
         | predictive coding [0,1] is abstractly the same as that
         | underlying backprop [2]. (Disclaimer: [0,1] are my own papers;
         | I'm working on a more precise and extensive version of [1]
         | right now!)
         | 
         | [0] https://arxiv.org/abs/2006.01631
         | [1] https://arxiv.org/abs/2101.10483
         | [2] https://arxiv.org/abs/1711.10455
        
           | eli_gottlieb wrote:
           | Hurry and publish before I have manuscripts ready applying
           | these results.
        
             | tsmithe wrote:
             | Hey, Eli :-)
             | 
             | I'm working on it; I'll send you an e-mail. Things quickly
             | turned out to be more general than I realized last year.
        
         | selimthegrim wrote:
         | What were you going to say about Young tableaux?
        
           | adamnemecek wrote:
           | Dynamic programming and reinforcement learning are just
           | diagonalizations of the Young tableau. This is related to the
           | spectral theorem.
        
       | jdonaldson wrote:
       | Yeah, I don't like this title. Coding for backprop is worth
       | getting excited about, but please don't assume it supersedes all
       | forms of "predictive coding". Plenty of predictive learning
       | techniques do just fine without it, including our own brains.
       | 
       | In keeping with the No-Free-Lunch theorem, it's also highly
       | desirable in general to have a variety of approaches at hand for
       | solving certain predictive coding problems. Yes, this makes ML
       | (as a field) cumbersome, but it also prevents us from painting
       | ourselves into a corner.
        
         | nerdponx wrote:
         | Is this "coding for backprop", or "coding for the same results
         | as backprop"?
        
       | klmadfejno wrote:
       | > Predictive coding is the idea that BNNs generate a mental model
       | of their environment and then transmit only the information that
       | deviates from this model. Predictive coding considers error and
       | surprise to be the same thing. Hebbian theory is a specific
       | mathematical formulation of predictive coding.
       | 
       | This is an excellent, concise explanation. It sounds intuitive as
       | something that could work. Would love to try and dabble with
       | this. Any resources?
        
       | cs702 wrote:
       | EDIT: Before you read my comment below, please see
       | https://news.ycombinator.com/item?id=26702815 and
       | https://openreview.net/forum?id=PdauS7wZBfC for a different view.
       | 
       | --
       | 
       | If the results hold, they seem significant enough to me that I'd
       | go as far as saying the authors of the paper would end up getting
       | an important award at some point, not just for _unifying the
       | fields of biological and artificial intelligence_, but also for
       | making it trivial to train models in a fully distributed manner,
       | with _all learning done locally_ -- if the results hold.
       | 
       | Here's the paper: "Predictive Coding Approximates Backprop along
       | Arbitrary Computation Graphs"
       | 
       | https://arxiv.org/abs/2006.04182
       | 
       | I'm making my way through it right now.
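       | 
       | (For intuition about "all learning done locally", here is a very
       | rough sketch in the spirit of the paper's setup -- my own
       | simplification, with made-up layer sizes and learning rates, not
       | the authors' code. The hidden activities first relax to minimise
       | local prediction errors; each weight update then uses only the
       | activity and error on either side of that weight, with no
       | end-to-end backward pass:)
       | 
       |   import numpy as np
       | 
       |   rng = np.random.default_rng(0)
       |   sizes = [4, 8, 2]
       |   W = [rng.normal(0, 0.1, (o, i))
       |        for i, o in zip(sizes[:-1], sizes[1:])]
       |   f = np.tanh
       |   df = lambda v: 1.0 - np.tanh(v) ** 2
       | 
       |   def errors(x):
       |       # Prediction error of each layer given the layer below.
       |       return [x[l + 1] - W[l] @ f(x[l]) for l in range(len(W))]
       | 
       |   def pc_step(x_in, target, gamma=0.1, alpha=0.1, T=50):
       |       x = [x_in]                      # forward initialisation
       |       for Wl in W:
       |           x.append(Wl @ f(x[-1]))
       |       x[-1] = target                  # clamp output to label
       |       for _ in range(T):              # inference phase
       |           e = errors(x)
       |           for l in range(1, len(x) - 1):
       |               dx = -e[l - 1] + df(x[l]) * (W[l].T @ e[l])
       |               x[l] += gamma * dx
       |       e = errors(x)                   # purely local updates
       |       for l in range(len(W)):
       |           W[l] += alpha * np.outer(e[l], f(x[l]))
       | 
       |   x_in, target = rng.normal(size=4), np.array([1.0, -1.0])
       |   predict = lambda v: W[1] @ f(W[0] @ f(v))
       |   print("before:", predict(x_in))
       |   for _ in range(500):
       |       pc_step(x_in, target)
       |   print("after: ", predict(x_in), "target:", target)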
        
         | klmadfejno wrote:
         | I'm trying to imagine how that works. Imagine you've got a
         | neural net. One node identifies the number of feet. One node
         | identifies the number of wings. One node identifies color.
         | This feeds into a layer that tries to predict what animal it
         | is.
         | 
         | With backprop, you can sort of assume that given enough scale
         | your algo will identify these important features. With local
         | learning, wouldn't you get a tendency to identify the easily
         | identifiable features many times? Is there a need for a sort of
         | middleman like a one arm bandit kind of thing that makes a
         | decision to spawn and despawn child nodes to explore the space
         | more?
        
           | TheRealPomax wrote:
           | The fallacy there is the idea that "one node" does anything
           | useful on its own. Each node optimizes itself in such a way
           | that you have _no idea_ what it actually codes for; at the
           | emergent level, you see it contribute to coding for wing
           | detection, or color detection, or more likely seventeen
           | different things that are supposedly unrelated. It just
           | happens to generate values that somehow contribute to a
           | result for the features the various constellations detect.
           | 
           | (meaning it might also actually cause one or more
           | constellations to perform worse than if it wasn't
           | contributing, and realistically, you'll never know)
        
           | SamBam wrote:
           | > Is there a need for a sort of middleman like a one arm
           | bandit kind of thing that makes a decision to spawn and
           | despawn child nodes to explore the space more?
           | 
           | What's the one-armed bandit? (Besides a slot machine.)
           | 
           | My knowledge of this field is rusty, but I actually wrote my
           | MSc thesis on novel ways to get Genetic Algorithms to more
           | efficiently explore the space without getting stuck, so it
           | sounds up my alley.
        
             | fancy_pantser wrote:
             | I wonder if you thought of it as a type of optimal stopping
             | problem locally on each node and explore-exploit (multi-
             | armed bandit) globally? For example, if each node knows
             | when to halt when it hits a [probably local] minimum, the
             | results can be shared at that point and the best-performing
             | models can be cross-pollinated or whatever the mechanism is
             | at that point. Since copying the models and continuing
             | without gaining ground are both wastes of time, you want to
             | dial in that local halting point precisely. An overseeing
             | scheduler would record epoch-level results and make the
             | decisions, of course.
        
         | babel_ wrote:
         | Interesting follow-up reading:
         | 
         | "Relaxing the Constraints on Predictive Coding Models"
         | (https://arxiv.org/abs/2010.01047), from the same authors.
         | Looks at ways to remove neurological implausibility from PCM
         | and achieve comparable results. Sadly they only do MNIST in
         | this one, and are not as ambitious in testing on multiple
         | architectures and problems/datasets, but the results are still
         | very interesting and it covers some of the important
         | theoretical and biological concerns.
         | 
         | "Predictive Coding Can Do Exact Backpropagation on
         | Convolutional and Recurrent Neural Networks"
         | (https://arxiv.org/abs/2103.03725), from different authors.
         | Uses an alternative formulation that means it always converges
         | to the backprop result within a fixed number of iterations,
         | rather than approximately converges "in practice" within
         | 100-200 iterations. Not only is this a stronger guarantee, it
         | means they achieve inference speeds within spitting distance of
         | backprop, levelling the playing field.
         | 
         | It'd be interesting to see what a combination of these two
         | could do, and at this point I feel like a logical next step
         | would be to provide some setting in popular ML libraries such
         | that backprop can be switched for PCM. Being able to verify
         | this research just by adding a single extra line for the PCM
         | version, and perhaps replicating state-of-the-art
         | architectures, would be quite valuable.
        
         | abraxas wrote:
         | I'm going to personally flog any researcher who titles their
         | next paper "Predictive Coding Is All You Need". You've been
         | warned.
        
           | cs702 wrote:
           | There are already 60+ of those, and counting, all but one of
           | them since Vaswani et al's transformer paper:
           | 
           | https://arxiv.org/search/?query=is+all+you+need&searchtype=a.
           | ..
        
         | eutropia wrote:
         | Here's a more recent paper (March, 2021) which cites the above
         | paper: https://arxiv.org/abs/2103.04689 "Predictive Coding Can
         | Do Exact Backpropagation on Any Neural Network"
        
           | cs702 wrote:
           | Yup. I'd expect to see many more citations going forward. In
           | particular, I'd be excited to see how this ends up getting
           | used in practice, e.g., training and running very large
           | models on distributed, massively parallel "neuromorphic"
           | hardware.
        
         | JackFr wrote:
         | My background is as an interested amateur, but
         | 
         | > also for making it trivial to train models in a fully
         | distributed manner, with all learning done locally
         | 
         | seems like a really huge development.
         | 
         | At the same time I remain pretty skeptical of claims of
         | unifying the fields of biological and artificial intelligence.
         | I think the recent tremendous successes in AI & ML lead to an
         | unjustified overconfidence that we are close to understanding
         | the way biological systems must work.
        
           | himinlomax wrote:
           | Indeed, it's worth mentioning we still have absolutely no
           | idea how memory works.
        
             | andyxor wrote:
             | we know a lot about memory, but most AI researchers are
             | simply ignorant of neuroscience or cognitive psychology and
             | stick with their comfort zone.
             | 
             | Saying "we have no idea" is just being lazy.
        
         | andyxor wrote:
         | The thing is, about every week a paper is published with
         | groundbreaking claims, with this question in particular being
         | very popular, trying to unify neuroscience and deep learning in
         | some way in search of computational foundations of AI. Mostly
         | this is driven by the success of DL in certain industrial
         | applications.
         | 
         | Unfortunately most of these papers are heavy on theory but
         | light on empirical evidence. If we follow the path of natural
         | sciences, theory has to agree with evidence. Otherwise it's
         | just another theory unconstrained by reality, or worse, pseudo-
         | science.
        
           | autopoiesis wrote:
           | The paper (arxiv:2103.04689) linked by eutropia above has
           | some empirical evidence on the ML side, showing that
           | performance of predictive coding is not so far off backprop.
           | And there is no shortage of suggestions for how neural
           | circuits might work around the strict requirements of
           | backprop-like algorithms.
           | 
           | cs702's original comment above is excessively hyperbolic: the
           | compositional structure of Bayesian inversion is well known
           | and is known to coincide structurally with the
           | backward/forward structure of automatic differentiation. And
           | there have been many papers before this one showing how
           | predictive coding approximates backprop in other cases, so it
           | is no surprise that it can do so on graphs, too. I agree with
           | the ICLR reviewers that this paper is borderline and not in
           | itself a major contribution. But that does not mean that this
           | whole endeavour, of trying to find explicit mathematical
           | connections between biological and artificial learning, is
           | ill motivated.
        
             | eli_gottlieb wrote:
             | >the compositional structure of Bayesian inversion is well
             | known
             | 
             | /u/tsmithe's results on that are _well known_, now? I can
             | scarcely find anyone to collaborate with who understands
             | them!
        
         | YeGoblynQueenne wrote:
         | Note that the paper was rejected for publication in ICLR 2021:
         | 
         | https://openreview.net/forum?id=PdauS7wZBfC
        
       | hctaw wrote:
       | I don't know enough about biology or ML to know if what I'm
       | posting below is totally wrong, but here goes.
       | 
       | "Backprop" == "Feedback" of a non-linear dynamical system.
       | Feedback is a mathematical description of the behavior of
       | systems, not a literal one.
       | 
       | I don't know if BNNs are incapable of backprop any more than an
       | RLC filter is incapable of "feedback" when analyzing the ODE of
       | the latter tells you that there's a feedback path (which is what,
       | physically? The return path for charge?)
       | 
       | So what makes BNNs incapable of feedback? Are they mechanically
       | and electrically insulated from each other? How do they share
       | information, and what is the return path?
       | 
       | Other than that I wish more unification was done on ML algorithms
       | and dynamical systems, just in general. There's too much
       | crossover to ignore.
        
         | andyxor wrote:
         | The back-prop learning algorithm requires information non-local
         | to the synapse to be propagated from the output of the network
         | backwards to affect neurons deep in the network.
         | 
         | There is simply no evidence for this global feedback loop, or
         | global error correction, or delta rule training in
         | neurophysiological data collected in the last 80 years of
         | intensive research. [1]
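         | 
         | To make "non-local" concrete: in the standard derivation, the
         | error signal for layer l is
         | 
         |   \delta_l = f'(z_l) \odot (W_{l+1}^T \delta_{l+1}),
         | 
         | so updating a synapse in layer l needs the downstream weights
         | W_{l+1} and the downstream error \delta_{l+1}, neither of
         | which is physically available at that synapse.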
         | 
         | As for "why", biological learning it is primarily shaped by
         | evolution driven by energy expenditures constraints and
         | survival of the most efficient adaptation engines. One can
         | speculate that iterative optimization akin to the one run by
         | GPUs in ANNs is way too energy inefficient to be sustainable in
         | a living organism.
         | 
         | A good discussion of the biological constraints on learning
         | (from a CompSci perspective) can be found in Leslie Valiant's
         | book [2]. Prof. Valiant is the author of PAC [3], one of the
         | few theoretically sound models of modern ML, so he's worth
         | listening to.
         | 
         | [1] https://news.ycombinator.com/item?id=26700536
         | 
         | [2] https://www.amazon.com/Circuits-Mind-Leslie-G-
         | Valiant/dp/019...
         | 
         | [3]
         | https://en.wikipedia.org/wiki/Probably_approximately_correct...
        
           | hctaw wrote:
           | I think there's a significant distinction worth
           | illustrating: "there is no feedback path in the brain" is
           | not at all equivalent to "learning by feedback is not
           | possible in the brain."
           | 
           | It's well known in dynamics that feed-forward networks are no
           | longer feed-forward when outputs are coupled to inputs, an
           | example of which would be a hypothetically feed-forward
           | network of neurons in an animal and environmental
           | conditioning teaching it the consequences of actions.
           | 
           | I'm very curious about the biological constraints, but I'd
           | reiterate my point above that feedback is a mathematical or
           | logical abstraction for analyzing the behavior of the things
           | we call networks - which are also abstractions. There's a
           | distinction between the physical behavior of the things we
           | see and the mathematical models we construct to describe
           | them, like electromechanical systems where physically no such
           | coupling from output-to-input appears to exist, yet its
           | existence is crucially important analytically.
        
         | khawkins wrote:
         | > Other than that I wish more unification was done on ML
         | algorithms and dynamical systems, just in general. There's too
         | much crossover to ignore.
         | 
         | Check out this work, "Deep relaxation: partial differential
         | equations for optimizing deep neural networks" by Pratik
         | Chaudhari, Adam Oberman, Stanley Osher, Stefano Soatto &
         | Guillaume Carlier.
         | 
         | https://link.springer.com/article/10.1007/s40687-018-0148-y
        
         | nerdponx wrote:
         | The article says this:
         | 
         | > The backpropagation algorithm requires information to flow
         | forward and backward along the network. But biological neurons
         | are one-directional. An action potential goes from the cell
         | body down the axon to the axon terminals to another cell's
         | dendrites. An action potential never travels backward from a
         | cell's terminals to its body.
         | 
         | The point of the research here is that backpropagation turns
         | out not to be necessary to fit a neural network, and that it
         | can be approximated with predictive coding, which does not
         | require end-to-end backwards information flow.
        
           | candiodari wrote:
           | Yeah, but then you run into the problem of computation speed.
           | Any given neuron in the middle of your brain does 1
           | computation per second absolute maximum, and 1 per 10 seconds
           | is more realistic. Closer to the outside (the vast majority
           | of your brain), 1 per 100 seconds is a lot. And it slows down
           | as you age.
           | 
           | This means brains must have a _bloody_ good update rule. You
           | just can't update a neural network in 1 billion operations
           | per second, or 4e17 operations until you're 12 -- about 2
           | million training steps per neuron, or about half that
           | assuming you sleep. You cannot get to the level of a
           | 12-year-old in 4e17 operations, because GPT-3 does more and,
           | while it's impressive, it doesn't have anything on a
           | 12-year-old.
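           | 
           | (Back-of-envelope for the figures above; the rates are the
           | rough assumptions from this comment, not measurements:)
           | 
           |   seconds_12y = 12 * 365 * 24 * 3600   # ~3.8e8 s
           |   total_ops = 1e9 * seconds_12y        # ~3.8e17, the "4e17"
           |   per_neuron = seconds_12y / 100       # a few million steps
           |                                        # at 1 per 100 s
           |   print(f"{total_ops:.2e} {per_neuron:.2e}")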
        
           | salawat wrote:
           | So... I don't understand.
           | 
           | >An action potential goes from the cell body down the axon to
           | the axon terminals to another cell's dendrites.
           | 
           | How do you figure that doesn't allow backprop?
           | 
           | A neuronal bit is a loop of neurons. Information absolutely
           | can back-propagate. If it couldn't, how does anyone think
           | it'd be at all possible to learn how to get better at
           | anything?
           | 
           | Neuron fires dendrite to axon, secondary neuron fires
           | dendrite to axon, axon branches back to previous neuron's
           | dendrites, rinse, repeat, or add more intervening neurons...
           | Trying to rule out backprop based on the morphology of a
           | single neuron is... kinda missing the point.
           | 
           | It's all about the level of connection between neurons, and
           | how long (or whether) a signal returns unmodified to the
           | progenitor, that affects the stability of the encoded
           | information or behavior. At least to the best I've been able
           | to plausibly model it. I haven't exactly figured out how to
           | shove a bunch of measuring sticks in there to confirm or
           | deny, but I just can't see how a unidirectional action-
           | potential-forwarding element implies lack of backprop in a
           | graph of connections fully capable of developing cycles.
        
       | nmca wrote:
       | Interesting discussion on the ICLR openreview, resulting in a
       | reject:
       | 
       | https://openreview.net/forum?id=PdauS7wZBfC
        
         | justicezyx wrote:
         | Another well-received paper [1], but I want to point out that
         | ICLR should really have an industry track.
         | 
         | The type of research in [1] (an exhaustive analytic study of
         | various parameters in RL training) is clearly beyond a typical
         | academic environment, and probably also beyond normal industry
         | labs. Note the paper was from Google Brain.
         | 
         | The study consumes a lot of people's time, and computing time.
         | It's no doubt very useful and valuable. But I don't think it
         | should be judged by the same group of reviewers alongside the
         | other work from normal universities.
         | 
         | [1] https://openreview.net/forum?id=nIAxjsniDzg
        
         | justicezyx wrote:
         | Copied from this URL, the final review comment that 1)
         | summarizes the other reviews and 2) describes the rationale
         | for rejection:
         | 
         | ``` This paper extends recent work (Whittington & Bogacz, 2017,
         | Neural computation, 29(5), 1229-1262) by showing that
         | predictive coding (Rao & Ballard, 1999, Nature neuroscience
         | 2(1), 79-87) as an implementation of backpropagation can be
         | extended to arbitrary network structures. Specifically, the
         | original paper by Whittington & Bogacz (2017) demonstrated that
         | for MLPs, predictive coding converges to backpropagation using
         | local learning rules. These results were important/interesting
         | as predictive coding has been shown to match a number of
         | experimental results in neuroscience and locality is an
         | important feature of biologically plausible learning
         | algorithms.
         | 
         | The reviews were mixed. Three out of four reviews were above
         | threshold for acceptance, but two of those were just above.
         | Meanwhile, the fourth review gave a score of clear reject.
         | There was general agreement that the paper was interesting and
         | technically valid. But, the central criticisms of the paper
         | were:
         | 
         | 1. Lack of biological plausibility. The reviewers pointed to a
         | few biologically implausible components of this work. For
         | example, the algorithm uses local learning rules in the same
         | sense that backpropagation does, i.e., if we assume that there
         | exist feedback pathways with symmetric weights to feedforward
         | pathways, then the algorithm is local. Similarly, it is
         | assumed that there are paired error neurons, which is
         | biologically questionable.
         | 
         | 2. Speed of convergence. The reviewers noted that this model
         | requires many more iterations to converge on the correct
         | errors, and questioned the utility of a model that involves
         | this much additional computational overhead.
         | 
         | The authors included some new text regarding biological
         | plausibility and speed of convergence. They also included some
         | new results to address some of the other concerns. However,
         | there is still a core concern about the importance of this work
         | relative to the original Whittington & Bogacz (2017) paper. It
         | is nice to see those original results extended to arbitrary
         | graphs, but is that enough of a major contribution for
         | acceptance at ICLR? Given that there are still major issues
         | related to (1) in the model, it is not clear that this
         | extension to arbitrary graphs is a major contribution for
         | neuroscience. And, given the issues related to (2) above, it is
         | not clear that this contribution is important for ML.
         | Altogether, given these considerations, and the high bar for
         | acceptance at ICLR, a "reject" decision was recommended.
         | However, the AC notes that this was a borderline case. ```
         | 
         | The core reason is that the proposed model lacks biological
         | plausibility. Or, ignoring this weakness, the model is
         | computationally more intensive.
         | 
         | I HAVE NOT read the paper, but the review seems mostly based
         | on "feeling"; i.e., the reviewers feel that this work is not
         | above the bar. Note that I am not criticizing the reviewers
         | here: in my past reviewing career (maybe 100+ papers, which I
         | did until 6 years ago), most submissions were junk. The ones
         | that were truly good work, checking all the boxes -- new
         | result, hard problem, solid validation -- were easy to accept.
         | 
         | For a few other papers, which all seem to fall into the
         | "feeling" category, everything looks right, but they were
         | always borderline. And the review results can vary
         | substantially based on the reviewers' own backgrounds.
        
         | marmaduke wrote:
         | The review is great, it contains all the interesting points and
         | counterpoints, in a much more succinct format than the article
         | itself.
        
       | ilaksh wrote:
       | Does anyone know of a simple code example that demonstrates the
       | original predictive coding concept from 1999? Ideally applied to
       | some type of simple image/video problem.
       | 
       | I thought I saw a Matlab explanation of that '99 paper but have
       | not found it again.
        
       | phreeza wrote:
       | This was already shown for MLPs some years ago, and it is not
       | really that surprising that it applies to many other
       | architectures. Note that while learning can take place locally,
       | it does still require an upward and downward stream of
       | information flow, which is not supported by the neuroanatomy in
       | all cases. So while it is an interesting avenue of research, I
       | don't think it's anywhere near as revolutionary as this blog post
       | makes it out to be.
        
       | AbrahamParangi wrote:
       | This is an overly strong claim for the paper (which is good!)
       | backing it.
       | 
       | If anyone is interested in the reader's digest version of the
       | original paper, check out
       | https://www.youtube.com/watch?v=LB4B5FYvtdI
        
       | fouric wrote:
       | > Predictive coding is the idea that BNNs generate a mental model
       | of their environment and then transmit only the information that
       | deviates from this model. Predictive coding considers error and
       | surprise to be the same thing.
       | 
       | This reminds me of a Slate Star Codex article on Friston[1].
       | 
       | [1] https://slatestarcodex.com/2018/03/04/god-help-us-lets-
       | try-t...
        
       ___________________________________________________________________
       (page generated 2021-04-05 23:00 UTC)