Predictive Processing is not Cognition

Charlie Munford
10 min read · Oct 9, 2024


At my organization, TalkingOctopus, we are looking for the algorithm of intelligence that animates every living cell. This undiscovered algorithm allows the cell to operate its heritable material as a tool for uncovering knowledge. To understand this project, you have to look at life as a learning-first process. This learning-first model of life has many interesting consequences, but in this letter I want to focus on the ones that relate to prediction as an epistemological aim. I think predictive processing as a theory is the result of a serious lack of imagination and practicality in the fields of neuroscience and computer science. Below I'll try to show why I have this intuition.

First, let me clarify my view of the cognition problem. The fundamental insight of this new version of life is that heritable material doesn't interpret itself; it must be acted upon by intelligence in order to attain any function. This must be true because natural selection happens at the wrong level to do the job. The best example is developing cell lines in an embryo: they share the same heritable materials because they arose from a single cell, so those heritable materials cannot logically be the basis for their choices about how they develop with respect to one another. The aims of survival and reproduction can only be communicated to the embryo through heritable materials, but those materials cannot discriminate between cellular choices. This means that the only influence heritable materials can have on the embryo is to limit the choices that all the cells can make collectively. Natural selection can influence the domain within which all the cells in the body make their decisions, but it cannot influence those decisions and fates individually.

This basic insight rules out Neo-Darwinism, the idea that genes code directly for functions. It also raises a profound problem in explaining why the second law of thermodynamics doesn't disrupt function. Natural selection happens at the level of the genome and the population, so natural selection cannot touch cognition directly; cognition is an individual-level process. This guarantees that at least part of the entropy problem is solved by an individual-level process. Natural selection can only reduce the likelihood of outcomes of a learning process that are incompatible with life, by selecting genetic material over generations in the population. It cannot control the actual outcomes that come about at the level of the individual. That is why we are searching for a new fundamental process that generates intelligence: the outcomes must be controlled and harmonized by some process at the level of the individual.

The way around this that has been imagined by some theorists who swim in our pond is the idea of constraint closure. In this version of biology, not only genes but all the parts of the organism have been selected on the basis of their ability to influence and constrain the other parts of the organism. In this version, the causal stream that an organism follows is like a river in which all the water molecules have been naturally selected to flock together like birds, obviating the need for riverbanks. The metaphorical riverbanks are, of course, the constraints that keep entropy at bay and allow the organism to remain organized and alive.

This idea fails because it is even worse than Neo-Darwinism at explaining how entropy is purged from life. It is really just an expansion of the definition of the "gene" to include the other parts of the organism, including the cell membrane, cytoplasm, and organelles of the germline cell (and even the somatic cells). These parts are not replicated digitally by an error-correcting process as DNA is; they reproduce by self-templating. This means that, absent some other mechanism for correction, any errors would be propagated, and whatever function the parts had would quickly be lost over a few generations. In other words, DNA is the only part for which there is a strong case for constraint closure; all the other parts either don't get inherited, or they get inherited in a non-error-corrected way.

This is why we are looking for a totally independent mechanism for cognition, such as harmonic resonance or the like. We are looking for something that is not configured in relation to the history of inheritance but arises spontaneously because of the type of physical system an organism is at a basic level. This is the important question, and it is why I gave the process a new name, calling it epistolution rather than adaptation or cognition.

I should add what I think DNA actually is. It is the most rigid of the living system's memories, the part that limits how flexible the process of discovery can be. All learning systems are constrained by what they have already learned, by their memories. DNA is purely restrictive in its actions, preventing an organism (in a statistically averaged way) from learning in ways that result in its destruction. Since a genome is common to all lineages in a multicellular organism, it can't determine the specific actions of any of the collective parts, but it can determine in what domain the entire collective system operates by setting its startup memory. If that sounds a little abstract, that's because it is.

At the level of the individual, survival and reproduction cannot exist as goals. Neither the chances of death nor the prospects of reproduction are directly cognizable by the individual, because they are not in an individual's domain. An individual only lives once; it cannot learn from death. At a very high level of intelligence, organisms like us may have a vague, abstract, very weak attachment to these aims, but only by imagining them in the abstract, a task that probably doesn't extend very far down the tree of life. In other words, individual learning happens within boundaries set by these population-level natural-selective pressures, but the organism does not receive any signal from them while it is learning.

This means the most basic test for whether we have the right mechanism for cognition is whether it aims at survival and reproduction. If it does, it is certainly the wrong mechanism. If it does not, it could possibly be a candidate.

I suspect that predictive processing was devised as an explanation for cognition because scientists needed an aim for cognition that met exactly the opposite test. They were looking for a mechanism that could possibly embody, and be identical with, the aim of survival and reproduction, because they assumed that the basic self-organization of an organism is generated by genetic inheritance. This is grounds for deep suspicion. Nevertheless, it is possible to see past this objection and to imagine that prediction is not a result of inheritance-derived incentives. To steel-man the argument, I will assume this: let's say that prediction was a separate, prior mechanism that spontaneously organized the first selves and caused inheritance to begin.

Let's use a general definition from Christopher J. Whyte (Consciousness and Cognition, 2019): "predictive processing aims to subsume perception, action and cognition under one computational umbrella: a Bayes approximate process of prediction error minimisation. According to this view, what the brain ultimately does is minimize the difference between predictions and sensory signals."
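To make the claim concrete, here is a minimal sketch of what prediction error minimisation amounts to computationally. This is my own toy illustration, not anything from Whyte's paper: the generative mapping, the learning rate, and the numbers are all invented for the example.

```python
import numpy as np

# Toy sketch of prediction error minimisation (my illustration, not
# Whyte's): an agent holds a belief `mu` about a hidden cause, predicts
# the sensory signal that cause should produce, and nudges the belief
# to shrink the gap between prediction and sensation.

def g(mu):
    """Hypothetical generative mapping: belief -> predicted sensation."""
    return 2.0 * mu  # assume the sensation is twice the hidden cause

def update_belief(mu, sensation, lr=0.05):
    prediction_error = sensation - g(mu)
    # Gradient step on the squared error 0.5 * prediction_error**2;
    # the factor 2.0 is dg/dmu for the mapping above.
    return mu + lr * prediction_error * 2.0

rng = np.random.default_rng(0)
mu, true_cause = 0.0, 3.0
for _ in range(200):
    sensation = g(true_cause) + rng.normal(0.0, 0.1)  # noisy input
    mu = update_belief(mu, sensation)

print(round(mu, 2))  # settles near 3.0: prediction error is minimised
```

On this picture, everything the organism does, perceiving, acting, even regulating itself, is some version of that update loop.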

The first objection to this scheme is that no organism actually seeks out stable, unchanging conditions where these differences would be minimized. This is the "dark room" problem: why don't we just go into a dark room where predicted conditions are always met and confirmed? Some defenders of PP have argued that the dark room problem might be solved by incorporating the internal states of the organism into the model. How so? If prediction, action, and cognition are one, then the internal states are likewise embodied actions aimed at predicted outcomes. An organism should then be modifying its internal function in such a way as to make that function as predictable as possible, creating not only a motivation to seek out a dark room in the external environment but also to create a figurative dark room within itself. That hardly matches observed results. If this were true, we would not only be sitting in a dark room; we would also be sipping liquid media out of a straw at a constant rate, breathing shallowly, and defecating a little with each breath.

An even more serious problem is that we do not think in data but in stories. Our minds create causal entities and forces that interact in our imaginations, and then we compare that drama to what actually happens in our perceptions. This is why optical illusions, behavioral economics, and all sorts of cognitive fallacies arise. How do we create the entities that populate our mental stories? If it were by Bayesian probabilities, these entities could only do things we had actually observed; in fact, they could only do average versions of things we had observed many, many times.
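A toy calculation makes the "average versions" point vivid. In the simplest conjugate Bayesian setup (the formula is standard; the numbers are my invention), the learner's best guess about the next observation is just a smoothed average of everything it has already seen; nothing genuinely novel can come out.

```python
import numpy as np

# Conjugate Gaussian updating (a textbook formula, toy numbers mine):
# the posterior-predictive mean is a precision-weighted blend of the
# prior mean and the average of past observations.
observations = np.array([4.9, 5.1, 5.0, 4.8, 5.2])  # hypothetical data
prior_mean, prior_strength = 0.0, 1.0               # assumed prior

n = len(observations)
posterior_mean = (prior_strength * prior_mean + n * observations.mean()) \
                 / (prior_strength + n)
print(round(posterior_mean, 2))  # ~4.17: pulled toward what was seen
```

The machinery can interpolate within experience; it has no way to posit an entity that behaves unlike anything in the data.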

To get anything like intelligence, LLMs have had to consume basically the whole internet. Instead of massive online data, we operate with super-sparse, idiosyncratic, unlabeled training data, data that is corrupted by our own intervention in it, and still we create very quick metaphorical associations that transcend our local experiences. My stepson, after seeing only plastic toy cows in his living room, was able to spontaneously look out of a moving car window one day at a distant field, point his finger, and say, "Moo."

Bayesian systems do not test and revise causal stories with their embodied interactions. They cannot, because they are oriented to expect the future to be in every way a more routinized extension of the past. Because of the causal force of the entities in our minds, we expect the future to be quite unlike the past. We expect and imagine events like solar eclipses or our own deaths, events we could never have experienced and yet hold firm beliefs about.

This makes me think that the creators of the PP hypothesis really just did not imagine the problem of creating meaning. They presupposed that data could arrive in a meaningful form (and that computers could think) without doing the hard intellectual work of trying to make the idea work. They did not grapple with the problem of make-believe. If I take PP seriously, then I must expect all the behaviors of an organism to be aimed at achieving a match between expectation and results. This means that young organisms, whose expectations are less reliable, should be more robotic and routinized in their explorations of reality, since they have less accurate maps of it. They should be highly risk-averse, deviating only in small degrees from what they have done or seen before. PP expects a young organism to be more prone to rigid routine and also to be a resource hoarder, both of which are antithetical to the freewheeling, imaginative play of children and young animals.

What would a PP version of language look like? Instead of discrete, manipulable symbols that convey causal stories, language in PP would be an infinite game of auto-complete. There would be no meaningful disputes or disagreements or theories, because the symbolic domain would only be an empty statistical artifact of the solipsistic sense-data experiences of one individual. We would have no ability to influence one another semantically, to change each other's conceptions of reality. We could not become an instructive part of one another's lives in any role other than the rather neutered one of serving as one among many required examples of a repeating stimulus to be anticipated. A fairy tale, in the PP world, would have no moral at the end; it could only convey the meaning of the experience in which it had been heard, not the experience represented and imagined in the fictive world it created.

As you have, I think, already intuited, there is a fundamental epistemological problem here. According to David Deutsch and the critical rationalists, knowledge consists of good explanations. In their view, the aim of natural learning in humans is neither the association of one event with another nor the prediction of the future, but the explanation of the past. Knowledge about the past allows for anticipation of events in the future that are quite unlike anything that has ever occurred before. This is very different from statistical association and prediction.

Nothing about the past logically entails that the future will resemble it. This is the "problem of induction," and it was discovered by David Hume. Hume argued that it is illogical to assume that events in the past allow prediction of the future. The famous example is that by examining the swans of Europe, one might have developed the notion that all swans were white. Nothing within this experience would have prepared Europeans for the discovery of black swans in Australia. Indeed, if generalizing from past examples were the real basis of their knowledge of birds, it would have been impossible for them to recognize that a black bird could even be a swan.
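A quick numeric aside shows how an inductive learner fares here. Under Laplace's rule of succession (a standard Bayesian result; the sighting counts are my invention), a learner that has seen only white swans assigns a vanishing probability to the next swan being black, and, worse, "black swan" is barely a hypothesis it can represent at all beyond a single colour parameter.

```python
# Laplace's rule of succession: after n white swans and 0 black ones,
# the predictive probability that the next swan is black is 1 / (n + 2).
for n_white in (10, 100, 10_000):
    p_black = 1 / (n_white + 2)
    print(f"{n_white:>6} white swans seen -> P(next is black) = {p_black:.5f}")
```

The probability dwindles toward zero with every confirming instance, which is exactly the opposite of being prepared for the black swan.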

This problem was a major obstacle in the philosophy of science until Karl Popper developed his theory of knowledge in the 20th century. Popper suggested that knowledge is formed not by logical induction but by a process of conjecture and refutation. The truth can never be finally established, but good explanations can be conjectured and then subjected to reasoned discussion and empirical testing, so that flawed ones are logically eliminated. In this way science can approach the truth gradually, without ever arriving at a final destination. A good explanation is parsimonious, logical, informative, and testable, a set of requirements that David Deutsch has summarized as "hard to vary."
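For contrast with the inductive learner above, here is a schematic sketch of the conjecture-and-refutation loop. It is my own construction, not anything from Popper; the hypotheses and observations are invented for the example.

```python
# Conjecture and refutation as hypothesis elimination (my sketch):
# candidate explanations are held tentatively and dropped the moment
# an observation refutes them. Survivors are never proven, only
# not-yet-refuted.

def refute(hypotheses, observation):
    """Keep only the hypotheses consistent with the observation."""
    return [h for h in hypotheses if h["allows"](observation)]

hypotheses = [
    {"name": "all swans are white",
     "allows": lambda colour: colour == "white"},
    {"name": "swans may be white or black",
     "allows": lambda colour: colour in ("white", "black")},
]

for sighting in ["white", "white", "black"]:  # a black swan appears
    hypotheses = refute(hypotheses, sighting)

print([h["name"] for h in hypotheses])  # only the bolder conjecture survives
```

Notice that the surviving explanation was conjectured in advance of any black sighting; elimination, not accumulation, did the epistemic work.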

These are some of the reasons why I feel we need to explore mechanisms mostly unrelated to the concept of PP. Explanation is not prediction. You can probably surmise from the thoughts above why I have been so interested in Steven Lehar's harmonic gestalt. Harmonic gestalt looks to me like a bridge from perception to understanding. It looks like a mechanism that extends a percept into an entity that can be reified, rotated, and extended in an abstract way, exactly as the entities and forces in our causal stories of reality are reified, rotated, and extended. It looks like a physical mechanism for building thoughts, not just predictive data structures. I won't go into all the congruent concepts in this letter, but the possibility that we might find a unified mechanism for transforming perception into abstract explanation is the hope that has kept me laser-focused on this problem for six years. If we did, it would change everything.


Written by Charlie Munford

Charlie Munford is a writer based in New Orleans who explores the meaning of living systems and the boundaries of our ecological knowledge.
