Epistolution Musing №10: Shannonization

Charlie Munford
Feb 14, 2024

Dear Friends,

This letter is part of a weekly series of brief thoughts I would like to share with you, either because I’ve come across your related work in biology or because you’re a person I like. I discovered an interesting problem in 2019, a problem I can’t forget. Epistolution is the unknown biological mechanism that purposefully activates genetic influences and applies them to problems. Clear examples of epistolution include embryonic development, wound healing, regeneration, cancer, learning, memory, creativity, swarm intelligence, epigenetic inheritance, and the placebo effect.

Recap: Having decided that cells are aiming not at S/R but at knowledge, we are now looking at cellular knowledge from different perspectives. I call this the “Five Faces.” Earlier we covered how causal suspense can provide the motivational logic that makes us curious to interact with certain parts of the world more than others. Then we discussed how Karl Popper proved that (contrary to the theory of intelligence now ubiquitous in the tech industry) learning logically cannot be built up by conditioning on statistical examples, but has to be developed by a physical process that anticipates causation in a spontaneous way, in the form of an explanation. Now we continue to another deep thinker, Claude Shannon, who created a theory that simultaneously allowed both technical development and a new theoretical confusion about information to flourish.

My five observations about cellular knowledge are that it is:

1. Causal

2. Conjectural

3. Information-binding

4. Open-ended

5. Anti-entropic

This week let’s look at information.

Writers and intellectuals in our knowledge economy spend a lot of time nowadays talking about information and its vital importance. But “information” is an ambiguous word, and, like that of some other popular terms, its meaning has been changed without a public announcement. I’m including this essay on digitization here because I’ve noticed people in biological circles working with a definition of knowledge that is something like “information shaped by natural selection.” These people see knowledge as a consequence, rather than a cause, of information. My point of view is exactly the reverse. I believe it requires epistolutionary knowledge to bind information and give it meaning.

We can all thank Claude Shannon for this confusion. In 1948, while working at Bell Labs, Shannon published a paper called “A Mathematical Theory of Communication.” The paper was foundational to the way we think about information these days, and it is worth quoting an early passage that clarifies Shannon’s innovation:

The fundamental problem of communication is that of reproducing at one point either exactly or approximately a message selected at another point. Frequently the messages have meaning; that is they refer to or are correlated according to some system with certain physical or conceptual entities. These semantic aspects of communication are irrelevant to the engineering problem. The significant aspect is that the actual message is one selected from a set of possible messages.

Shannon had discovered that what he called “information” was a mathematically encodable property, one fully separable from the thorny biological issue of what inside organisms creates meaning and how a subjective understanding develops. For Shannon, trying as he was to reduce static and illegibility in long-distance communication, what really mattered was the abstract property of the symbols themselves: they could be counted, their positions relative to one another defined, and their places held by any arbitrary item in a transmission system. This meant that without deciding whether a transmission meant anything, one could write a mathematical formula describing how much “information” the message contained. The amount of this Shannonist “information” was just, as the passage above describes, the degrees of freedom afforded by the specification of one symbol in each location rather than another. For example, a scientific paper might be written in a Word document that comprises 17 KB of data. Each byte is composed of 8 bits, or binary positions, in the file. This means that in each of 17,000 × 8, or 136,000, positions, a 1 or a 0 is present in the message that could have been otherwise. All of this is entirely independent of whether or not the paper means a damn thing when you read it.
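To make that counting concrete, here is a minimal Python sketch of the tally just described. The 17 KB figure is only the hypothetical Word document from the example above, and the function name is mine, not anything of Shannon’s.

```python
def count_bit_positions(size_bytes: int) -> int:
    """Each byte holds 8 binary positions, each of which could have been
    a 0 or a 1. The count says nothing about what the file means."""
    return size_bytes * 8

# The hypothetical 17 KB scientific paper:
print(count_bit_positions(17_000))  # 136000 positions
```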

Thus began a long and confusing moment in history when the commonsense meaning of the word “information,” which was always and is still “meaningful content,” was conflated with and obscured by the Shannonist meaning, which is something more like “mathematical degrees of freedom in a positional transmission.” Of course meaningful content can arise without any sort of Shannonizing encodification of anything, as it might from the odor and taste of a madeleine dipped in tea on a spring day in a salon in France, as in the famous scene from Marcel Proust’s À la recherche du temps perdu. And a message, paradoxically, can be entirely devoid of meaningful content while still possessing significant Shannon information. There is no difference, in Shannon’s terms, between the message CAKEWALK and the message RAKEWALK. As Shannon articulated, the semantic dimension is irrelevant to the engineering problem of communication.
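To illustrate, here is a small sketch under the simplifying assumption that every 8-letter message over a 26-letter alphabet is equally likely: in Shannon’s accounting the two messages are indistinguishable.

```python
import math

def shannon_bits(message: str, alphabet_size: int = 26) -> float:
    """Bits needed to pick this message out of all equally likely messages
    of the same length over the given alphabet. Meaning plays no role."""
    return len(message) * math.log2(alphabet_size)

print(shannon_bits("CAKEWALK"))  # ~37.6 bits
print(shannon_bits("RAKEWALK"))  # ~37.6 bits -- identical in Shannon's terms
```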

Today writers not only use these two very different meanings of “information” interchangeably without distinguishing clearly which one they mean, but often construct sentences so that they mean both at one and the same time. When someone writes that there is “information” in the genetic code, there is no way to tell whether they mean that each position in the code contains an A, G, T, or C, i.e. that there is Shannon information encoded in a quaternary scheme within it, or whether they mean that the sequence of nucleotides specifies something meaningful, like the design of an RNA that is translated into a functional protein. They mean both things, without realizing that not all that is “Shannonized” is meaningful, nor is all that is meaningful necessarily “Shannonized.”
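The purely Shannonist reading of a nucleotide sequence can be sketched in Python, under the assumption that each site independently holds one of four equally likely symbols; the function name is illustrative only.

```python
import math

def nucleotide_bits(sequence: str) -> float:
    """Shannon information in a DNA string read as a quaternary code:
    each site holds one of four symbols (A, G, T, C), i.e. log2(4) = 2 bits,
    whether or not the sequence specifies anything functional."""
    assert set(sequence) <= {"A", "G", "T", "C"}
    return len(sequence) * math.log2(4)

print(nucleotide_bits("AGTTTCAGAC"))  # 20.0 bits, meaningful or not
```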

What I’m calling “Shannonization” is properly known as digitization. A digitized message is composed of templates. If a transmission is to be made reliably and precisely, all of the ambiguity in the process needs to be stripped away. Shannon discovered that if each message is “selected from a set of possible messages,” then the message becomes a code, and a copy of the same coded message can be sent via another route to the recipient. When the recipient receives the copy, they already know how many positions must be occupied by symbols and how many possible symbols can go in each position. This lets the recipient compare the message and the copy for accuracy. With this codification, the whole process of communication can be made entirely mechanical, and near-perfect fidelity can be enforced. This is now the basis for the Internet Protocol (IP) system.
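The mechanical comparison step can be sketched with a toy example. Real network protocols use checksums and cyclic redundancy checks rather than a literal second copy, so treat the SHA-256 digest below as a stand-in for whatever pre-agreed code the two ends share, not as the actual IP mechanism.

```python
import hashlib

def package(message: bytes) -> tuple[bytes, str]:
    """Send the message together with a short digest computed from it."""
    return message, hashlib.sha256(message).hexdigest()

def verify(message: bytes, digest: str) -> bool:
    """The recipient recomputes the digest and compares. Because both ends
    agree on the code in advance, the check is purely mechanical."""
    return hashlib.sha256(message).hexdigest() == digest

msg, tag = package(b"CAKEWALK")
print(verify(msg, tag))          # True: faithful transmission
print(verify(b"RAKEWALK", tag))  # False: a one-symbol corruption is caught
```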

Digitization allows outrageously long messages to be faithfully transcribed and reproduced over long distances without much static or error. Combined with electricity over wires and satellite-based wave-transmission at the speed of light, this protocol has resulted in the fantastical digital world of lightning-paced communication that we live in today. It also allowed vast amounts of data to be saved in smaller and smaller physical packages, in hard drives and on microprocessors. This has resulted in the digitization of things which were heretofore analog, like paintings, songs, books, and films.

All this breathtaking new tech in our world makes it quite easy to forget that none of this digitization of everything has gotten us one whit closer to understanding the production of meaning, which was and will always be the true point of all communication. What is it about a life form that allows specified, encoded messages to bestow knowledge on the recipient? We simply don’t know. One thing we do know is that it is the organism that translates the context in which the message occurs into a fruitful exchange of symbols. If the recipient is waiting on Earth to hear about the outcome of a difficult mission on Mars, the message CAKEWALK will come as welcome and meaningful good news. If a message arrives instead that reads RAKEWALK, quite another set of ideas might arise in the minds back at mission control, ideas that are less clear and instructive. What gives these messages their meaning?

The communication problem is nearly identical in the case of the “code” of the genetic nucleotides. The specification of a protein critical to life can be a very meaningful message, a form of good news for life, you might say, but only if the circumstances are such that this particular protein is in high demand in the cell at that moment. If the protein specified by AGTTTCAGAC… is unnecessary, or already present in toxic quantities, the message arrives in a condition as pointless and ambiguous as RAKEWALK. Something about the mechanical process taking place in the cell makes the code meaningful. It doesn’t possess this quality inherently by virtue of its being digitized “information.” In order to have any biological salience, meaning must be bestowed upon the digitized gene content by epistolution.

In both cells and computers, errors are fatal. Both the “digitized” genetic code and digitized Shannon information are structured as they are for the purpose of incredibly detailed and faithful error correction. Near-perfect fidelity over an incredibly long series of nucleotide messages is required for the heritable material in a living cell to remain meaningful as it is copied into each new generation of daughter cells. Without enormous fidelity the templates in DNA would not make functional molecules of RNA or usable proteins. The same rigorous fidelity has been developed in computer processing architecture for the same purpose. Neither of these processes answers the fundamental question about semantics; they both presuppose a meaning-maker at each end of the message transmission. Life is required both to bind information into a digitized code for communication, and to unbind it into actionable intelligence when it arrives. Just how this binding achieves a match between the message content and the contexts in which the message is sent and received is still a mystery. The mystery will be resolved with the discovery of the epistolution mechanism.
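As a crude analogy for fidelity through redundancy, here is a majority-vote repair over several copies of a digitized message. It is a toy only; neither DNA proofreading nor hardware error correction works this simply.

```python
from collections import Counter

def majority_vote(copies: list[str]) -> str:
    """Repair a message position by position by majority vote across
    redundant copies -- a toy stand-in for real error correction."""
    length = len(copies[0])
    return "".join(
        Counter(copy[i] for copy in copies).most_common(1)[0][0]
        for i in range(length)
    )

# One copy is corrupted in transit; redundancy restores fidelity.
print(majority_vote(["CAKEWALK", "CAKEWALK", "RAKEWALK"]))  # CAKEWALK
```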

In the next section we look at Jakob von Uexküll and his theory of the subjective worlds of animals. With the idea of the umwelt, we try to fully absorb the open-ended nature of biological knowledge.

Be Kind, and Be Brave,

Love, Charlie


Charlie Munford

Charlie Munford is a writer based in New Orleans who explores the meaning of living systems and the boundaries of our ecological knowledge.