Abstract
Shannon's channel coding theorem (1948), a major result of information theory, paradoxically states that errorless communication is possible over an unreliable channel. Since then, engineers have developed many error-correcting codes and decoding algorithms. Performance close to the predicted limit was eventually achieved only in the early nineties. Many communication facilities would not exist without error-correcting codes, e.g., mobile telephony and terrestrial digital television. This article first explains how they work, without mathematical formalism. An error-correcting code is a minority subset of some set of messages. Within this subset, the messages are sufficiently different from each other to be exactly identified even if a number of their symbols, up to a certain limit, are changed. Beyond this limit, another message may be erroneously identified. An error-correcting code can thus be interpreted as a set of messages subjected to constraints which make their symbols mutually dependent. Although mathematical constraints are conveniently used in engineering, constraints of any other kind, possibly of natural origin, can generate error-correcting codes. Biologists implicitly assume that genomes have been conserved through geological ages, without realizing that this is impossible without error-correcting means. Symbol errors occur during replication of a genome; chemical reactions and radiation are other sources of errors. Their number increases with time in the absence of correction. A genomic code will exactly regenerate the genome provided its decoding is attempted after a short enough time interval. If the number of errors is too large, however, the decoded genome will differ from the initial one and a mutation will occur. Periodic decoding attempts, if frequent enough, will thus conserve a genome except for very infrequent mutations.
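As a minimal sketch of the mechanism described above (not part of the article), a triple-repetition code shows all three points: errors up to a limit are corrected exactly, too many errors decode to a different valid message (the analogue of a mutation), and periodic decode-and-re-encode cycles conserve the message indefinitely despite a steady trickle of errors. The code and names here are illustrative assumptions, not the article's construction.

```python
import random

def encode(bits):
    # Triple-repetition code: each bit is transmitted three times.
    return [b for b in bits for _ in range(3)]

def decode(codeword):
    # Majority vote per 3-symbol block: corrects any single error per block.
    return [1 if sum(codeword[i:i + 3]) >= 2 else 0
            for i in range(0, len(codeword), 3)]

def add_errors(codeword, n, rng):
    # Flip n distinct symbols of the codeword.
    word = list(codeword)
    for i in rng.sample(range(len(word)), n):
        word[i] ^= 1
    return word

rng = random.Random(0)
message = [1, 0, 1, 1]
cw = encode(message)

# A single error is always within the correction limit...
assert decode(add_errors(cw, 1, rng)) == message

# ...but beyond the limit, decoding yields a *different* message,
# i.e., a mutation: two errors in the same block defeat the vote.
corrupted = list(cw)
corrupted[0] ^= 1
corrupted[1] ^= 1
assert decode(corrupted) != message

# Periodic regeneration: if decoding is attempted before errors
# accumulate past the limit, the message is conserved indefinitely.
word = list(cw)
for epoch in range(100):
    word = add_errors(word, 1, rng)   # one new error per interval
    word = encode(decode(word))       # decode and re-encode regenerates
assert decode(word) == message
```

If the regeneration interval were long enough for two errors to land in one block, the final assertion could fail: conservation depends on decoding often enough, exactly as the abstract argues.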
The better conservation of very ancient parts of genomes, like the HOX genes, cannot be explained unless one assumes that a genomic error-correcting code resulting from stepwise encoding exists: a first encoding was later followed by a second one in which new information and the result of the first encoding were jointly encoded, and this process was repeated several times, eventually resulting in an overall code made of nested components in which the older a piece of information is, the better it is protected. Organic codes in Barbieri's sense result from the same process and have the same structure. Any new organic code induces new genomic constraints, hence new components in a nested system of codes. Organic codes may thus be identified with the system of nested error-correcting codes needed to conserve the genetic information. A majority of biologists deny that information theory can be useful to them. It is shown on the contrary that the living world cannot be understood if the scientific concept of information is ignored. Heredity makes the present communicate with the past, and as a communication process it is relevant to information theory, which is thus a necessary basis of biology besides physics and chemistry. The nested genomic error-correcting codes needed to conserve the genetic information account for the hierarchical taxonomy which structures the living world. Moreover, the main features of biological evolution, including its trend towards increasing complexity, find an explanation within this framework. Incorporating the scientific concept of information, and the science based on it, into the foundations of biology could profoundly renew the discipline, but meets epistemological difficulties which must be overcome.
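The nested structure of stepwise encoding can be sketched with the same toy repetition code (an illustrative assumption, not the article's construction): the first information is encoded, then each new piece of information is jointly encoded together with the previous result, so the oldest information ends up wrapped inside every later layer and accumulates the most redundancy.

```python
def encode(bits):
    # Triple-repetition stand-in for one error-correcting encoding layer.
    return [b for b in bits for _ in range(3)]

def nested_encode(layers):
    # Stepwise encoding: encode the first (oldest) information, then
    # jointly encode each newer information with the previous result.
    word = encode(layers[0])
    for info in layers[1:]:
        word = encode(info + word)
    return word

old, mid, new = [1, 0], [1], [0]
genome = nested_encode([old, mid, new])

# Each symbol of `old` passed through three encoding layers and is now
# repeated 3 * 3 * 3 = 27 times; symbols of `mid` 9 times; symbols of
# `new` only 3 times. Total length: 3 * (1 + 3 * (1 + 3 * 2)) = 66.
assert len(genome) == 66
```

The older the information, the more layers protect it, which is the structural reason the abstract gives for the exceptional conservation of very ancient genomic regions.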