I’m sick to death of people claiming ridiculous amounts of information in genomes. Pundits with an axe to grind against materialism like to liken the information in a simple cell to the Encyclopaedia Britannica – all of it. But we’ve actually measured the information in microscopic replicating biosystems – viruses, archaea and bacteria – so we have some guide to what’s needed. Biosystems are made of both simple and incredibly complex molecules – and the complex ones, proteins, are encoded as a sequence of DNA. In microbes the DNA is usually one huge loop – a ring – whose genes are copied into messenger RNA, which tiny molecular machines called ribosomes read and translate into proteins. The information itself is the order of amino acids – twenty specific ones in most lifeforms – which make up the protein. Once the amino acids are joined into a string by chemical bonds, the newly made protein folds up into a shape that lets it do its specific chemical task. For many proteins a large fraction of the amino acid sequence can be changed with no change in function or form – most of the action happens in a few small regions. Some protein machines do the same job throughout the biological world – cytochrome c for example – but exist in a HUGE variety, with widely differing sequences of amino acids.
So how much information will let a biosystem self-replicate? Viruses don’t self-replicate, though some contain more DNA than the simplest bacteria. But viruses do self-assemble after their proteins have been made by a host cell’s ribosomes. In fact many intricate protein machines – of which viruses are just one example – self-assemble from their component proteins, without any apparent molecular “master-builder”. Instead, as the proteins jostle around inside their host cell, their specific, complementary binding surfaces find each other and link up. Brief interactions between the proteins and other unrelated proteins might occur, but in the constant jostling only a proper fit will stick the two together firmly. It’s crowded and busy inside even the simplest cells.
The simplest self-replicating biosystem known is a parasite called Nanoarchaeum equitans, an archaeon that lives attached to another archaeon, with a DNA string 490,885 base-pairs long. A base-pair is the minimal unit of DNA information, which can have four different values (equivalent to 2 bits of computer-style information.) There are 3 bases per codon in DNA’s “language” so the Nanoarchaeum genome is about 163,000 codons long. A codon is 6 bits. Thus the simplest self-replicator is the equivalent of about 122 kilobytes (1 byte = 8 bits.) There are about 10,000 symbols (including spaces) per page of the Encyclopaedia Britannica – I counted it out of curiosity one day. Each symbol of print is roughly a byte. Thus Nanoarchaeum needs just 12 pages of Britannica to encode its genes.
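For anyone who wants to check my sums, here is a minimal sketch of the arithmetic in Python. The function and constant names are my own inventions for illustration; the 10,000 symbols per page is just my rough count from above, and one byte per printed symbol is the same approximation.

```python
# Rough size of the Nanoarchaeum equitans genome in "Britannica pages".
GENOME_BASE_PAIRS = 490_885   # genome length quoted in the text
BITS_PER_BASE = 2             # four possible bases -> 2 bits each
BASES_PER_CODON = 3
BITS_PER_BYTE = 8
SYMBOLS_PER_PAGE = 10_000     # assumption: my rough count, 1 byte per symbol


def genome_size(base_pairs):
    """Return (whole codons, bits, bytes, Britannica pages) for a genome."""
    codons = base_pairs // BASES_PER_CODON   # leftover bases are ignored
    bits = base_pairs * BITS_PER_BASE
    n_bytes = bits / BITS_PER_BYTE
    pages = n_bytes / SYMBOLS_PER_PAGE
    return codons, bits, n_bytes, pages


codons, bits, n_bytes, pages = genome_size(GENOME_BASE_PAIRS)
print(f"{codons:,} codons, {bits:,} bits, "
      f"{n_bytes / 1000:.1f} kilobytes, {pages:.1f} pages")
```

Run as is, this gives a little under 123 kilobytes and a bit over 12 pages, which is where the figures above come from.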
Taking into account the redundancy of the DNA codon code (64 possible codons spell out only 20 amino acids) and the roughly 50%-80% redundancy of amino acid sequences themselves, that means roughly 34 kilobytes (at 50% redundancy) down to 13.5 kilobytes (at 80%) of information will code a self-replicating DNA-based cell. Just 3 pages of Britannica, or perhaps only 1.
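One way to sketch that kind of discounting in Python is below. Be warned that this is an illustration under my own stated assumptions, not a reconstruction of exactly how the figures above were arrived at: I assume each codon carries log2(20) bits (about 4.3 bits, one choice among 20 amino acids, ignoring stop codons), and I treat the sequence redundancy as a simple fraction to throw away. The function name and parameters are hypothetical.

```python
import math


def discounted_kilobytes(base_pairs, sequence_redundancy, amino_acids=20):
    """Estimate genome information after discounting two redundancies.

    Assumptions (mine, for illustration only):
      * each codon carries log2(amino_acids) bits rather than 6
      * sequence_redundancy is the fraction of the amino acid
        sequence that carries no essential information
    """
    codons = base_pairs // 3
    bits_per_codon = math.log2(amino_acids)
    essential_bits = codons * bits_per_codon * (1 - sequence_redundancy)
    return essential_bits / 8 / 1000   # bits -> bytes -> kilobytes


for redundancy in (0.5, 0.8):
    kb = discounted_kilobytes(490_885, redundancy)
    print(f"{redundancy:.0%} redundant: about {kb:.1f} kilobytes")
```

With these particular assumptions the range comes out at roughly 18 to 44 kilobytes, the same order of magnitude as the figures in the text. The exact numbers depend on how you discount the codon redundancy, but either way it is a handful of Britannica pages, not the whole set.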
That’s still a lot of information to “just happen”, but in our ignorance of proto-biochemistry we might be missing the key element that simplifies matters even further.
Weirder things might be needed. Physicist Paul Davies has speculated that backwards causation might cause the past to be at least partially determined by the future – thus biochemistry was arranged to be consistent with the existence of Life by the (future) observation that Life exists. Else there would be no observation for that “consistent history” to ever happen. This occurred not by design, as in engineering by an external god, but by an inner mathematical consistency that insists the Universe is observed and thus observers should exist.
The self-referential nature of that idea gives me a headache, but check out Davies’ “The Goldilocks Enigma” (called “Cosmic Jackpot” in the USA) for a fuller discussion. There’s a whole barrel of mysteries as to how proteins do what they do, and we might find some pretty wild quantum effects are necessary for life itself.
Ask yourself: if you were a god how would you do it? Can Life be designed?