Abstract

Biological information is stored in DNA, RNA and protein sequences, which can be understood as genotypes that are translated into phenotypes. The properties of genotype–phenotype (GP) maps have been studied in great detail for RNA secondary structure. These include a highly biased distribution of genotypes per phenotype, negative correlation of genotypic robustness and evolvability, positive correlation of phenotypic robustness and evolvability, shape-space covering, and a roughly logarithmic scaling of phenotypic robustness with phenotypic frequency. More recently, similar properties have been discovered in other GP maps, suggesting that they may be fundamental to biological GP maps in general, rather than specific to the RNA secondary structure map. Here we propose that the above properties arise from the fundamental organization of biological information into ‘constrained’ and ‘unconstrained’ sequences, in the broadest possible sense. By ‘constrained’ we mean sequences that affect the phenotype more immediately, and are therefore more sensitive to mutations, such as protein-coding DNA or the stems in RNA secondary structure. ‘Unconstrained’ sequences, on the other hand, can mutate more freely without affecting the phenotype, such as intronic or intergenic DNA or the loops in RNA secondary structure. To test our hypothesis we consider a highly simplified GP map that has genotypes with ‘coding’ and ‘non-coding’ parts. We term this the Fibonacci GP map, as it is equivalent to the Fibonacci code in information theory. Despite its simplicity, the Fibonacci GP map exhibits all the above properties of much more complex and biologically realistic GP maps. These properties are therefore likely to be fundamental to many biological GP maps.

Highlights

  • Biological evolution is characterized by the inheritance, mutation and translation of biological information

  • It has been demonstrated that the phenotypic robustness in biological genotype–phenotype (GP) maps is much higher than one would expect for randomly distributed phenotypes [17,18,19], and scales roughly logarithmically with frequency [5,7]

  • One of the first empirical GP maps was studied by constructing the genotype networks for the binding site repertoires of 193 transcription factors in yeast and mice [20]


Summary

Introduction

Biological evolution is characterized by the inheritance, mutation and translation of biological information. All the above observations have been made in RNA secondary structure, but it has recently been established that most of these properties can be found across different GP maps, such as the HP model of protein folding [6,7] (where a biased distribution of genotypes per phenotype has long been known to exist [8]) and the Polyomino model of biological self-assembly [7,9,10]. We consider here a simple model with a genotype that is divided into regions that code for a phenotype and regions that do not, and show that this model gives rise to all the properties observed in the RNA secondary structure GP map and other GP maps, as outlined above. This provides a strong argument that the fundamental organization of biological information into a series of constrained and unconstrained sequences has profound effects on the structure of biological GP maps, on the translation of genotypes into phenotypes, and on the course of biological evolution.

[Figure: schematic of a genotype showing the read direction, stop codons, and coding and non-coding regions; panel label: "these sequences represent distinct phenotypes".]
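The model described in the Introduction can be illustrated with a minimal Python sketch. This text does not spell out the construction, so the details here are assumptions made for illustration: a binary alphabet, "11" as the stop codon (by analogy with the Fibonacci code, in which every codeword ends in its only "11"), a phenotype defined as the prefix read up to and including the first stop codon, and no phenotype for genotypes that lack a stop codon. The function names are hypothetical and not taken from the paper.

```python
from collections import Counter, defaultdict
from itertools import product

STOP = "11"  # assumption: binary alphabet with "11" as the stop codon


def phenotype(genotype):
    """Coding prefix up to and including the first stop codon, else None."""
    idx = genotype.find(STOP)
    return None if idx == -1 else genotype[: idx + len(STOP)]


def genotype_counts(length):
    """Count how many genotypes of the given length map to each phenotype."""
    counts = Counter()
    for bits in product("01", repeat=length):
        p = phenotype("".join(bits))
        if p is not None:
            counts[p] += 1
    return counts


def phenotype_robustness(length):
    """Mean fraction of single-point mutations that leave the phenotype unchanged."""
    neutral = defaultdict(list)
    for bits in product("01", repeat=length):
        g = "".join(bits)
        p = phenotype(g)
        if p is None:
            continue  # assumption: genotypes without a stop codon are ignored
        same = 0
        for i in range(length):
            flipped = "1" if g[i] == "0" else "0"
            mutant = g[:i] + flipped + g[i + 1:]
            same += phenotype(mutant) == p
        neutral[p].append(same / length)
    return {p: sum(v) / len(v) for p, v in neutral.items()}


if __name__ == "__main__":
    L = 6
    counts = genotype_counts(L)
    robust = phenotype_robustness(L)
    # List phenotypes from most to least frequent with their robustness.
    for p, n in counts.most_common():
        print(p, n, round(robust[p], 3))
```

Under these assumptions, a phenotype whose coding region has length k is shared by 2^(L-k) genotypes of length L, because every bit after the stop codon is free to vary. The sketch therefore shows, in miniature, how an unconstrained region produces both a biased distribution of genotypes per phenotype and a robustness that grows with the logarithm of phenotype frequency.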

The Fibonacci GP map
Biased distribution of the number of genotypes per phenotype
Evolvability and robustness
Phenotype coverage
Robustness versus frequency
Discussion and conclusion
