Abstract

Progressive evolution, or the tendency towards increasing complexity, is a controversial issue in biology, whose resolution entails a proper measurement of complexity. Genomes are the best entities to address this challenge, as they encode the historical information of a species’ biotic and environmental interactions. As a case study, we have measured genome sequence complexity in the ancient phylum Cyanobacteria. To arrive at an appropriate measure of genome sequence complexity, we have chosen metrics that do not decipher biological functionality but that show strong phylogenetic signal. Using a ridge regression of those metrics against root-to-tip distance, we detected positive trends towards higher complexity in three of them. Lastly, we applied three standard tests to detect whether progressive evolution is passive or driven: the minimum, ancestor–descendant, and sub-clade tests. These results provide evidence for driven progressive evolution at the genome level in the phylum Cyanobacteria.
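The regression step mentioned in the abstract can be illustrated with a minimal sketch. It fits a plain single-predictor ridge regression of a complexity metric on root-to-tip distance and reads the sign of the slope as the direction of the trend. This is not the authors' pipeline: the function name `ridge_fit`, the penalty `lam` and the synthetic data are hypothetical, and the sketch ignores the phylogenetic covariance among species, which a real analysis would have to account for.

```python
import random


def ridge_fit(x, y, lam=1.0):
    """Closed-form ridge regression of y on a single predictor x.

    x and y are centred first so that the intercept is not penalised.
    Returns (intercept, slope); lam = 0 gives ordinary least squares.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    xc = [v - mean_x for v in x]
    yc = [v - mean_y for v in y]
    sxy = sum(a * b for a, b in zip(xc, yc))
    sxx = sum(a * a for a in xc)
    slope = sxy / (sxx + lam)  # the penalty shrinks the slope towards 0
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data: one metric value per genome, rising with distance.
rng = random.Random(0)
root_to_tip = [rng.uniform(0.1, 1.0) for _ in range(50)]
metric = [2.0 + 0.8 * d + rng.gauss(0.0, 0.05) for d in root_to_tip]

intercept, slope = ridge_fit(root_to_tip, metric, lam=0.1)
print(f"intercept = {intercept:.3f}, slope = {slope:.3f}")
```

A positive slope in this toy setting corresponds to the "positive trend towards higher complexity" described above; whether such a trend is passive or driven is a separate question addressed by the three tests.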

Highlights

  • Progressive evolution, or the tendency towards increasing complexity, is a controversial issue in biology, whose resolution entails a proper measurement of complexity

  • The most well-known are those involved in biological function, such as the typical genome division into coding and non-coding parts or the differential conservation shown by distinct codon positions due to the differential evolutionary constraints acting within genes[38,39,40]

  • We intend to capture or approximate the genome information held in these layers using certain metrics to determine whether they show phylogenetic signals and indicate some kind of evolutionary trend

Summary

Introduction

Progressive evolution, or the tendency towards increasing complexity, is a controversial issue in biology, whose resolution entails a proper measurement of complexity. Genomes probably provide the best record of the biological history of a species. Not only do they enable us to reconstruct phylogenetic relationships, but they also contain information gained from their continuous biotic and environmental interactions over time[6,7,8].

The first four metrics are based on the Sequence Compositional Complexity (SCC) derived from a four-symbol DNA sequence or from the binary sequences that result from grouping the four nucleotides into S(C,G) versus W(A,T), R(A,G) versus Y(T,C), or K(A,C) versus M(T,G), yielding the SCCSW, SCCRY and SCCKM metrics, respectively[17]. These four metrics increase with the number of parts (i.e. compositional domains) that a segmentation algorithm finds in a genome sequence, as well as with the length of those parts and the compositional differences among them. These metrics parallel the concept of ‘pure complexity’ of McShea[18] and McShea and Brandon[3], in which complexity is more closely related to the number of part types of an individual than to the number of functions.
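The recoding and the behaviour of the SCC-type metrics can be sketched as follows. The sketch makes two assumptions that are not stated in the text above: SCC is taken to be the Shannon entropy of the whole sequence minus the length-weighted mean entropy of its segments (a Jensen-Shannon-type divergence), and the segment boundaries are supplied by hand instead of being found by a segmentation algorithm. The names `GROUPINGS`, `recode`, `shannon_entropy` and `scc` are hypothetical.

```python
from collections import Counter
from math import log2

# Binary groupings of the four nucleotides, labelled as in the text.
# Only the partition matters for the metric, not the letters used.
GROUPINGS = {
    "SW": {"C": "S", "G": "S", "A": "W", "T": "W"},
    "RY": {"A": "R", "G": "R", "T": "Y", "C": "Y"},
    "KM": {"A": "K", "C": "K", "T": "M", "G": "M"},
}


def recode(seq, grouping):
    """Translate a DNA string into one of the two-symbol alphabets."""
    table = GROUPINGS[grouping]
    return "".join(table[base] for base in seq.upper())


def shannon_entropy(seq):
    """Shannon entropy (bits per symbol) of the symbol frequencies."""
    n = len(seq)
    return -sum((c / n) * log2(c / n) for c in Counter(seq).values())


def scc(seq, boundaries):
    """Assumed SCC: whole-sequence entropy minus weighted segment entropy.

    boundaries are cut positions strictly inside the sequence, in
    increasing order, so that no segment is empty.
    """
    cuts = [0, *boundaries, len(seq)]
    segments = [seq[a:b] for a, b in zip(cuts, cuts[1:])]
    n = len(seq)
    within = sum(len(s) / n * shannon_entropy(s) for s in segments)
    return shannon_entropy(seq) - within


# Toy sequence with an AT-rich half and a GC-rich half, cut in the middle.
dna = "AAAATTTT" + "GGGGCCCC"
for name in ("SW", "RY", "KM"):
    print(name, round(scc(recode(dna, name), [8]), 3))
print("four-symbol", round(scc(dna, [8]), 3))
```

In this toy case the two halves differ only in their S/W content, so the value is high for the SW recoding and zero for the RY and KM recodings, which illustrates how each binary alphabet captures a different compositional contrast between domains.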
