Abstract

Exobiology, the study of the origin, evolution, and distribution of life (including life on Earth) within the context of cosmic evolution, is being given a remarkable boost by genome sequencing projects, which are now making the evolutionary histories of protein families routinely available. Each history comprises a multiple alignment of the protein sequences and the corresponding DNA sequences, an evolutionary tree showing the pedigree of these sequences, and reconstructed ancestral sequences for each node in the tree. In a post-genomic world with genomic sequences available from an unlimited number of organisms, these histories will be used to connect structure, chemical reactivity, and physiological function to these families. This paper describes several “post-genomic” tools that exploit these evolutionary histories. They can be used to confirm or deny long-distance homology between two protein families, to identify proteins within a family that have new functions, and to identify specific in vitro properties of a protein that are important for its physiological role. Evolution-based data structures for organizing large sequence databases are also described.
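
As a rough illustration of the three components of an evolutionary history named above (multiple alignment, evolutionary tree, and ancestral sequences at the internal nodes), the following Python sketch stores them in a single structure. The class names `Node` and `FamilyHistory`, the method `branch_changes`, and the toy sequences are all hypothetical and chosen for illustration only; they are not the data structures described in the paper, and the ancestral sequence is simply supplied by hand rather than reconstructed.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Tuple


@dataclass
class Node:
    """One node of the evolutionary tree (hypothetical layout).

    Leaves hold extant sequences; internal nodes hold ancestral
    sequences that are assumed to have been reconstructed elsewhere.
    """
    name: str
    aligned_seq: str  # one row of the multiple alignment ('-' marks a gap)
    children: List["Node"] = field(default_factory=list)

    def is_leaf(self) -> bool:
        return not self.children


@dataclass
class FamilyHistory:
    """Alignment, tree, and ancestral sequences for one protein family."""
    family: str
    root: Node

    def nodes(self) -> Iterator[Node]:
        # Iterative depth-first walk over every node in the tree.
        stack = [self.root]
        while stack:
            node = stack.pop()
            yield node
            stack.extend(node.children)

    def alignment(self) -> Dict[str, str]:
        """Aligned sequences of the extant (leaf) proteins only."""
        return {n.name: n.aligned_seq for n in self.nodes() if n.is_leaf()}

    def branch_changes(self) -> Dict[Tuple[str, str], List[Tuple[int, str, str]]]:
        """For each parent-to-child branch, list the alignment columns that
        differ, as (column index, parent residue, child residue)."""
        changes = {}
        for parent in self.nodes():
            for child in parent.children:
                changes[(parent.name, child.name)] = [
                    (i, a, b)
                    for i, (a, b) in enumerate(
                        zip(parent.aligned_seq, child.aligned_seq)
                    )
                    if a != b
                ]
        return changes


# Toy data (invented): two extant sequences and one hand-supplied ancestor.
leaf_a = Node("A", "MKTL")
leaf_b = Node("B", "MKSL")
ancestor = Node("anc1", "MKTL", [leaf_a, leaf_b])
history = FamilyHistory("toy_family", ancestor)

print(history.alignment())
print(history.branch_changes())
```

Keeping the ancestral sequences on the tree nodes, in the same alignment coordinates as the extant sequences, makes it straightforward to ask which residues changed along which branch, which is the kind of question that tools built on evolutionary histories would need to answer.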