Abstract
Background: Understanding the origins of genome content has long been a goal of molecular evolution and comparative genomics. By examining genome evolution through the lens of lineage-specific evolution, it is possible to make inferences about the evolutionary events that have given rise to species-specific diversification. Here we characterize the evolutionary trends found in chordate species using The Adaptive Evolution Database (TAED). TAED is a database of phylogenetically indexed gene families designed to detect episodes of directional or diversifying selection across chordates. Gene families within the database have been assessed for lineage-specific estimates of dN/dS and have been reconciled to the chordate species tree to identify retained duplicates. Gene families have also been mapped to functional pathways, and amino acid changes that occurred on high-dN/dS lineages have been mapped to protein structures.

Results: An analysis of this exhaustive database has enabled a characterization of the processes of lineage-specific diversification in chordates. A pathway-level enrichment analysis of TAED determined that the pathways most commonly found to have elevated rates of evolution included those involved in metabolism, immunity, and cell signaling. An analysis of protein fold presence, after normalizing for fold frequency in the database, found that common folds such as Rossmann folds, jelly roll folds, and TIM barrels were overrepresented on the proteins most likely to undergo directional selection. A set of gene families that experienced increased numbers of duplications within short evolutionary times is associated with pathways involved in metabolism, olfactory reception, and signaling. An analysis of protein secondary structure indicated more relaxed constraint in β-sheets and stronger constraint on α-helices, amidst a general preference for substitutions at exposed sites. Lastly, a detailed analysis of the ornithine decarboxylase gene family, which encodes a key enzyme in the pathway for polyamine synthesis, revealed lineage-specific evolution along the lineage leading to Cetacea through rapid sequence evolution in a duplicate gene, with amino acid substitutions causing active site rearrangement.

Conclusion: Episodes of lineage-specific evolution are frequent throughout chordate species. Both duplication and directional selection have played large roles in the evolution of the phylum. TAED is a powerful tool for facilitating this understanding of lineage-specific evolution.
Highlights
Understanding the origins of genome content has long been a goal of molecular evolution and comparative genomics
All pairwise alignments within each multiple sequence alignment had no more than 10% gaps and were at least 80% identical in non-gapped positions. This created a trade-off between gene family ages and alignment quality; homologous gene family relationships can still be identified through The Adaptive Evolution Database (TAED)
A total of 23,970 gene families contained one or more branches with dN/dS > 1, indicating positive or directional selection acting on these lineages
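The dN/dS > 1 criterion in the last highlight can be illustrated with a minimal sketch. The data layout (a mapping from gene family ID to a list of per-branch dN/dS estimates), the function name, and the example values are hypothetical and are not taken from TAED; the sketch only shows the "one or more branches above 1" rule.

```python
from typing import Dict, List


def families_with_elevated_branches(
    branch_omega: Dict[str, List[float]], threshold: float = 1.0
) -> List[str]:
    """Return the IDs of families with at least one branch whose dN/dS exceeds threshold."""
    return sorted(
        family
        for family, omegas in branch_omega.items()
        if any(omega > threshold for omega in omegas)
    )


# Hypothetical per-branch dN/dS estimates, for illustration only.
example = {
    "fam_A": [0.12, 0.40, 1.35],
    "fam_B": [0.05, 0.22],
    "fam_C": [2.10, 1.01, 0.30],
}
flagged = families_with_elevated_branches(example)
print(flagged)
```

A family is flagged as soon as any single branch exceeds the threshold, so the count of flagged families says nothing about how many branches per family are affected.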
Summary
Understanding the origins of genome content has long been a goal of molecular evolution and comparative genomics. As isolated populations become separate species, mutations of different magnitudes affect the protein-coding repertoire of the two diverging genomes. These changes include synonymous changes that affect only the nucleotide sites, nonsynonymous changes that affect the amino acid sites, and gene duplication and loss events, among other types of changes. All pairwise alignments within each multiple sequence alignment had no more than 10% gaps and were at least 80% identical in non-gapped positions. This created a trade-off between gene family ages (many had root nodes younger than the last common ancestor of chordates) and alignment quality; homologous gene family relationships can still be identified through TAED. Gene families have been reconciled to the NCBI taxonomy [67] as a reference species tree, and events of positive directional and diversifying selection were detected using the ratio of nonsynonymous to synonymous nucleotide substitution rates (dN/dS) under the branches model, averaged across sites [83]. Duplication events within gene families have been identified using the SoftParsMap parsimony-based gene tree-species tree reconciliation software [9].
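The pairwise alignment quality filter described above (no more than 10% gaps, at least 80% identity in non-gapped positions) can be sketched as follows. This is an illustration under stated assumptions, not the TAED pipeline: the function names are hypothetical, the gap fraction is assumed to be computed over columns where at least one of the two sequences has a residue, and identity is assumed to be computed over columns where both do.

```python
from itertools import combinations
from typing import List, Tuple

GAP = "-"


def pairwise_stats(a: str, b: str) -> Tuple[float, float]:
    """Return (gap fraction, identity in non-gapped columns) for two aligned rows."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    # Assumption: columns gapped in both rows are ignored for this pair.
    cols = [(x, y) for x, y in zip(a, b) if not (x == GAP and y == GAP)]
    if not cols:
        return 1.0, 0.0
    ungapped = [(x, y) for x, y in cols if x != GAP and y != GAP]
    gap_fraction = (len(cols) - len(ungapped)) / len(cols)
    if not ungapped:
        return gap_fraction, 0.0
    identity = sum(1 for x, y in ungapped if x == y) / len(ungapped)
    return gap_fraction, identity


def alignment_passes(
    rows: List[str], max_gap: float = 0.10, min_identity: float = 0.80
) -> bool:
    """True if every pair of rows meets both the gap and the identity thresholds."""
    for a, b in combinations(rows, 2):
        gap_fraction, identity = pairwise_stats(a, b)
        if gap_fraction > max_gap or identity < min_identity:
            return False
    return True


# Toy nucleotide rows, for illustration only.
good = ["ATGGCCAAGT", "ATGGCCAAGT", "ATGGCTAAGT"]
bad = ["ATGGCCAAGT", "TTGGCTTAGT"]
print(alignment_passes(good), alignment_passes(bad))
```

Because every pair must pass, a single divergent sequence is enough to reject the alignment, which is consistent with the stated trade-off between alignment quality and gene family age.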