Abstract
Background
As orthologous proteins are expected to retain function more often than other homologs, they are often used for functional annotation transfer between species. However, ortholog identification methods do not take into account changes in domain architecture, which are likely to modify a protein's function. By domain architecture we refer to the sequential arrangement of domains along a protein sequence.
To assess the level of domain architecture conservation among orthologs, we carried out a large-scale study of such events between human and 40 other species spanning the entire evolutionary range. We designed a score to measure domain architecture similarity and used it to analyze differences in domain architecture conservation between orthologs and paralogs relative to the conservation of primary sequence. We also statistically characterized the extents of different types of domain swapping events across pairs of orthologs and paralogs.
Results
The analysis shows that orthologs exhibit greater domain architecture conservation than paralogous homologs, even when differences in average sequence divergence are compensated for, for homologs that have diverged beyond a certain threshold. We interpret this as an indication of a stronger selective pressure on orthologs than paralogs to retain the domain architecture required for the proteins to perform a specific function. In general, orthologs as well as the closest paralogous homologs have very similar domain architectures, even at large evolutionary separation.
The most common domain architecture changes observed in both ortholog and paralog pairs involved insertion/deletion of new domains, while domain shuffling and segment duplication/deletion were very infrequent.
Conclusions
On the whole, our results support the hypothesis that function conservation between orthologs demands higher domain architecture conservation than other types of homologs, relative to primary sequence conservation. This supports the notion that orthologs are functionally more similar than other types of homologs at the same evolutionary distance.
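To make the idea of comparing domain architectures concrete, the following is a minimal illustrative sketch. It is not the score designed in the study, which this abstract does not specify. It assumes an architecture is represented as an ordered list of domain identifiers and substitutes a generic sequence-similarity ratio from Python's standard `difflib` module. The function name `architecture_similarity` and the example domain lists are hypothetical.

```python
from difflib import SequenceMatcher


def architecture_similarity(arch_a, arch_b):
    """Return a similarity in [0, 1] between two ordered domain lists.

    Illustrative stand-in only: this uses difflib's ratio, 2*M/T, where M is
    the number of matched domains (in order) and T is the total number of
    domains in both architectures. It is NOT the score used in the study.
    """
    if not arch_a and not arch_b:
        return 1.0  # two domain-free proteins are treated as identical
    matcher = SequenceMatcher(None, arch_a, arch_b, autojunk=False)
    return matcher.ratio()


# Hypothetical architectures, written N-terminus to C-terminus.
protein_a = ["SH3", "SH2", "Pkinase_Tyr"]
same_architecture = ["SH3", "SH2", "Pkinase_Tyr"]
one_domain_missing = ["SH2", "Pkinase_Tyr"]       # domain deletion
shuffled_order = ["Pkinase_Tyr", "SH2", "SH3"]    # domain shuffling

print(architecture_similarity(protein_a, same_architecture))
print(architecture_similarity(protein_a, one_domain_missing))
print(architecture_similarity(protein_a, shuffled_order))
```

Because the comparison is order-sensitive, a single domain insertion or deletion lowers the value only slightly, while a reordering of the same domains lowers it much more. A real analysis would also need a domain annotation source and a principled weighting of event types, neither of which is assumed here.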
Highlights
As orthologous proteins are expected to retain function more often than other homologs, they are often used for functional annotation transfer between species.
Proteomes for the bacteria Mycobacterium tuberculosis and Mycobacterium leprae as well as for the archaea Aeropyrum pernix, Methanococcus acetivorans, Pyrobaculum aerophilum and Sulfolobus acidocaldarius were downloaded from the COGENT database [23].
Characterizing protein pair types: We computed the levels of domain architecture and primary sequence conservation for pairs of orthologous proteins, and compared these with corresponding figures for paralogous proteins at the same evolutionary divergence.
Summary
As orthologous proteins are expected to retain function more often than other homologs, they are often used for functional annotation transfer between species. Homologous genes in different species are defined as orthologs if they descend from a single gene in the last common ancestor [1], and as outparalogs if they diverged via duplication before this ancestor. The more genomes are sequenced, the more important orthology identification becomes. This is because orthologs often have the same or closely related functions in the extant species and can be used for the transfer of functional information. Transfer of functional information between orthologs is important for annotation of newly sequenced genomes [4,5]. While there is significant evidence that orthologous proteins generally have similar functions [4,6], the assumption that orthologs are functionally more conserved than other homologs at the same separation has not been systematically evaluated [7].