Abstract
A one-to-one mapping of protein functionality across different species is a critical component of comparative analysis. This paper presents a heuristic algorithm for discovering the Most Likely Functional Counterparts (MoLFunCs) of a protein, based on simple concepts from network theory. A key feature of our algorithm is its use of the user's knowledge to assign high confidence to selected functional identifications. We demonstrate the algorithm by retrieving functional equivalents for 7 membrane proteins from an exploration of almost 40 genomes drawn from multiple online resources. We verify the functional equivalency of our dataset through a series of tests that include sequence, structure and function comparisons. We compare our results with the OMA methodology, which also identifies one-to-one mappings between proteins from different species. Based on that comparison, we believe that incorporating the user's knowledge as a key aspect of the technique adds value to purely statistical formal methods.
Highlights
The current spate of genome sequencing projects [1] has resulted in large amounts of sequence information from all kingdoms of life
In case we find a hit that is not already included and no other most likely functional counterpart (MoLFunC) sequence is within 10 bits, we exclude the species from the MoLFunC matrix
It is important that all council members are able to work with each other as well as with the chief authority that initiates the selection process
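The 10-bit exclusion rule quoted in the highlights above can be read in more than one way, since the excerpt does not give the full procedure. The sketch below shows one possible reading, not the authors' implementation: for a given species, if the best-scoring hit is not already a MoLFunC member and no member scores within 10 bits of that best hit, the species is dropped. The names `Hit`, `keep_species` and `BIT_MARGIN`, and the choice to compare against the top bit score, are assumptions made for illustration.

```python
from dataclasses import dataclass

# Threshold taken from the text ("within 10 bits"); how it is applied is assumed.
BIT_MARGIN = 10.0


@dataclass(frozen=True)
class Hit:
    """A single search hit for one species (hypothetical structure)."""
    seq_id: str
    bit_score: float


def keep_species(hits, molfunc_ids, margin=BIT_MARGIN):
    """Return True if the species stays in the MoLFunC matrix.

    Assumed reading of the rule:
      - if the top hit is already a MoLFunC member, keep the species;
      - otherwise keep it only if some MoLFunC member among the hits
        scores within `margin` bits of the top hit.
    """
    if not hits:
        return False  # assumption: a species with no hits cannot be included

    top = max(hits, key=lambda h: h.bit_score)
    if top.seq_id in molfunc_ids:
        return True

    # Look for any known member whose score is close to the top hit.
    return any(
        h.seq_id in molfunc_ids and top.bit_score - h.bit_score <= margin
        for h in hits
    )


if __name__ == "__main__":
    members = {"protB"}
    close = [Hit("protA", 200.0), Hit("protB", 195.0)]
    far = [Hit("protA", 200.0), Hit("protB", 150.0)]
    print(keep_species(close, members))  # member is 5 bits below the top hit
    print(keep_species(far, members))    # member is 50 bits below the top hit
```

Under this reading the margin is measured from the best hit downward; if the original method compares scores differently (for example, against a reference query score), only the comparison inside `keep_species` would need to change.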
Summary
The current spate of genome sequencing projects [1] has resulted in large amounts of sequence information from all kingdoms of life. Comparative genomic analysis is increasingly being employed for functional annotation. The basis of most comparative techniques is the notion of homology, or common evolutionary origin, of the gene/protein sets being investigated. The multiplicity of evolutionary scenarios necessitates a more fine-grained description of homology in terms of orthologs, in-paralogs and out-paralogs [2]. Orthologs are genes from different species that have a common ancestor. Orthologous genes from different species have traditionally been thought of as having similar functions. Gene duplication can result in functional divergence within a species and give rise to paralogs. Depending on the degree of divergence, paralogs can retain a significant portion of the sequence features of the original gene. Since the copies of a duplicated gene still satisfy the constraint of sharing a common ancestor with genes from other species, multiple pairs of orthologous genes in two species can arise from a single ancestral gene that existed prior to the duplication.