Abstract

Background

With the wealth of genomic data available, it has become increasingly important to assign putative protein function through functional transfer between orthologs. Therefore, correct elucidation of the evolutionary relationships among genes is a critical task, and attempts should be made to further improve the phylogenetic inference by adding relevant discriminating features. It has been shown that introns can maintain their position over long evolutionary timescales. For this reason, it could be possible to use conservation of intron positions as a discriminating factor when assigning orthology. Therefore, we wanted to investigate whether orthologs have a higher degree of intron position conservation (IPC) compared to non-orthologous sequences that are equally similar in sequence.

Results

To this end, we developed a new score for IPC and applied it to ortholog groups between human and six other species. For comparison, we also gathered the closest non-orthologs, meaning sequences close in sequence space, yet falling just outside the ortholog cluster. We found that ortholog-ortholog gene pairs on average have a significantly higher degree of IPC compared to ortholog-closest non-ortholog pairs. Pairs of inparalogs were also found to have a higher IPC score than inparalog-closest non-inparalog pairs. We verified that these differences cannot simply be attributed to the generally higher sequence identity of the ortholog-ortholog and the inparalog-inparalog pairs.

Furthermore, we analyzed the agreement between the IPC score and the ortholog score assigned by the InParanoid algorithm, and found that it was consistently high for all species comparisons. In a minority of cases, the IPC and InParanoid scores ranked inparalogs differently. These represent cases where sequence and intron position divergence are discordant. We further analyzed the discordant clusters to identify any possible preference for protein functions by looking for enriched GO terms and Pfam protein domains. They were enriched for functions important for multicellularity, which implies a connection between shifts in intronic structure and the origin of multicellularity.

Conclusions

We conclude that orthologous genes tend to have more conserved intron positions compared to non-orthologous genes. As a consequence, our IPC score is useful as an additional discriminating factor when assigning orthology.

Highlights

  • With the wealth of genomic data available it has become increasingly important to assign putative protein function through functional transfer between orthologs

  • Since sequences that are evolutionarily conserved tend to have a higher sequence identity compared to non-related sequences, we examined the possible dependence between intron position conservation (IPC) and sequence identity

  • Function term enrichment analysis and IPC-orthology disagreement: Is conservation of intron position, or the lack thereof, associated with some specific classes of proteins, such as those belonging to certain pathways or cellular roles? To answer this question, we evaluated whether or not the distribution of Gene Ontology [35] terms was the same for proteins where IPC and evolutionary distance were in agreement and proteins where they disagreed


Summary

Introduction

With the wealth of genomic data available, it has become increasingly important to assign putative protein function through functional transfer between orthologs. It has been shown that introns can maintain their position over long evolutionary timescales. For this reason, it could be possible to use conservation of intron positions as a discriminating factor when assigning orthology. With so many genomes available, automatic methods for identifying evolutionary relationships between genes become important when transferring functions from already annotated genes to unannotated ones. How duplicated genes are classified depends on when the duplication occurred relative to the speciation event. If the duplication occurred after the speciation event, the genes are considered to be inparalogs, meaning that they are co-orthologs to one or several genes in another species. If the duplication event happened prior to the speciation event, the sequences are outparalogs and as such do not form any co-ortholog relationship with genes in another genome. Outparalogs cannot be used to transfer functional assignments between species.
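The inparalog/outparalog distinction above comes down to the relative timing of two events. The following is a minimal sketch of that rule only, not the InParanoid algorithm, which infers these relationships from sequence similarity scores. The names `ParalogType` and `classify_paralogs` are hypothetical, and the event ages are assumed to be known and given in million years before present.

```python
from enum import Enum


class ParalogType(Enum):
    INPARALOG = "inparalog"    # duplication after speciation
    OUTPARALOG = "outparalog"  # duplication before speciation


def classify_paralogs(duplication_age: float, speciation_age: float) -> ParalogType:
    """Classify a pair of paralogs by the relative timing of two events.

    Both ages are assumed to be in million years before present,
    so a larger value means an older event.
    """
    if duplication_age == speciation_age:
        # The text only defines the cases "after" and "prior to".
        raise ValueError("event order is ambiguous when the ages are equal")
    if duplication_age < speciation_age:
        # Duplication is more recent than speciation: the copies are
        # co-orthologs to gene(s) in the other species.
        return ParalogType.INPARALOG
    # Duplication predates speciation: no co-ortholog relationship.
    return ParalogType.OUTPARALOG


if __name__ == "__main__":
    print(classify_paralogs(duplication_age=50.0, speciation_age=90.0).value)
    print(classify_paralogs(duplication_age=120.0, speciation_age=90.0).value)
```

In practice the event ages are rarely known directly, which is why the relationships are usually inferred from the sequences themselves.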

