Abstract

With the rapid rise in availability of high-quality genomes for closely related species, methods for orthology inference that incorporate synteny are increasingly useful. Polyploidy perturbs the 1:1 expected frequencies of orthologs between two species, complicating the identification of orthologs. Here we present a method of ortholog inference, Ploidy-aware Syntenic Orthologous Networks Identified via Collinearity (pSONIC). We demonstrate the utility of pSONIC using four species in the cotton tribe (Gossypieae), including one allopolyploid, and place between 75% and 90% of genes from each species into nearly 32,000 orthologous groups, 97% of which consist of at most singletons or tandemly duplicated genes—58.8% more than comparable methods that do not incorporate synteny. We show that 99% of singleton gene groups follow the expected tree topology and that our ploidy-aware algorithm recovers 97.5% identical groups when compared to splitting the allopolyploid into its two respective subgenomes, treating each as separate “species.”

INTRODUCTION

The recent explosion in high-quality genome assemblies has increased the opportunity to investigate biological questions using a comparative genomics framework. Programs have been developed to identify collinear blocks between genomes (e.g., MCScanX (Wang et al. 2012) and CoGe (Lyons et al. 2008)), but these methods are restricted to pairwise comparisons (MCScanX) or comparisons among three genomes (CoGe), and no method for genome-wide detection of orthologs across multiple species has yet incorporated the powerful evidence of orthology provided by synteny. Here we present Ploidy-aware Syntenic Orthologous Networks Identified via Collinearity (pSONIC), which uses pairwise collinearity blocks from multiple species inferred via MCScanX, along with a high-confidence set of singleton orthologs identified through OrthoFinder, to curate a genome-wide set of syntenic orthologs. As part of pSONIC’s inference, we developed a ploidy-aware algorithm to identify collinear blocks originating from both speciation and duplication events. To demonstrate the effectiveness of our ploidy-aware algorithm, we show that, unlike with OrthoFinder, splitting the tetraploid genome into its respective subgenomes has little effect on our final set of orthogroups.
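To make the general idea concrete, the following is a minimal Python sketch of filtering pairwise collinear blocks against a trusted set of singleton orthologs and then merging the surviving gene pairs into groups. It is not pSONIC's actual implementation and it does not model ploidy. The in-memory data layout, the `<species>_<gene>` naming convention, the function names `species_of`, `score_block`, and `build_groups`, and the `min_support` threshold are all assumptions made for illustration; real input would come from parsed MCScanX and OrthoFinder output.

```python
from collections import defaultdict


def species_of(gene):
    # Assumed naming convention: "<species>_<gene id>", e.g. "A_g1".
    return gene.split("_", 1)[0]


def score_block(pairs, partners):
    """Count anchor pairs that agree or conflict with the trusted orthologs."""
    agree = conflict = 0
    for a, b in pairs:
        pa, pb = partners.get(a, ()), partners.get(b, ())
        if b in pa:
            agree += 1
        # Conflict: a (or b) already has a different trusted partner
        # in the species of the gene it is paired with in this block.
        elif any(species_of(p) == species_of(b) for p in pa) or \
                any(species_of(p) == species_of(a) for p in pb):
            conflict += 1
    return agree, conflict


def build_groups(blocks, trusted_pairs, min_support=2):
    """Keep well-supported blocks, then merge their gene pairs into groups."""
    partners = defaultdict(set)
    for a, b in trusted_pairs:
        partners[a].add(b)
        partners[b].add(a)

    parent = {}  # union-find forest over genes in accepted blocks

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for pairs in blocks:
        agree, conflict = score_block(pairs, partners)
        if agree >= min_support and agree > conflict:
            for a, b in pairs:
                parent[find(a)] = find(b)

    groups = defaultdict(set)
    for gene in list(parent):
        groups[find(gene)].add(gene)
    return sorted(sorted(members) for members in groups.values())


# Toy data: three hypothetical species A, B, C.
blocks = [
    [("A_g1", "B_g1"), ("A_g2", "B_g2"), ("A_g3", "B_g3")],
    [("B_g1", "C_g1"), ("B_g2", "C_g2"), ("B_g3", "C_g3")],
    [("A_g1", "B_g7"), ("A_g2", "B_g8")],  # contradicts the trusted pairs
]
trusted = [("A_g1", "B_g1"), ("A_g2", "B_g2"),
           ("B_g1", "C_g1"), ("B_g3", "C_g3")]

groups = build_groups(blocks, trusted)
for members in groups:
    print(members)
```

In this toy run the third block is discarded because both of its anchor pairs contradict trusted A-B orthologs, while genes such as `A_g3` that lack a trusted partner are still grouped because they sit inside a well-supported block. That is the sense in which synteny can extend a small high-confidence ortholog set to its collinear neighbors.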

