Abstract

High-throughput sequencing has revolutionised comparative and evolutionary genome biology. It has now become relatively commonplace to generate multiple genomes and/or transcriptomes to characterize the evolution of large taxonomic groups of interest. Nevertheless, such efforts may be unsuited to some research questions or remain beyond the scope of some research groups. Here we show that targeted high-throughput sequencing offers a viable alternative to study genome evolution across a vertebrate family of great scientific interest. Specifically, we exploited sequence capture and Illumina sequencing to characterize the evolution of key components from the insulin-like growth factor (IGF) signalling axis of salmonid fish at unprecedented phylogenetic resolution. The IGF axis represents a central governor of vertebrate growth and its core components were expanded by whole genome duplication in the salmonid ancestor ~95 Ma. Using RNA baits synthesised to genes encoding the complete family of IGF binding proteins (IGFBP) and an IGF hormone (IGF2), we captured, sequenced and assembled orthologous and paralogous exons from species representing all ten salmonid genera. This approach generated 299 novel sequences, most as complete or near-complete protein-coding sequences. Phylogenetic analyses confirmed congruent evolutionary histories for all nineteen recognized salmonid IGFBP family members and identified novel salmonid-specific IGF2 paralogues. Moreover, we reconstructed the evolution of duplicated IGF axis paralogues across a replete salmonid phylogeny, revealing complex historic selection regimes - both ancestral to salmonids and lineage-restricted - that frequently involved asymmetric paralogue divergence under positive and/or relaxed purifying selection. Our findings add to an emerging literature highlighting diverse applications for targeted sequencing in comparative-evolutionary genomics.
We also set out a viable approach to obtain large sets of nuclear genes for any member of the salmonid family, which should enable insights into the evolutionary role of whole genome duplication before additional nuclear genome sequences become available.

Highlights

  • During the last decade, large-scale sequencing projects have become commonplace, allowing the genomes and transcriptomes of vast numbers of species to be analysed

  • Using BLAST, we screened each assembly for the captured sequence data, which typically represented single exons of the target genes, along with flanking gDNA from introns

  • Our final study aim was to demonstrate the utility of sequence capture for comparative evolutionary analyses encompassing the salmonid lineage – we explored the molecular evolution of IGF binding protein (IGFBP) family members retained as two salmonid-specific WGD (ssWGD) paralogues that began diverging in the common salmonid ancestor, along with the novel salmonid IGF2-B paralogues (Fig. 6)
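
The BLAST screening step mentioned in the highlights could be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: it assumes the hits were saved in BLAST's standard 12-column tabular format (`-outfmt 6`), and the function name `screen_hits`, the sequence identifiers and both thresholds are hypothetical choices made only for the example.

```python
import csv
from collections import defaultdict

# Standard column order of BLAST tabular output (-outfmt 6).
FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def screen_hits(lines, max_evalue=1e-10, min_length=50):
    """Map each query exon to the contigs it hits, best bitscore first.

    The thresholds are illustrative assumptions, not values from the study.
    """
    hits = defaultdict(list)
    for row in csv.reader(lines, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        rec = dict(zip(FIELDS, row))
        if float(rec["evalue"]) <= max_evalue and int(rec["length"]) >= min_length:
            hits[rec["qseqid"]].append((float(rec["bitscore"]), rec["sseqid"]))
    # Sort each query's hits by descending bitscore and keep only contig names.
    return {q: [contig for _, contig in sorted(v, reverse=True)]
            for q, v in hits.items()}


# Hypothetical hits for one target exon against one species assembly.
example = [
    "IGFBP1_exon1\tcontig_12\t96.5\t210\t7\t0\t1\t210\t340\t549\t1e-95\t350.0",
    "IGFBP1_exon1\tcontig_87\t91.0\t205\t18\t1\t3\t207\t15\t219\t2e-80\t298.0",
    "IGFBP1_exon1\tcontig_03\t80.0\t30\t6\t0\t1\t30\t5\t34\t0.5\t25.0",
]
result = screen_hits(example)
print(result)
```

In this toy input the third hit is discarded because it is both short and weakly supported, while the two retained contigs might correspond to a pair of paralogous exons that would then be checked by alignment and phylogenetic analysis.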

Introduction

Large-scale sequencing projects have become commonplace, allowing the genomes and transcriptomes of vast numbers of species to be analysed (Wickett et al, 2014; Zhang et al, 2014; Jarvis et al, 2014). While such projects generate extensive high-quality sequence data at a relatively low cost, they require sizeable investment in the expert person time and infrastructure necessary to achieve their bioinformatic goals (see Wetterstrand, 2015). As a cost-effective, bioinformatically less-demanding alternative, targeted capture/enrichment and sequencing of pre-selected genomic regions offers a proven approach for researchers working on both model and non-model organisms. PCR is used to analyse a small number of genes in combination with the Sanger method or, more recently, with second-generation high-throughput sequencing (Tewhey et al, 2009; reviewed in Metzker, 2010). An alternative approach has been to exploit custom-designed microarrays or solution-based hybridization platforms to enrich for sequences (i.e. sequence capture) prior to second-generation sequencing (e.g. Okou et al, 2007; Gnirke et al, 2009; Turner et al, 2009).
