Abstract
Background
Whole genome sequences (WGS) have proliferated as sequencing technology continues to improve and costs decline. While many WGS of model or domestic organisms have been produced, a growing number of non-model species are also being sequenced. In the absence of a reference, construction of a genome sequence necessitates de novo assembly, which may be beyond the ability of many labs due to the large volumes of raw sequence data and extensive bioinformatics required. In contrast, the presence of a reference WGS allows for alignment, which is more tractable than assembly. Recent work has highlighted that the reference need not come from the same species, potentially enabling a wide array of species WGS to be constructed using cross-species alignment. Here we report on the creation of a draft WGS from a single bighorn sheep (Ovis canadensis) using alignment to the closely related domestic sheep (Ovis aries).

Results
Two sequencing libraries on SOLiD platforms yielded over 865 million reads, and combined alignment to the domestic sheep reference resulted in a nearly complete sequence (95% coverage of the reference) at an average of 12x read depth (104 SD). From this we discovered over 15 million variants and annotated them relative to the domestic sheep reference. We then conducted an enrichment analysis of those SNPs showing fixed differences between the reference and sequenced individual and found significant differences in a number of gene ontology (GO) terms, including those associated with reproduction, muscle properties, and bone deposition.

Conclusion
Our results demonstrate that cross-species alignment enables the creation of novel WGS for non-model organisms. The bighorn sheep WGS will provide a resource for future resequencing studies or comparative genomics.

Electronic supplementary material
The online version of this article (doi:10.1186/s12864-015-1618-x) contains supplementary material, which is available to authorized users.
Highlights
Whole genome sequences (WGS) have proliferated as sequencing technology continues to improve and costs decline
Filtering and alignment were conducted on both libraries in CLC Genomics Workbench
When aligned on its own, the mate-paired library had 174,894,731 reads map to the reference, of which 115,727,618 were in pairs with an average distance of 1108 nucleotides between pairs, while the fragment library had 377,008,050 reads map to the reference
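
The mapping counts quoted above can be put in proportion with a few lines of arithmetic. The following Python sketch is illustrative only: the variable and function names are hypothetical, and the sum of the two separate alignments is a simple tally for comparison, not the result of the combined alignment reported in the abstract.

```python
# Read counts taken from the highlights above (separate alignments per library).
MATE_PAIRED_MAPPED = 174_894_731    # mate-paired reads mapped to the reference
MATE_PAIRED_IN_PAIRS = 115_727_618  # of those, reads mapped as proper pairs
FRAGMENT_MAPPED = 377_008_050       # fragment-library reads mapped to the reference


def fraction(part: int, whole: int) -> float:
    """Return part / whole as a float, guarding against an empty denominator."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return part / whole


# Share of mapped mate-paired reads that were placed in pairs.
paired_fraction = fraction(MATE_PAIRED_IN_PAIRS, MATE_PAIRED_MAPPED)

# Simple tally of mapped reads across the two separate alignments
# (an assumption for illustration, not the combined-alignment figure).
separately_mapped_total = MATE_PAIRED_MAPPED + FRAGMENT_MAPPED

print(f"Mate-paired reads mapped in pairs: {paired_fraction:.1%}")
print(f"Reads mapped across both separate alignments: {separately_mapped_total:,}")
```

Under these figures, roughly two thirds of the mapped mate-paired reads were placed in pairs.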
Summary
Whole genome sequences (WGS) have proliferated as sequencing technology continues to improve and costs decline. Widespread use of high-throughput sequencers has allowed an ever-increasing number of species to have a WGS prepared. While many of these have been model or domestic organisms, a wide array of taxa continue to be sequenced (as reviewed in [1]). In the absence of a reference, construction of a WGS necessitates de novo methodologies [15]. These methods require large volumes of raw sequence data, which are arranged into contigs and joined into scaffolds by computational methods [16], anchoring with outside information (e.g. a linkage map, BACs, or FISH), or continued sequencing [17].