Abstract
As a greater number and diversity of high-quality vertebrate reference genomes become available, it is increasingly feasible to use these references to guide new draft assemblies for related species. Reference-guided assembly approaches may substantially increase the contiguity and completeness of a new genome using only low levels of genome coverage that might otherwise be insufficient for de novo genome assembly. We used low-coverage (∼3.5–5.5x) Illumina paired-end sequencing to assemble draft genomes of two bird species (the Gunnison Sage-Grouse, Centrocercus minimus, and the Clark's Nutcracker, Nucifraga columbiana). We used these data to estimate de novo genome assemblies and reference-guided assemblies, and compared the information content and completeness of these assemblies in terms of CEGMA gene set representation, repeat element content, simple sequence repeat content, and GC isochore structure. Our results demonstrate that even lower-coverage genome sequencing projects are capable of producing informative and useful genomic resources, particularly through the use of reference-guided assemblies.
Highlights
High-quality sequencing, assembly, and annotation of vertebrate genomes have become feasible for non-traditional model species, as costs of sequencing decrease and analysis methods improve
Our results demonstrate that substantial information can be extracted from lower-coverage genomic sampling projects, and that reference-guided assemblies provide much better representation of biologically important regions than de novo assemblies when genome coverage is low
We were surprised that the reference-guided assembly approach was quite successful despite substantial divergence between target species and reference genome species (~40–60 mya; Fig. 9), and with fairly low levels of sequencing coverage (Table 1)
Summary
High-quality sequencing, assembly, and annotation of vertebrate genomes have become feasible for non-traditional model species, as costs of sequencing decrease and analysis methods improve. The default method for generating initial genome assemblies for a species includes the use of de novo assembly algorithms that rely on sufficient overlap between sequencing reads to build larger contiguous sequences. This approach is fundamentally different from a reference-guided approach that utilizes existing contiguous sequences and sequence similarity between the target and reference species’ genomes to assemble a genome. The availability of high-quality reference genomes for a greater diversity of vertebrate species may enable inexpensive yet informative genomic resources to be generated for new species by leveraging information from existing high-quality genomes of related species. We may potentially achieve greater representation and understanding of genomic diversity across the tree of life through the use of high-quality genomes, complemented by the addition of lower-coverage genomes.
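The contrast between the two strategies can be illustrated with a toy sketch. This is not the pipeline used in the study: the function names (overlap, greedy_de_novo, reference_guided), the sequences, and the thresholds are all hypothetical, and real assemblers use far more sophisticated graph and alignment methods. The sketch only shows why reads that do not overlap stay fragmented under an overlap-based approach, while a related reference can still order them into a single sequence.

```python
def overlap(a: str, b: str, min_len: int) -> int:
    """Length of the longest suffix of a that equals a prefix of b (0 if shorter than min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_de_novo(reads, min_len=3):
    """Toy de novo assembly: repeatedly merge the pair of contigs with the longest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return contigs  # no remaining overlaps: contigs stay separate
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for n, c in enumerate(contigs) if n not in (best_i, best_j)]
        contigs.append(merged)


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def reference_guided(reads, reference, max_mismatch=1):
    """Toy reference-guided assembly: place each read at its best-matching
    reference position and write the read's own bases into the consensus.
    Positions not covered by any read are left as 'N'."""
    consensus = ["N"] * len(reference)
    for read in reads:
        best_pos, best_d = None, max_mismatch + 1
        for pos in range(len(reference) - len(read) + 1):
            d = hamming(read, reference[pos:pos + len(read)])
            if d < best_d:
                best_pos, best_d = pos, d
        if best_pos is not None:
            for offset, base in enumerate(read):
                consensus[best_pos + offset] = base
    return "".join(consensus)


# Hypothetical data: the reference differs from the target at one position.
target = "ACGTTGCAAGGCTA"
reference = "ACGATGCAAGGCTA"
reads = ["ACGTT", "GCAAG", "GCTA"]  # sparse reads with no mutual overlap

de_novo_contigs = greedy_de_novo(reads)
guided_sequence = reference_guided(reads, reference)
print(len(de_novo_contigs), "de novo contigs:", de_novo_contigs)
print("reference-guided sequence:", guided_sequence)
```

In this toy case the overlap-based routine cannot join any reads, whereas placement against the diverged reference orders all of them and the consensus retains the target's own bases rather than the reference's.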