Abstract
Background: The dispensable genome of a species, consisting of the dispensable sequences present only in a subset of individuals, is believed to play important roles in phenotypic variation and genome evolution. However, construction of the dispensable genome is costly and labor-intensive at present, and so the influence of the dispensable genome in genetic and functional genomic studies has not been fully explored.

Results: We construct the dispensable genome of rice through a metagenome-like de novo assembly strategy based on low-coverage (1–3×) sequencing data of 1483 cultivated rice (Oryza sativa L.) accessions. Thousands of protein-coding genes are successfully assembled, including most of the known agronomically important genes absent from the Nipponbare rice reference genome. We develop an integration approach based on alignment and linkage disequilibrium, which is able to assign genomic positions relative to the reference genome for more than 78.2% of the dispensable sequences. We carry out association mapping studies for rice grain width and 840 metabolic traits using 0.46 million polymorphisms between the dispensable sequences of different rice accessions. About 23.5% of metabolic traits have more significant association signals with polymorphisms from dispensable sequences than with SNPs from the reference genome, and 41.6% of trait-associated SNPs have concordant genomic locations with associated dispensable sequences.

Conclusions: Our results suggest the feasibility of building a species' dispensable genome using low-coverage population sequencing data. The constructed sequences will be helpful for understanding the rice dispensable genome and are complementary to the reference genome for identifying candidate genes associated with phenotypic variation.

Electronic supplementary material: The online version of this article (doi:10.1186/s13059-015-0757-3) contains supplementary material, which is available to authorized users.
Highlights
The dispensable genome of a species, consisting of the dispensable sequences present only in a subset of individuals, is believed to play important roles in phenotypic variation and genome evolution.
We further demonstrate that these sequences would be helpful for understanding the rice dispensable genome and picking candidate genes in quantitative trait locus (QTL) mapping and genome-wide association study (GWAS) in rice.
Collecting sequence data and assembling the dispensable genome using a metagenome-like assembly strategy: We collected data of 533 rice accessions sequenced at ~2.5× coverage and 950 rice accessions sequenced at ~1× coverage [8, 9, 17, 18, 19].
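The arithmetic behind pooling low-coverage data can be made concrete with a short sketch. It uses only the accession counts and approximate depths quoted above; the function name `pooled_depth` and the `carrier_fraction` parameter are illustrative assumptions, not part of the published pipeline, and the calculation assumes that carriers of a sequence are spread evenly across both panels.

```python
# Accession counts and approximate per-accession depths, as quoted in the text.
# The depths are rounded values ("~2.5x" and "~1x"), so results are estimates.
panels = [(533, 2.5), (950, 1.0)]


def pooled_depth(panels, carrier_fraction=1.0):
    """Expected pooled read depth over a sequence.

    carrier_fraction is a hypothetical parameter: the share of accessions
    that carry the sequence, assumed to be the same in every panel.
    """
    if not 0.0 <= carrier_fraction <= 1.0:
        raise ValueError("carrier_fraction must be between 0 and 1")
    total = sum(count * depth for count, depth in panels)
    return total * carrier_fraction


total_accessions = sum(count for count, _ in panels)

print(total_accessions)            # number of accessions pooled
print(pooled_depth(panels))        # depth if every accession carries the sequence
print(pooled_depth(panels, 0.01))  # depth if only 1% of accessions carry it
```

Under these assumptions the two panels add up to the 1483 accessions mentioned in the abstract, and a sequence carried by only a small fraction of accessions can still accumulate usable depth once reads are pooled, even though no single accession is sequenced deeply enough to assemble it alone.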
Summary
The dispensable genome of a species, consisting of the dispensable sequences present only in a subset of individuals, is believed to play important roles in phenotypic variation and genome evolution. With the application of next-generation sequencing, huge amounts of low-coverage population sequencing data have been generated [8, 9]. However, a large portion of individual- and subpopulation-specific sequences were left out, and present studies have not taken full advantage of these population sequencing data. Many genes controlling important traits have been found to be absent from the Nipponbare reference genome [11], such as GW5 [13], Sub1A [14], and Pikm-1 [15]. This indicates that one genome is insufficient [2]. More genome sequences are needed to gain a more comprehensive understanding of the pan-genome of rice.