Abstract

Genomic diversity within a species is the genetic basis of its phenotypic diversity, which is essential for adaptation to its environment. The overall picture of genetic diversity within Asian cultivated rice has emerged since the sequencing of 3,000 rice genomes, and the resulting SNP data are publicly available in the SNP-Seek database. Here we report other aspects of this genetic diversity: rice sequences assembled from over 3,000 accessions but absent from the Nipponbare reference genome, and structural variations (SVs) and gene presence/absence variations (PAVs) in 453 accessions with sequencing depth over 20x. Using either SVs or gene PAVs, we were able to reconstruct the population structure of O. sativa, which was consistent with previous results based on SNPs. Moreover, we demonstrated the usefulness of the new data sets by successfully detecting the strong association of the “Green Revolution gene”, sd1, with plant height. Our data provide a more comprehensive view of the genetic diversity within rice, as well as additional genomic resources for research in rice breeding and plant biology.

Highlights

  • With the continued reduction of croplands, we are facing a great challenge of feeding the fast-growing world population

  • We have previously reported the single nucleotide polymorphism (SNP) data derived from these sequencing data[3], which are very useful for QTL mapping by genome-wide association studies

  • We present the rice structural variation (SV) data obtained by calling against the Nipponbare reference genome using novoBreak[5], chosen because it had the lowest false positive rate when compared with results from several other tools such as BreakDancer[6] and Delly[7]


Summary

Background & Summary

With the continued reduction of croplands, we are facing a great challenge of feeding the fast-growing world population. Structural variations (SVs) and gene presence/absence variations (PAVs) represent additional dimensions of the total genetic diversity within a species and remain largely unknown in almost all eukaryotes. In this descriptor, we report the SV data and gene PAV data of O. sativa, together with the novel sequences absent from the widely used Nipponbare reference genome IRGSP-1.0, as key results of the in-depth analyses of the sequencing data of the 3k RG[4]. We present the PAV data sets of 48,098 full-length protein-coding genes (35,633 Nipponbare reference genes and 12,465 novel genes) and 23,876 gene families in the 453 rice accessions, which were obtained from a “map-to-pan” pipeline[8]. These data provide a more comprehensive understanding of the genomic diversity within O. sativa, and provide additional genomic markers for genome-wide association studies of rice.
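To make the idea of a gene PAV data set concrete, the following is a minimal sketch, assuming the data can be viewed as a binary matrix of accessions by genes (1 for present, 0 for absent). The accession names, gene names, and values are invented for illustration, and the helper functions `gene_frequency`, `pav_distance`, and `pairwise_distances` are hypothetical; they are not taken from the "map-to-pan" pipeline or from the published data records. The sketch shows how per-gene presence frequencies and pairwise distances between accessions could be derived from such a matrix, which is the kind of input a population structure analysis might start from.

```python
from itertools import combinations

# Hypothetical toy PAV matrix (invented values, not real rice data).
# Each accession maps to a presence (1) / absence (0) call per gene in GENES.
GENES = ["gene_a", "gene_b", "gene_c", "gene_d"]
PAV = {
    "acc_1": [1, 1, 0, 1],
    "acc_2": [1, 1, 0, 0],
    "acc_3": [1, 0, 1, 1],
}


def gene_frequency(pav, genes):
    """Return the fraction of accessions in which each gene is present."""
    n_accessions = len(pav)
    return {
        gene: sum(calls[i] for calls in pav.values()) / n_accessions
        for i, gene in enumerate(genes)
    }


def pav_distance(calls_a, calls_b):
    """Return the fraction of genes whose presence/absence call differs."""
    differing = sum(x != y for x, y in zip(calls_a, calls_b))
    return differing / len(calls_a)


def pairwise_distances(pav):
    """Return the PAV distance for every unordered pair of accessions."""
    return {
        (p, q): pav_distance(pav[p], pav[q])
        for p, q in combinations(sorted(pav), 2)
    }


if __name__ == "__main__":
    for gene, freq in gene_frequency(PAV, GENES).items():
        print(f"{gene}: present in {freq:.2f} of accessions")
    for pair, dist in pairwise_distances(PAV).items():
        print(f"{pair[0]} vs {pair[1]}: distance {dist:.2f}")
```

A distance matrix of this kind could be passed to a clustering or ordination method of the reader's choice. The simple mismatch fraction used here is only one possible measure and is an assumption of this sketch, not the method used to produce the reported population structure.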

