Abstract

The rich genetic diversity in Oryza sativa and Oryza rufipogon serves as the main source for rice breeding. Large-scale resequencing has been undertaken to discover allelic variants in rice, but much of the information on genetic variation is lost when short sequence reads are mapped directly onto the O. sativa japonica Nipponbare reference genome. Here we constructed a pan-genome dataset of the O. sativa–O. rufipogon species complex through deep sequencing and de novo assembly of 66 divergent accessions. Intergenomic comparisons identified 23 million sequence variants in the rice genome. This catalog of sequence variations includes many known quantitative trait nucleotides and will be helpful in pinpointing new causal variants that underlie complex traits. In particular, we systematically investigated the whole set of coding genes using this pan-genome data, which revealed extensive presence and absence variation among rice accessions. This pan-genome resource will further promote evolutionary and functional studies in rice.

Highlights

  • The rich genetic diversity in Oryza sativa and Oryza rufipogon serves as the main source for rice breeding

  • We identified a total of 16,563,789 SNPs, 5,549,290 small insertions and deletions of ≤20 bp and 933,489 structural variants (SVs; defined in this work as large indels ranging from 20 bp to 12 kb)

  • We have generated a pan-genome dataset for the O. sativa–O. rufipogon species complex, a resource that should be useful for in-depth functional genomics studies and molecular breeding
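
The three variant counts listed above can be checked against the "23 million sequence variants" figure given in the abstract. The short sketch below simply adds them up; the variable names are illustrative and do not come from the paper, and it assumes that the three classes are counted without overlap.

```python
# Variant counts as reported in the highlights (variable names are illustrative).
snps = 16_563_789
small_indels = 5_549_290          # insertions and deletions of <=20 bp
structural_variants = 933_489     # large indels, 20 bp to 12 kb

# Assumption: the three classes do not overlap, so the total is a plain sum.
total = snps + small_indels + structural_variants

# Share of each class in the total, as fractions between 0 and 1.
shares = {
    "SNPs": snps / total,
    "small indels": small_indels / total,
    "SVs": structural_variants / total,
}

print(f"total variants: {total:,}")
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

The sum is 23,046,568, which is consistent with the rounded figure of 23 million.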

Introduction

The rich genetic diversity in Oryza sativa and Oryza rufipogon serves as the main source for rice breeding. With the application of high-throughput sequencing technologies, diverse rice accessions have been resequenced and phenotyped in recent years, with the aim of exploring genomic diversity to identify gene loci under human selection and to uncover the molecular basis of many agronomic traits[2,3,4,5,6,7,8,9]. In these resequencing efforts, the characterization of genetic variants relies on high levels of sequence similarity to map the short reads (typically ~100 bp) onto the rice reference genome[10], which means that information from highly polymorphic regions is often lost. The establishment of a rice pan-genome will be helpful in utilizing the various alleles within the gene pools for genetic studies and breeding.
