Abstract

Allotetraploid cotton species (Gossypium hirsutum and Gossypium barbadense) have long been cultivated worldwide for natural renewable textile fibers. Draft genome sequences of both species are available, but they are highly fragmented and incomplete[1–4]. Here we report reference-grade genome assemblies and annotations for G. hirsutum accession Texas Marker-1 (TM-1) and G. barbadense accession 3–79 by integrating single-molecule real-time sequencing, BioNano optical mapping and high-throughput chromosome conformation capture techniques. Compared with previously assembled draft genomes[1,3], these genome sequences show considerable improvements in contiguity and completeness for repeat-rich regions such as centromeres. Comparative genomics analyses identify extensive structural variations that probably occurred after polyploidization, highlighted by large paracentric/pericentric inversions in 14 chromosomes. We constructed an introgression line population to introduce favorable chromosome segments from G. barbadense into G. hirsutum, allowing us to identify 13 quantitative trait loci associated with superior fiber quality. These resources will accelerate evolutionary and functional genomic studies in cotton and inform future breeding programs for fiber improvement.

Highlights

  • Cotton represents the largest source of natural textile fibers in the world

  • Over 90% of annual fiber production comes from allotetraploid cotton (G. hirsutum and G. barbadense), which originated from an allopolyploidization event approximately 1–2 million years ago, followed by millennia of asymmetric subgenome selection[5,6]

  • We identified 9,135 segments in G. hirsutum with a total length of 179.9 Mb that are absent in G. barbadense and 7,710 segments in G. barbadense with a total length of 139.8 Mb that are absent in G. hirsutum (Fig. 2b; Supplementary Tables 23 and 24)

Summary

Introduction

Cotton represents the largest source of natural textile fibers in the world. Over 90% of annual fiber production comes from allotetraploid cotton (G. hirsutum and G. barbadense), which originated from an allopolyploidization event approximately 1–2 million years ago, followed by millennia of asymmetric subgenome selection[5,6]. The final assemblies include 2,190 scaffolds for G. hirsutum and 3,032 for G. barbadense, of which the largest 26 super-scaffolds, representing all chromosomes, account for 98.94% and 97.68% of all sequences, respectively. Compared with previously published draft genomes[1,3], these sequences represent a significant improvement in contiguity (55-fold for G. hirsutum and 90-fold for G. barbadense) (Supplementary Table 10).

