Abstract

Background: Sweet potato (Ipomoea batatas (L.) Lam.) is one of the most important crops in many developing countries and provides a candidate source of bioenergy. However, neither a complete reference genome nor large-scale full-length cDNA sequences are available for this outcrossing hexaploid crop, which impedes progress in I. batatas functional genomics and molecular breeding.

Methods: In this study, we sequenced full-length transcriptomes of I. batatas and its diploid ancestor I. trifida using single-molecule real-time sequencing and Illumina second-generation sequencing technologies. With the generated datasets, we conducted comprehensive intraspecific and interspecific sequence analyses and experimental characterization.

Results: A total of 53,861/51,184 high-quality long-read transcripts were obtained, which covered about 10,439/10,452 loci in the I. batatas/I. trifida genome. These datasets enabled us to predict open reading frames successfully in 96.83%/96.82% of transcripts and to identify 34,963/33,637 full-length cDNA sequences, 1,401/1,457 transcription factors, 25,315/27,090 simple sequence repeats, 1,656/1,389 long non-coding RNAs, and 5,251/8,901 alternative splicing events. Approximately 32.34%/38.54% of transcripts and 46.22%/51.18% of multi-exon transcripts underwent alternative splicing in I. batatas/I. trifida. Moreover, we validated one alternative splicing event in each of 10 genes and identified tuberous-root-specific isoforms of a starch-branching enzyme, an alpha-glucan phosphorylase, a neutral invertase, and several ABC transporters. Overall, the collection and analysis of large-scale long-read transcripts generated in this study will serve as a valuable resource for the I. batatas research community and may accelerate progress in its structural, functional, and comparative genomics studies.

Highlights

  • Sweet potato (Ipomoea batatas (L.) Lam.) is the seventh most important crop in the world and it ensures food supply and safety in many developing countries

  • cDNA libraries were prepared from the same samples that were used for SMRT sequencing, and deep RNA sequencing was conducted using an Illumina HiSeq 2500 platform

  • A total of 71,360,785 and 39,372,131 clean reads were obtained and used to correct the SMRT reads in I. batatas and I. trifida, respectively (Table 1)


Summary

Introduction

Sweet potato (Ipomoea batatas (L.) Lam.) is the seventh most important crop in the world and ensures food supply and safety in many developing countries. Large-scale collection and analysis of full-length cDNA sequences, which is fundamental to structural and functional genomics studies, has not been done in I. batatas. Neither a complete reference genome nor large-scale full-length cDNA sequences are available for this outcrossing hexaploid crop, which impedes progress in I. batatas functional genomics and molecular breeding. A total of 53,861/51,184 high-quality long-read transcripts were obtained, which covered about 10,439/10,452 loci in the I. batatas/I. trifida genome. These datasets enabled us to predict open reading frames successfully in 96.83%/96.82% of transcripts and to identify 34,963/33,637 full-length cDNA sequences, 1,401/1,457 transcription factors, 25,315/27,090 simple sequence repeats, 1,656/1,389 long non-coding RNAs, and 5,251/8,901 alternative splicing events. The collection and analysis of large-scale long-read transcripts generated in this study will serve as a valuable resource for the I. batatas research community and may accelerate progress in its structural, functional, and comparative genomics studies.
