Abstract

The complexity of the tomato (Solanum lycopersicum) transcriptome has not yet been fully elucidated. To gain insights into the diversity and features of coding and non-coding RNA molecules of tomato fruits, we generated strand-specific libraries from berries of two tomato cultivars grown in two open-field conditions with different soil types. Following high-throughput Illumina RNA-sequencing (RNA-seq), more than 90% of the reads (over one billion, derived from twelve datasets) were aligned to the tomato reference genome. We report a comprehensive analysis of the transcriptome, improved with 39,095 transcripts, which reveals previously unannotated transcripts, natural antisense transcripts, long non-coding RNAs and alternative splicing variants. In addition, we investigated the sequence variants between the two cultivars to highlight their genetic differences. Our strand-specific analysis allowed us to expand the current tomato transcriptome annotation, and it is the first to reveal the complexity of the poly-adenylated RNA world in tomato. Moreover, our work demonstrates the usefulness of the strand-specific RNA-seq approach for transcriptome-based genome annotation and provides a valuable resource for further functional studies.

Highlights

  • The power and speed of Next Generation Sequencing (NGS) make it possible to generate high-quality genome sequences [1, 2], compare genomes across multiple samples [3, 4], map structural variations [5,6,7] and identify polymorphisms [8, 9]

  • Next Generation Sequencing (NGS) technologies are having an important impact on genomic research because they can be used to address questions that were unapproachable with earlier tools

  • Starting from a genome-guided assembly, we report a comprehensive analysis of the tomato transcriptome, improved with a collection of 39,095 transcripts that include splicing variants, transcripts overlapping annotated loci, natural antisense transcripts and transcripts completely absent from the official tomato annotation

Summary

Introduction

The power and speed of NGS make it possible to generate high-quality genome sequences [1, 2], compare genomes across multiple samples [3, 4], map structural variations [5,6,7] and identify polymorphisms [8, 9]. High-throughput sequencing-based approaches to the RNA world are used to assemble, improve or extend the transcriptome of an organism, by either reference-based or de novo strategies [12,13,14,15,16]. They allow a comprehensive discovery of novel genes and transcripts at …

PLOS ONE | DOI:10.1371/journal.pone.0171504 | February 10, 2017

