Abstract

In transcriptome analysis, accurate annotation of each transcriptional unit and its expression profile is essential. A full-length cDNA (FL-cDNA) collection facilitates the refinement of transcriptional annotation, and accurate transcription start sites help to unravel transcriptional regulation. We constructed a normalized FL-cDNA library from eight growth stages of aerial tissues in Sorghum bicolor and isolated 37,607 clones. These clones were Sanger sequenced from the 5′ and/or 3′ ends, yielding a total of 38,981 high-quality expressed sequence tags (ESTs). About one-third of the transcripts of known genes were captured as FL-cDNA clone resources. We also annotated 272 novel genes, 323 antisense transcripts and 1,672 candidate isoforms. These clones are available from the RIKEN Bioresource Center. After obtaining accurate annotation of transcriptional units, we performed expression profile analysis. We carried out spikelet-, seed- and stem-specific RNA sequencing (RNA-Seq) analysis and confirmed the expression of 70.6% of the newly identified genes. We also downloaded 23 publicly available sorghum RNA-Seq samples, which are shown on a genome browser together with our original FL-cDNA and RNA-Seq data. Using our original and publicly available data, we made an expression profile of each gene and identified the top 20 genes with the most similar expression. In addition, we visualized their relationships in gene co-expression networks. Users can access and compare various transcriptome data from S. bicolor at http://sorghum.riken.jp.
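The step of identifying the top 20 genes with the most similar expression can be illustrated with a minimal sketch. The text does not state which similarity measure was used, so Pearson correlation across samples is an assumption here. The function name `top_coexpressed`, the gene identifiers and the expression values are all hypothetical and serve only as a toy illustration.

```python
import numpy as np

def top_coexpressed(expr, gene_ids, query, k=20):
    """Return the k genes whose expression profiles correlate best with `query`.

    expr     : 2-D array-like, rows = genes, columns = samples
    gene_ids : list of gene identifiers, one per row of `expr`
    Similarity is Pearson correlation (an assumption, not the source's stated method).
    """
    expr = np.asarray(expr, dtype=float)
    corr = np.corrcoef(expr)          # gene-by-gene correlation matrix
    q = gene_ids.index(query)
    scores = corr[q].copy()
    scores[q] = -np.inf               # exclude the query gene itself
    order = np.argsort(-scores)[:k]   # indices sorted by decreasing correlation
    return [(gene_ids[i], float(scores[i])) for i in order]

# Toy data: 4 hypothetical genes measured in 4 hypothetical samples.
gene_ids = ["geneA", "geneB", "geneC", "geneD"]
expr = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.5],
    [4.0, 3.0, 2.0, 1.0],
    [1.0, 3.0, 2.0, 4.0],
]
result = top_coexpressed(expr, gene_ids, "geneA", k=2)
```

With real data, `k=20` would mirror the list size mentioned in the abstract. Genes with constant expression across samples have undefined correlation and would need to be filtered out beforehand.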

Highlights

  • Sorghum is a highly productive crop, grown for forage, feedstock, fiber and biofuel

  • We constructed a normalized full-length cDNA (FL-cDNA) library of S. bicolor (L.) Moench from eight growth stages including anthesis and seed set (Table 1), and obtained 38,981 high-quality Sanger sequence reads after quality control

  • Users can access two main types of transcriptome data: accurate transcription start site (TSS) and structural gene annotation based on approximately 40,000 FL-cDNAs, and expression profiles from RNA sequencing (RNA-Seq) analysis


Summary

Introduction

Sorghum is a highly productive crop, grown for forage, feedstock, fiber and biofuel. It ranks fifth in global cereal production and shows strong tolerance to environmental stresses such as drought, heat, salinity and flooding (Belton et al. 2004). Identifying the genes relevant to this stress tolerance and to biomass synthesis contributes to improving sorghum traits by genome-guided breeding and facilitates strengthening other crops against various environmental stresses. In 2009, the Sorghum bicolor BTx623 genome sequence was determined as a model for the Saccharinae and other C4 grasses (Paterson et al. 2009). Zea mays is the closest relative whose genome sequence has been completely determined (Schnable et al. 2009), and Oryza sativa is a closely related and well-studied species in the same grass family (Sakai et al. 2013). We focused on collecting large-scale experimentally validated data sets of transcriptional units, transcription start sites (TSSs) and expression profiles.
