Abstract

Background: Identifying the transcription start sites (TSS) of genes is essential for characterizing promoter regions. Several protocols have been developed to capture the 5′ end of transcripts via Cap Analysis of Gene Expression (CAGE) or linker-ligation strategies such as Paired-End Analysis of Transcription Start Sites (PEAT), but often require large amounts of tissue. More recently, nanoCAGE was developed for sequencing on the Illumina GAIIx to overcome these difficulties.

Results: Here we present the first publicly available adaptation of nanoCAGE for sequencing on recent ultra-high throughput platforms such as Illumina HiSeq-2000, and CapFilter, a computational pipeline that greatly increases confidence in TSS identification. We report excellent gene coverage, reproducibility, and precision in transcription start site discovery for samples from Arabidopsis thaliana roots.

Conclusion: nanoCAGE-XL together with CapFilter allows for genome-wide identification of high confidence transcription start sites in large eukaryotic genomes.

Electronic supplementary material: The online version of this article (doi:10.1186/s12864-015-1670-6) contains supplementary material, which is available to authorized users.

Highlights

  • Identifying the transcription start sites (TSS) of genes is essential for characterizing promoter regions

  • Linker sequence and template input in nanoCAGE introduce important trade-offs in sequencing outcome. To assess the viability of the nanoCAGE protocol with different sequencing strategies, we prepared a total of 10 libraries over three separate experiments using both single and barcoded library formats, with or without a linker – a short six-nucleotide sequence introduced in the template switching (TS) oligo to normalize barcode biases in library preparations [9] – for sequencing on the HiSeq-2000 platform (Table 1)

  • Although replicates are essential for the assessment of method repeatability, our goal with this first experiment was to determine a baseline outcome from the maximal profiling of an individual Arabidopsis library


Summary

Introduction

Identifying the transcription start sites (TSS) of genes is essential for characterizing promoter regions. Several protocols have been developed to capture the 5′ end of transcripts via Cap Analysis of Gene Expression (CAGE) or linker-ligation strategies such as Paired-End Analysis of Transcription Start Sites (PEAT), but these often require large amounts of tissue. The CAGE method [5] actively “traps” the 5′ N7-methylguanosine triphosphate (7mG-p-p-p-N) modification common to all Pol II-generated transcripts, known as the “cap”, with streptavidin beads. Both PEAT and CAGE have been widely employed in animal studies [4, 6, 7], and recently we have successfully applied the PEAT strategy to plant tissues [3]. The nanoCAGE protocol aims to reduce the required amount of total RNA from the 50 to 150 μg necessary with PEAT and CAGE to the level of nanograms by using a combination of template switching and semi-suppressive PCR, and has been reported to have a level of sensitivity 1000 times higher than that of CAGE [8, 9].

