Abstract

In the FANTOM5 project, transcription initiation events across the human and mouse genomes were mapped at single base-pair resolution, and their frequencies were monitored by CAGE (Cap Analysis of Gene Expression) coupled with single-molecule sequencing. Approximately three thousand samples, consisting of a variety of primary cells, tissues, cell lines, and time series samples taken during cell activation and development, were subjected to a uniform pipeline of CAGE data production. The pipeline started by measuring RNA extracts to assess their quality, and continued with CAGE library production using a robotic or a manual workflow, single-molecule sequencing, and computational processing to generate frequencies of transcription initiation. The resulting data represent the consequence of transcriptional regulation in each analyzed state of mammalian cells. Non-overlapping peaks over the CAGE profiles, approximately 200,000 and 150,000 peaks for the human and mouse genomes respectively, were identified and annotated to provide the precise locations of known promoters as well as novel ones, and to quantify their activities.

Highlights

  • Since the completion of human genome sequencing, the role of individual bases has been a central question

  • We focused on transitions of cell states by monitoring ‘time course’ samples, such as activation, differentiation, and development at sequential time points[15]

  • Based on the frequencies of the observed 5′ ends of individual capped RNA molecules at single base-pair resolution, we identified 201,802 and 158,966 peaks for human and mouse respectively, where promoters are defined as the sequence immediately upstream of the peaks and the frequencies of observed CAGE reads reflect the activities of the promoters
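
The counting step behind the last highlight, tallying the 5′ ends of aligned reads at each base and strand, can be illustrated with a minimal sketch. This is not the FANTOM5 pipeline itself: the input layout (0-based, half-open `(chrom, start, end, strand)` tuples), the function names, and the per-million scaling are all assumptions made for illustration.

```python
from collections import Counter


def count_five_prime_ends(alignments):
    """Count aligned reads by the genomic position of their 5' end.

    `alignments` is an iterable of (chrom, start, end, strand) tuples in
    0-based, half-open coordinates (an assumed, BED-like layout).
    """
    counts = Counter()
    for chrom, start, end, strand in alignments:
        # Plus-strand reads begin at the leftmost aligned base;
        # minus-strand reads begin at the rightmost aligned base.
        pos = start if strand == "+" else end - 1
        counts[(chrom, pos, strand)] += 1
    return counts


def to_relative_frequency(counts, per=1_000_000):
    """Scale raw counts to counts per `per` reads (assumed normalization)."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {key: n * per / total for key, n in counts.items()}


# Hypothetical toy alignments, not real data.
reads = [
    ("chr1", 100, 127, "+"),
    ("chr1", 100, 130, "+"),
    ("chr1", 101, 126, "+"),
    ("chr1", 80, 105, "-"),
]
ctss = count_five_prime_ends(reads)
scaled = to_relative_frequency(ctss)
```

Keeping the strand in the key matters because initiation events on opposite strands at the same coordinate belong to different promoters. Peak identification over these per-base counts is a separate step that this sketch does not attempt.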


Background & Summary

Since the completion of human genome sequencing, the role of individual bases has been a central question. In the course of the two phases focused on ‘snapshot’ and ‘time course’ samples, we profiled 1,816 human and 1,016 mouse samples in total, and obtained on average approximately four million single-molecule reads per sample that aligned successfully to the genome.
