Abstract

RNA editing by adenosine deaminases changes the information encoded in the mRNA from its genomic blueprint. Editing of protein-coding sequences can introduce novel, functionally distinct protein isoforms and diversify the proteome. The functional importance of a few recoding sites has been appreciated for decades. However, systematic methods to uncover these sites perform poorly, and the full repertoire of recoding in humans and other mammals is unknown. Here we present a new detection approach and analyze 9125 GTEx RNA-seq samples to produce a highly accurate atlas of 1517 editing sites within the coding region and their editing levels across human tissues. Single-cell RNA-seq data show that protein recoding contributes to the variability across cell subpopulations. Most highly edited sites are evolutionarily conserved in non-primate mammals, attesting to adaptation. This comprehensive set can facilitate understanding of the role of recoding in human physiology and diseases.

Highlights

  • Genomic variability between the reference genome and the sampled individuals translates into mismatches between the reference genome and the RNA sequenced from these individuals

  • We applied strict alignment procedures, discarding mismatches that are likely to be explained by systematic alignment or sequencing errors (Supplementary Fig. 2), and a statistical model that integrates the cumulative profile of mismatches found for all donors of a given tissue type into a single score
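
The statistical model itself is not described on this page. As a rough illustration of pooling mismatch evidence across donors into one score per site, the sketch below sums read and mismatch counts over all donors of a tissue and scores the site by the binomial tail probability of seeing that many mismatches from sequencing error alone. The binomial null, the `error_rate` value, and all function names are assumptions made for illustration, not the authors' method.

```python
import math

def log_binom_pmf(k, n, p):
    """Natural log of P(X = k) for X ~ Binomial(n, p), with 0 < p < 1."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log1p(-p))

def binom_log_tail(k, n, p):
    """Natural log of P(X >= k), summed in log space to avoid underflow."""
    if k <= 0:
        return 0.0  # P(X >= 0) = 1
    logs = [log_binom_pmf(i, n, p) for i in range(k, n + 1)]
    m = max(logs)
    return m + math.log(sum(math.exp(v - m) for v in logs))

def pooled_site_score(donor_counts, error_rate=0.001):
    """Score one candidate site from per-donor (total_reads, mismatch_reads).

    Hypothetical scoring: -log10 of the pooled binomial tail probability
    under an assumed per-base sequencing error rate.
    """
    total_reads = sum(n for n, _ in donor_counts)
    total_mismatches = sum(k for _, k in donor_counts)
    if total_reads == 0:
        return 0.0
    log_p = binom_log_tail(total_mismatches, total_reads, error_rate)
    return max(0.0, -log_p / math.log(10))

# Example: three donors of one tissue, (total reads, A-to-G mismatch reads).
edited_site = [(100, 20), (150, 30), (80, 10)]
noisy_site = [(100, 0), (150, 1), (80, 0)]
print(pooled_site_score(edited_site), pooled_site_score(noisy_site))
```

Pooling raw counts lets donors with low coverage still contribute evidence, at the cost of ignoring donor-to-donor differences in editing level; a real model would likely need to handle genomic variants (which appear in only some donors) separately.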

Summary

Introduction

RNA editing by adenosine deaminases changes the information encoded in the mRNA from its genomic blueprint. We present a new detection approach and analyze 9125 GTEx RNA-seq samples to produce a highly accurate atlas of 1517 editing sites within the coding region and their editing levels across human tissues. Most highly edited sites are evolutionarily conserved in non-primate mammals, attesting to adaptation. This comprehensive set can facilitate understanding of the role of recoding in human physiology and diseases. Recoding activity is dwarfed by A-to-I editing events at millions of sites within non-coding regions[5], and is much more difficult to detect (Fig. 1a). Systematic analyses of mismatches yielded high-specificity identification of sites in human[5,11–17], virtually all of them within Alu repeats, but performed poorly in coding regions. Systematic misalignment of RNA-seq reads to the wrong (but homologous) genomic locus could lead to an apparently consistent mismatch that is misidentified as an editing event[21–23].
