RNA molecules are not merely combinations of the four bases A, C, G, and U. Chemical modifications occur in almost all RNA species and play diverse roles in regulating gene expression. Abundant cellular RNAs, such as ribosomal RNA (rRNA) and transfer RNA (tRNA), have the highest known density of RNA modifications, which exert critical functions in rRNA and tRNA biogenesis, stability, and subsequent translation. In recent years, low-abundance RNA species in mammalian cells, such as messenger RNA (mRNA), regulatory noncoding RNA (ncRNA), and chromatin-associated RNA (caRNA), have been shown to contain multiple different chemical modifications with functional significance. As the most abundant mRNA modification in mammals, N6-methyladenosine (m6A) affects nearly every stage of mRNA processing and metabolism, and antibody-based m6A-MeRIP-seq (methylated RNA immunoprecipitation followed by high-throughput sequencing) is widely employed to map m6A distribution transcriptome-wide in diverse biological systems. In addition to m6A, other chemical modifications, such as pseudouridine (Ψ), 2'-O-methylation (Nm), 5-methylcytidine (m5C), internal N7-methylguanosine (m7G), N1-methyladenosine (m1A), and N4-acetylcytidine (ac4C), also exist in polyA-tailed RNA in mammalian cells, and effective mapping approaches are required for whole-transcriptome profiling of these non-m6A mRNA modifications. As with m6A, antibody-based enrichment followed by sequencing has been the primary method for studying the distributions of these modifications. Methods that map these modifications more quantitatively would dramatically improve our understanding of the distribution and density of these chemical marks on RNA, thereby better informing their functional implications.
In this Account, we summarize our recent advances in developing a series of chemistry- or biochemistry-based methods that map RNA modifications in mammalian mRNA, including m6A, Ψ, m5C, m1A, Nm, and internal m7G, at single-base resolution while quantifying the modification fraction at each site. These new methods, including m6A-SAC-seq, eTAM-seq, BID-seq, UBS-seq, DAMM-seq, m1A-quant-seq, Nm-Mut-seq, and m7G-quant-seq, promise base-resolution mapping of most major mRNA modifications from low RNA input and the detection of dynamic changes in modification stoichiometry during biological and physiological processes. They should facilitate future investigations of how these RNA modifications regulate cellular gene expression and of their potential as biomarkers for clinical diagnosis and prognosis. The same methods could also be used to sequence these modifications on diverse other RNA species, such as caRNA, ncRNA, nuclear nascent RNA, mitochondrial RNA, and cell-free RNA (cfRNA).