Abstract

Enzyme-mediated chemical modifications to mRNA are important for fine-tuning gene expression, but they are challenging to quantify because of their low copy number and the limited tools available for accurate detection. The pseudouridine (ψ) modification is highly abundant in mRNAs but is difficult to detect and quantify because no antibody is available for it, it is mass silent, and it maintains canonical base pairing with adenine. The presence of another uridine modification, dihydrouridine (DHU), was recently demonstrated in mRNA through chemical labeling. We and others have recently shown that nanopores may be used to qualitatively identify uridine modifications in direct RNA sequencing by alignment to an unmodified transcriptome. In this work, we apply supervised machine learning models, trained on sequence-specific modified and unmodified synthetic controls, to endogenous transcriptome data to achieve site-specific quantification of uridine modifications. Our models reveal that each site studied requires different signal parameters to maximize the accuracy of ψ and DHU classification, suggesting that transcriptome-wide application would require additional standards. We show that applying our model is critical for quantification, particularly for low-abundance mRNAs. Our engine, used here to quantitatively profile ψ and DHU occupancy at specific human mRNA sites, can be implemented across cell types and cell states to determine the occupancy of uridine modifications, thus providing critical insights into the physiological relevance of ψ and DHU in mRNAs. With a larger library of appropriate molecular standards, our method has the potential to quantify all putative uridine modifications in the human transcriptome.
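
As a rough illustration of the workflow summarized above (train on modified and unmodified synthetic controls for one site, then classify endogenous reads to estimate occupancy), the following minimal sketch may help. It is not the model used in this work: a nearest-centroid classifier stands in for the supervised models, the two per-read features (mean current and dwell time) are assumed for illustration, and all values are invented toy numbers.

```python
from statistics import mean

def centroid(reads):
    """Column-wise mean of a list of equal-length feature vectors."""
    return [mean(column) for column in zip(*reads)]

def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def train(unmodified_reads, modified_reads):
    """Fit one centroid per class from the synthetic controls of a single site."""
    return {
        "unmodified": centroid(unmodified_reads),
        "modified": centroid(modified_reads),
    }

def classify(model, read):
    """Assign a read to the class whose centroid is closest."""
    return min(model, key=lambda label: squared_distance(model[label], read))

def occupancy(model, reads):
    """Fraction of reads at the site that are called modified."""
    calls = [classify(model, read) for read in reads]
    return calls.count("modified") / len(calls)

# Hypothetical per-read features: (mean current, dwell time). Toy values only.
unmodified_controls = [(100.0, 1.0), (102.0, 1.2), (98.0, 0.8)]
modified_controls = [(110.0, 2.0), (112.0, 2.2), (108.0, 1.8)]
endogenous = [(101.0, 1.1), (109.0, 1.9), (99.0, 0.9), (111.0, 2.1)]

model = train(unmodified_controls, modified_controls)
print(occupancy(model, endogenous))  # 0.5
```

Because each site is reported to need its own signal parameters, a sketch like this would be trained separately per site, each time with that site's own controls and feature set.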
