Epitranscriptomic modifications, particularly N6-methyladenosine (m6A), are crucial regulators of gene expression, influencing processes such as RNA stability, splicing, and translation. Traditional computational methods for detecting m6A from Nanopore direct RNA sequencing (DRS) data are constrained by their reliance on experimentally validated labels, which often leads them to underestimate the number of modification sites. Here, we introduce pum6a, an attention-based framework that integrates positive and unlabeled multi-instance learning (MIL) to address the challenges of incomplete labeling and missing read-level annotations. By combining electrical signal features with base alignment data and employing a weighted Noisy-OR probability mechanism, pum6a achieves enhanced sensitivity and accuracy in m6A detection, particularly at low-coverage loci. Pum6a outperforms existing methods in identifying m6A sites across various cell lines and species, without requiring extensive parameter tuning. We further apply pum6a to study the dynamic regulation of m6A demethylases in gastric cancer under hypoxia, revealing distinct roles for FTO and ALKBH5 in modulating m6A modifications and providing insight into m6A-mediated transcript stability. Our findings highlight the potential of pum6a as a powerful tool for advancing the understanding of epitranscriptomic regulation in health and disease, paving the way for biotechnological and therapeutic applications.
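To illustrate the idea behind Noisy-OR aggregation in multi-instance learning, the following is a minimal sketch of how per-read modification probabilities might be pooled into a single site-level probability. The abstract does not state pum6a's exact formula, so the weighting used here is an assumption: attention scores are passed through a softmax and used as exponents, P(site) = 1 - prod_i (1 - p_i)^(n * w_i), scaled by the read count n so that uniform attention recovers the standard Noisy-OR. The function names `softmax` and `weighted_noisy_or` are hypothetical and are not taken from the pum6a code base.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over raw attention scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_noisy_or(read_probs: Sequence[float],
                      attention_scores: Sequence[float]) -> float:
    """Pool per-read probabilities into one site-level probability.

    Assumed form (illustrative, not pum6a's published formula):
        P(site) = 1 - prod_i (1 - p_i) ** (n * w_i)
    where w = softmax(attention_scores) and n is the number of reads.
    """
    if len(read_probs) == 0 or len(read_probs) != len(attention_scores):
        raise ValueError("need one attention score per read, and at least one read")

    n = len(read_probs)
    weights = softmax(attention_scores)

    # Accumulate in log space so many reads do not underflow the product.
    log_all_unmodified = 0.0
    for p, w in zip(read_probs, weights):
        p = min(max(p, 0.0), 1.0 - 1e-12)  # keep log1p(-p) finite
        log_all_unmodified += n * w * math.log1p(-p)

    # 1 - exp(x), computed accurately for x close to zero.
    return -math.expm1(log_all_unmodified)


# Two reads with equal attention: identical to the plain Noisy-OR.
site_prob = weighted_noisy_or([0.5, 0.5], [0.0, 0.0])
```

Under this assumed form, a site is called modified if at least one read is likely modified, and reads with higher attention contribute more to that decision, which is one way such a scheme could remain informative at low coverage.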