Abstract
Accurately quantifying gene and isoform expression changes is essential to understanding cell functions, differentiation and disease. Sequencing full-length native RNAs using long-read direct RNA sequencing (DRS) has the potential to overcome many limitations of short- and long-read sequencing methods that require RNA fragmentation, cDNA synthesis or PCR. However, there is a lack of tools specifically designed for DRS, and its ability to identify differential expression in complex organisms is poorly characterised. We developed NanoCount for fast, accurate transcript isoform quantification in DRS and demonstrate that it outperforms similar methods. Using synthetic controls and human SH-SY5Y cell differentiation into neuron-like cells, we show that DRS accurately quantifies RNA expression and identifies differential expression of genes and isoforms. Differential expression of 231 genes and 333 isoforms, plus 27 isoform switches, was detected between undifferentiated and differentiated SH-SY5Y cells, and samples clustered by differentiation state at the gene and isoform level. Genes upregulated in neuron-like cells were associated with neurogenesis. NanoCount quantification of thousands of novel isoforms discovered with DRS likewise enabled identification of their differential expression. Our results demonstrate enhanced DRS isoform quantification with NanoCount and establish the ability of DRS to identify biologically relevant differential expression of genes and isoforms.
Highlights
Cellular fates and functions are underpinned by the expression of protein-coding and non-coding genes into RNA
Sequin RNAs vary in abundance over >4 orders of magnitude and come in two mixes; each mix contains the same synthetic RNA isoforms, but their concentrations are offset by known amounts
We developed NanoCount for the accurate quantification of transcript isoforms and performed an in-depth analysis of direct RNA sequencing (DRS) using synthetic spike-in RNAs and human SH-SY5Y (5Y) neuroblastoma cells
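Because the two sequin mixes contain the same isoforms at known, offset concentrations, the expected fold change of each isoform between mixes is known in advance and can serve as ground truth for measured values. The following is a minimal Python sketch of that idea only; the isoform names, the concentrations and the helper `expected_log2_fold_change` are hypothetical and are not taken from the study or from the actual sequin mixes.

```python
import math

# Hypothetical concentrations (arbitrary units), for illustration only.
# These are NOT values from the study or from the real sequin mixes.
mix_a = {"iso_1": 1.0, "iso_2": 8.0, "iso_3": 100.0}
mix_b = {"iso_1": 4.0, "iso_2": 8.0, "iso_3": 25.0}


def expected_log2_fold_change(conc_a, conc_b):
    """Return log2(B / A) for every isoform present in both mixes."""
    return {
        name: math.log2(conc_b[name] / conc_a[name])
        for name in conc_a
        if name in conc_b
    }


expected = expected_log2_fold_change(mix_a, mix_b)
for name, lfc in expected.items():
    print(f"{name}: expected log2 fold change = {lfc:+.2f}")
```

Under these assumed values, an isoform four times more concentrated in the second mix has an expected log2 fold change of +2, and one at equal concentration has 0. Measured fold changes from a quantification tool could then be compared against such expectations.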
Summary
Cellular fates and functions are underpinned by the expression of protein-coding and non-coding genes into RNA (termed the transcriptome). The expression profiles of individual genes can vary in complex ways to regulate their functional outputs. Expression of genes can be switched on or off, increased or decreased, while the RNA products (transcript isoforms) made from individual genes can vary extensively. In humans, >90% of protein-coding genes express multiple RNA isoforms via processes such as alternative transcriptional start sites, termination sites and splicing, greatly increasing the diversity of the transcriptome and proteome within cells [1,2]. Expression of different genes and isoforms drives cellular differentiation programs, controls cell and tissue functions and allows cells to respond to their environment [3,4]. Aberrant expression contributes to various diseases including neurological disorders, autoimmune disorders and cancer [5,6,7].