Abstract
RNA sequencing using next-generation sequencing (NGS) technologies is currently the standard approach for gene expression profiling, particularly for large-scale, high-throughput studies. NGS technologies comprise high-throughput, cost-efficient short-read RNA-Seq, while emerging single-molecule, long-read RNA-Seq technologies have enabled new approaches to studying the transcriptome and its function. Single-molecule, long-read technologies are currently commercially available from Pacific Biosciences (PacBio) and Oxford Nanopore Technologies (ONT), while new methodologies based on short-read sequencing are also being developed to provide long-range, single-molecule-level information, for example the 10x Genomics linked-read methodology. The shift toward long-read sequencing technologies for transcriptome characterization is driven by current increases in throughput and decreases in cost, which make these technologies attractive for de novo transcriptome assembly, isoform expression quantification, and in-depth analysis of RNA species. These types of analyses were challenging with standard short-read sequencing approaches because of the complex nature of the transcriptome, which consists of transcripts of variable length and multiple alternatively spliced isoforms for most genes, as well as the high sequence similarity of highly abundant RNA species, such as rRNAs. Here we focus on single-molecule sequencing technologies and single-cell technologies that, combined with perturbation tools, allow the analysis of complete RNA species, whether short or long, at high resolution. In parallel, these tools have opened new ways of understanding gene function at the tissue, network, and pathway levels, and of characterizing that function in detail.
Analysis of the epi-transcriptome, including RNA methylation and other modifications and the effects of such modifications on biological systems, is now enabled through direct RNA sequencing instead of classical indirect approaches. However, many difficulties and challenges remain. Methodologies are still needed to generate full-length RNA or cDNA libraries from all species of RNA, not only poly-A-containing transcripts; the identification of allele-specific transcripts is hampered by the current error rates of single-molecule technologies; and bioinformatics analysis of long-read data for accurate identification of 5′ and 3′ UTRs is still in development.
Highlights
RNA sequencing (RNA-Seq) using short-read sequencing technologies currently offered by Illumina or Thermo Fisher (Ion Torrent) represents the standard and widely used method for transcriptome profiling (Goodwin et al, 2016)
The long read lengths achieved with this technology, coupled with the Iso-Seq RNA sequencing protocol discussed below and the downstream data analysis pipelines developed by Pacific Biosciences (PacBio), provide a powerful approach to RNA analysis
It is expected that, for a PacBio read of a given length, the circular-consensus sequence (CCS) read will contain many more copies of the isoform sequence if it was produced from a short cDNA isoform than if it was produced from a long cDNA isoform
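The expectation in the last highlight follows from simple arithmetic: a read of fixed length passes over a short circular template more often than over a long one. The sketch below is only an illustration under stated assumptions and is not part of any PacBio software. The function name `approx_passes`, the optional `adapter_len` parameter, and the example lengths are hypothetical, and the model ignores real-world effects such as variable read lengths and partial first and last passes.

```python
def approx_passes(read_len: int, insert_len: int, adapter_len: int = 0) -> float:
    """Roughly estimate how many times one read traverses a circular template.

    Assumption (not from the source text): each pass covers the cDNA insert
    plus one adapter, so passes ~= read_len / (insert_len + adapter_len).
    """
    if read_len < 0 or insert_len <= 0 or adapter_len < 0:
        raise ValueError("read_len and adapter_len must be >= 0, insert_len > 0")
    return read_len / (insert_len + adapter_len)


# Hypothetical example: the same 30 kb read produced from two isoforms.
short_isoform = approx_passes(read_len=30_000, insert_len=1_000)
long_isoform = approx_passes(read_len=30_000, insert_len=6_000)

print(short_isoform)  # 30.0
print(long_isoform)   # 5.0
```

Under these assumptions the short isoform is covered six times more often than the long one within a read of the same length, which is the relationship the highlight describes.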
Summary
RNA sequencing (RNA-Seq) using short-read sequencing technologies currently offered by Illumina or Thermo Fisher (Ion Torrent) represents the standard and widely used method for transcriptome profiling (Goodwin et al, 2016). Another sequencing technology from MGI (DNBSEQ), which is based on the formation of DNA nanoballs (Huang et al, 2017), has been used for RNA-Seq studies and has shown performance comparable to the Illumina platform in terms of gene expression quantification and technical variability (Jeon et al, 2019; Natarajan et al, 2019). Pioneered by the PacBio “Iso-Seq” method, this approach mainly involves characterizing the different isoform models by sequencing groups of cDNA reads after fractionating them by length (Au et al, 2013). We will present the library preparation methods that exploit these properties to enrich for the different categories.