Abstract

The transcriptome plays an important role in the life of a cell, and its detailed analysis enables interpretation of its structure and function. High-throughput sequencing technology has significantly enhanced the understanding of transcriptome activity. RNA sequencing (RNA-seq) currently provides the most accurate estimation of gene expression levels. Moreover, RNA-seq allows detection of isoform structure and novel RNA types, along with details of the transcription process such as strand-specificity. The first chapter of this thesis describes the history of transcriptome exploration and effective methods of applying RNA-seq. Nevertheless, every step of the RNA-seq process can introduce biases that influence the results of an investigation. Some typical errors, such as those arising during ligation and amplification, may be present in any high-throughput sequencing experiment, while other biases occur only in cDNA synthesis or are specific to transcriptome activity. Quality control of sequencing data is therefore important for verifying and correcting analysis results. The second chapter of this thesis explains these issues and introduces a novel tool, Qualimap 2. This tool computes detailed statistics and presents a number of plots based on processing of RNA-seq alignment and counts data. The generated results enable detection of problems that are specific to RNA-seq experiments. Notably, the tool supports analysis of multiple samples under various conditions. Qualimap 2 was carefully compared to other available tools and demonstrated superior functionality in multi-sample quality control.
RNA-seq can also be applied in a relatively novel research area: detection of chimeric transcripts and fusion genes that arise from genomic rearrangements. Since fusions are related to cancer, their discovery is important not only for basic science but also opens the way to medical use of RNA-seq. The third chapter is devoted to the current status of this approach and presents a novel toolkit called InFusion, which introduces several new capabilities in chimera discovery from RNA-seq data, such as detection of fusions arising from the combination of a gene and an intronic or intergenic region. Moreover, the strand-specificity of expressed fusion transcripts can be detected and reported. InFusion was compared in detail to a number of existing tools on simulated and real datasets and demonstrated higher precision and recall. Overall, RNA-sequencing technology continues to advance, and more specialized analysis capabilities are becoming available. New applications of RNA sequencing and future directions of research are discussed in the last chapter.
