Workflow for Genome-Wide Determination of Pre-mRNA Splicing Efficiency from Yeast RNA-seq Data.

Martin Převorovský, Petr Folk, Kateřina Abrhámová, Jiří Libus, Martina Hálová

doi:10.1155/2016/4783841

Open Access

https://doi.org/10.1155/2016/4783841

Journal: BioMed Research International
Publication Date: Jan 1, 2016
Citations: 12
License type: CC BY 4.0

Affiliation: Charles University

Abstract

Pre-mRNA splicing represents an important regulatory layer of eukaryotic gene expression. In the simple budding yeast Saccharomyces cerevisiae, about one-third of all mRNA molecules undergo splicing, and splicing efficiency is tightly regulated, for example, during meiotic differentiation. S. cerevisiae features a streamlined, evolutionarily highly conserved splicing machinery and serves as a favourite model for studies of various aspects of splicing. RNA-seq represents a robust, versatile, and affordable technique for transcriptome interrogation, which can also be used to study splicing efficiency. However, convenient bioinformatics tools for the analysis of splicing efficiency from yeast RNA-seq data are lacking. We present a complete workflow for the calculation of genome-wide splicing efficiency in S. cerevisiae using strand-specific RNA-seq data. Our pipeline takes sequencing reads in the FASTQ format and provides splicing efficiency values for the 5′ and 3′ splice junctions of each intron. The pipeline is based on up-to-date open-source software tools and requires very limited input from the user. We provide all relevant scripts in a ready-to-use form. We demonstrate the functionality of the workflow using RNA-seq datasets from three spliceosome mutants. The workflow should prove useful for studies of yeast splicing mutants or of regulated splicing, for example, under specific growth conditions.
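To make the per-junction output described in the abstract more concrete, the following Python sketch shows one plausible way to turn read counts into efficiency values for the 5′ and 3′ splice junctions of an intron. The exact definition used by the workflow is not given in this section, so the formula (spliced junction-spanning reads divided by reads covering the unspliced splice site), the function names, and the example counts are all illustrative assumptions.

```python
from typing import Dict, Optional


def splicing_efficiency(spliced_reads: int, unspliced_reads: int) -> Optional[float]:
    """Assumed metric: spliced junction reads / reads covering the unspliced site.

    Returns None when no unspliced reads were observed, because the ratio
    is undefined in that case.
    """
    if spliced_reads < 0 or unspliced_reads < 0:
        raise ValueError("read counts must be non-negative")
    if unspliced_reads == 0:
        return None
    return spliced_reads / unspliced_reads


def intron_efficiencies(
    spliced_reads: int, unspliced_5ss: int, unspliced_3ss: int
) -> Dict[str, Optional[float]]:
    """Report one value per splice junction of a single intron (hypothetical helper)."""
    return {
        "5ss": splicing_efficiency(spliced_reads, unspliced_5ss),
        "3ss": splicing_efficiency(spliced_reads, unspliced_3ss),
    }


# Hypothetical counts for one intron: 80 spliced reads,
# 20 reads over the 5' splice site, 40 reads over the 3' splice site.
print(intron_efficiencies(80, 20, 40))
```

Under this assumed definition, a mutant with impaired splicing would be expected to show lower values than its wild-type control at the affected junctions.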

Highlights

In eukaryotes, coding parts of genes, the exons, are interrupted by noncoding parts, the introns.
Sequencing reads from strand-specific transcriptome profiling of splicing mutants (prp45(1-169), prp4-1, and prp40-1) and their corresponding wild-type S. cerevisiae strains [17, 24] were downloaded from the European Nucleotide Archive in FASTQ format, and experiment metadata were obtained from ArrayExpress.
After input quality control (FastQC), reads are mapped to the S. cerevisiae reference genome (HISAT2, [29]) and filtered, keeping only uniquely mapped reads.
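The filtering step in the last highlight can be sketched in Python as follows. This is a minimal illustration and not the authors' script: it assumes that alignments are available as SAM text lines and that a uniquely mapped read is one that is not flagged as unmapped and carries the optional tag `NH:i:1` (number of reported alignments equal to one), which HISAT2 writes to its SAM output. The function names are hypothetical.

```python
from typing import Iterable, Iterator


def is_uniquely_mapped(sam_line: str) -> bool:
    """Return True for a mapped alignment record tagged NH:i:1 (assumed criterion)."""
    fields = sam_line.rstrip("\n").split("\t")
    if len(fields) < 11:  # a SAM record has 11 mandatory fields
        return False
    flag = int(fields[1])
    if flag & 0x4:  # bit 0x4 marks an unmapped read
        return False
    # Optional tags start at the 12th field; require an exact NH:i:1 tag.
    return "NH:i:1" in fields[11:]


def filter_unique(lines: Iterable[str]) -> Iterator[str]:
    """Pass header lines through and keep only uniquely mapped alignments."""
    for line in lines:
        if line.startswith("@") or is_uniquely_mapped(line):
            yield line


# Hypothetical records: one unique, one multi-mapped, one unmapped.
example = [
    "@HD\tVN:1.0\n",
    "r1\t0\tchrI\t100\t60\t4M\t*\t0\t0\tACGT\tIIII\tNH:i:1\n",
    "r2\t0\tchrI\t200\t1\t4M\t*\t0\t0\tACGT\tIIII\tNH:i:2\n",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII\tYT:Z:UU\n",
]
kept = list(filter_unique(example))
print(len(kept))
```

In practice the same selection is usually done on BAM files with dedicated tools; the plain-text version here is only meant to show the logic of the criterion.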

Summary

Introduction

In eukaryotes, coding parts of genes, the exons, are interrupted by noncoding parts, the introns. The process through which introns are removed and exons are joined together is called splicing. It occurs via two consecutive transesterification reactions catalysed by the spliceosome, a large dynamic ribonucleoprotein complex composed of five snRNP particles (U1, U2, U4/U6, and U5) and other associated protein complexes, such as the Nineteen Complex (NTC in yeast; CDC5L in mammals) (reviewed in [1]). Additional sequences are needed for recruiting various trans-acting regulatory factors, which modulate the binding of spliceosome subunits as well as splice site choice and efficiency, and thereby determine the splicing outcome. This is especially important for alternative splicing (reviewed in [4]).

Methods

Results

Conclusion