Abstract

Background

Measuring allele-specific RNA expression provides valuable insights into cis-acting genetic and epigenetic regulation of gene expression. Widespread adoption of high-throughput sequencing technologies for studying RNA expression (RNA-Seq) permits measurement of allelic RNA expression imbalance (AEI) at heterozygous single nucleotide polymorphisms (SNPs) across the entire transcriptome, and this approach has become especially popular with the emergence of large databases, such as GTEx. However, the existing binomial-type methods used to model allelic expression from RNA-seq assume a strong negative correlation between reference and variant allele reads, which may not be reasonable biologically.

Results

Here we propose a new strategy for AEI analysis using RNA-seq data. Under the null hypothesis of no AEI, a group of SNPs (possibly across multiple genes) is considered comparable if their respective total sums of the allelic reads are of similar magnitude. Within each group of “comparable” SNPs, we identify SNPs with AEI signal by fitting a mixture of folded Skellam distributions to the absolute values of read differences. By applying this methodology to RNA-Seq data from human autopsy brain tissues, we identified numerous instances of moderate to strong imbalanced allelic RNA expression at heterozygous SNPs. Findings with SLC1A3 mRNA exhibiting known expression differences are discussed as examples.

Conclusion

The folded Skellam mixture model searches for SNPs with significant difference between reference and variant allele reads (adjusted for different library sizes), using information from a group of “comparable” SNPs across multiple genes. This model is particularly suitable for performing AEI analysis on genes with few heterozygous SNPs available from RNA-seq, and it can fit over-dispersed read counts without specifying the direction of the correlation between reference and variant alleles.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1749-0) contains supplementary material, which is available to authorized users.
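
To make the central distribution concrete, the following is a minimal sketch of a single folded Skellam probability mass function, not the authors' implementation. It assumes the textbook construction: reference and variant read counts R and V are treated as independent Poisson variables with means `mu1` and `mu2`, R − V is then Skellam distributed, and |R − V| is its folded version. The function names, the `tail` truncation parameter, and the use of a direct Poisson convolution (rather than the Bessel-function form) are illustrative choices. The mixture fitting and the library-size adjustment mentioned in the abstract are not shown.

```python
import math


def poisson_logpmf(k, mu):
    """Log of the Poisson pmf at integer k >= 0, for mu > 0."""
    return k * math.log(mu) - mu - math.lgamma(k + 1)


def skellam_pmf(k, mu1, mu2, tail=200):
    """P(R - V = k) for independent R ~ Poisson(mu1), V ~ Poisson(mu2).

    Computed by summing P(R = v + k) * P(V = v) over v. The sum is
    truncated after roughly mu1 + mu2 + tail terms (an assumption that
    is adequate for moderate means, not a general-purpose bound).
    """
    start = max(0, -k)  # need both v >= 0 and v + k >= 0
    stop = start + int(mu1 + mu2) + tail
    total = 0.0
    for v in range(start, stop):
        total += math.exp(poisson_logpmf(v + k, mu1) + poisson_logpmf(v, mu2))
    return total


def folded_skellam_pmf(d, mu1, mu2):
    """P(|R - V| = d): fold the Skellam pmf at zero."""
    if d < 0:
        return 0.0
    if d == 0:
        return skellam_pmf(0, mu1, mu2)
    return skellam_pmf(d, mu1, mu2) + skellam_pmf(-d, mu1, mu2)


if __name__ == "__main__":
    # With equal means (one way to express "no imbalance"), mass concentrates near 0.
    for d in range(4):
        print(d, folded_skellam_pmf(d, 10.0, 10.0))
```

Because folding discards the sign of the difference, `folded_skellam_pmf(d, mu1, mu2)` and `folded_skellam_pmf(d, mu2, mu1)` agree, which reflects the abstract's point that the direction of the relationship between reference and variant alleles need not be specified.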

Highlights

  • Measuring allele-specific RNA expression provides valuable insights into cis-acting genetic and epigenetic regulation of gene expression

  • To present the potential of decomposing signals from RNA-seq data using the mixture model pipeline, we consider the dataset described above in which we focus only on pairs of counts with at least 3 reads for the allele with lower expression (min(R,V) ≥ 3) and exclude intergenic single nucleotide polymorphisms (SNPs)

  • The method is useful when scanning for allelic RNA expression imbalance (AEI) signals in RNA-seq datasets that have a large number of genes with a small number of heterozygous SNPs (
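
The filtering rule in the second highlight (min(R,V) ≥ 3, intergenic SNPs excluded) can be sketched as follows. This is an illustrative sketch only: the `SnpCounts` record, its field names, and the example SNP identifiers are hypothetical and do not come from the paper. For each retained SNP it reports the absolute read difference |R − V| and the total R + V, the two quantities the abstract refers to when modelling read differences and when judging whether SNPs are "comparable". No library-size adjustment or grouping is attempted here.

```python
from typing import List, NamedTuple, Tuple


class SnpCounts(NamedTuple):
    """Hypothetical per-SNP record; field names are assumptions."""
    snp_id: str
    ref_reads: int    # R: reads supporting the reference allele
    var_reads: int    # V: reads supporting the variant allele
    intergenic: bool  # True if the SNP lies outside annotated genes


def keep_snp(snp: SnpCounts, min_minor_reads: int = 3) -> bool:
    """Keep genic SNPs whose lower-expressed allele has enough reads."""
    if snp.intergenic:
        return False
    return min(snp.ref_reads, snp.var_reads) >= min_minor_reads


def summarize(snps: List[SnpCounts]) -> List[Tuple[str, int, int]]:
    """Return (snp_id, |R - V|, R + V) for every SNP that passes the filter."""
    return [
        (s.snp_id, abs(s.ref_reads - s.var_reads), s.ref_reads + s.var_reads)
        for s in snps
        if keep_snp(s)
    ]


if __name__ == "__main__":
    example = [
        SnpCounts("snp_a", 40, 12, False),  # kept
        SnpCounts("snp_b", 25, 2, False),   # dropped: min(R, V) < 3
        SnpCounts("snp_c", 30, 28, True),   # dropped: intergenic
    ]
    print(summarize(example))
```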


Summary

Introduction

Measuring allele-specific RNA expression provides valuable insights into cis-acting genetic and epigenetic regulation of gene expression. High-throughput DNA sequencing technology, when used for measuring RNA expression (RNA-Seq), provides nucleotide-level resolution of gene expression across the entire transcriptome in a single experiment. This enhanced resolution provides a wealth of detail about gene expression not available through microarray-based technologies. Genomic regions subject to epigenetic programming, such as imprinting, which typically results in large (>10-fold) allelic RNA expression imbalance (AEI) because of near-complete silencing of one allele, have been identified from RNA-Seq studies of allelic RNA expression in combination with gDNA genotyping [7, 8]. RNA-Seq data yield allelic ratios with relatively high noise, so rigorous statistical methods are needed to identify a signature of AEI in transcriptome-wide analyses.
