Abstract

Allele-specific expression (ASE) analysis, which quantifies the relative expression of two alleles in a diploid individual, is a powerful tool for identifying cis-regulated gene expression variations that underlie phenotypic differences among individuals. Existing methods for gene-level ASE detection analyze one individual at a time and therefore fail to account for information shared across individuals. Ignoring such shared information not only reduces power, but also makes it difficult to interpret results across individuals. However, when only RNA sequencing (RNA-seq) data are available, ASE detection across individuals is challenging because the data often include individuals who are either heterozygous or homozygous for the unobserved cis-regulatory SNP. This leads to sample heterogeneity: only the heterozygous individuals are informative for ASE, whereas the homozygous individuals have balanced expression. To simultaneously model multi-individual information and account for such heterogeneity, we developed ASEP, a mixture model with a subject-specific random effect to account for multi-SNP correlations within the same gene. ASEP requires only RNA-seq data, and is able to detect gene-level ASE under one condition and differential ASE between two conditions (e.g., pre- versus post-treatment). Extensive simulations demonstrated the convincing performance of ASEP under a wide range of scenarios. We applied ASEP to a human kidney RNA-seq dataset, identified ASE genes, and validated our results with two published eQTL studies. We further applied ASEP to a human macrophage RNA-seq dataset, identified genes showing evidence of differential ASE between M0 and M1 macrophages, and confirmed our findings with results from cardiometabolic trait-relevant genome-wide association studies. To the best of our knowledge, ASEP is the first method for gene-level ASE detection at the population level that requires only RNA-seq data. With the growing adoption of RNA-seq, we believe ASEP will be well suited for various ASE studies of human diseases.
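The sample heterogeneity described above can be illustrated with a deliberately simplified two-component binomial mixture. This is a sketch under stated assumptions, not the ASEP model itself: it uses one pooled read count per individual, omits the subject-specific random effect and multi-SNP structure, and assumes the higher-expressed allele is already aligned across individuals. The names `pi_het` (fraction of individuals heterozygous at the cis-regulatory SNP) and `theta` (allelic ratio in those individuals) are hypothetical.

```python
from math import comb, log


def binom_pmf(k, n, p):
    """Probability of k successes in n Bernoulli(p) trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def mixture_loglik(counts, pi_het, theta):
    """Log-likelihood of a two-component binomial mixture.

    counts : list of (reads_on_allele_1, total_reads), one pair per individual
    pi_het : assumed fraction of individuals showing allelic imbalance
    theta  : assumed allelic ratio in the imbalanced component
    The balanced component has its allelic ratio fixed at 0.5.
    """
    ll = 0.0
    for k, n in counts:
        imbalanced = pi_het * binom_pmf(k, n, theta)
        balanced = (1 - pi_het) * binom_pmf(k, n, 0.5)
        ll += log(imbalanced + balanced)
    return ll


def posterior_imbalanced(k, n, pi_het, theta):
    """Posterior probability that one individual belongs to the imbalanced component."""
    imbalanced = pi_het * binom_pmf(k, n, theta)
    balanced = (1 - pi_het) * binom_pmf(k, n, 0.5)
    return imbalanced / (imbalanced + balanced)


# Made-up counts: two individuals look imbalanced, two look balanced.
counts = [(40, 50), (26, 50), (41, 50), (24, 50)]
ll_mixture = mixture_loglik(counts, pi_het=0.5, theta=0.8)
ll_balanced_only = mixture_loglik(counts, pi_het=0.0, theta=0.8)
print(ll_mixture > ll_balanced_only)
```

In this toy setting, the mixture fits the made-up counts better than a model in which every individual is balanced, and the posterior separates the individuals who carry the imbalance signal from those who do not.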

Highlights

  • Genome-wide association studies (GWAS) are successful in identifying candidate loci for complex human diseases and traits [1, 2]

  • Allele-specific expression (ASE) quantifies the relative expression of two alleles in a diploid individual, and such expression imbalance potentially contributes to phenotypic variation and disease pathophysiology among individuals

  • Since the two alleles used to measure ASE come from the same cellular environment and genetic background, they can serve as an internal control and eliminate the influence of trans-acting genetic and environmental factors
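As a minimal sketch of what quantifying the relative expression of two alleles can look like for a single individual, the example below computes an allelic ratio from reference and alternative read counts and tests it against balanced expression with an exact two-sided binomial test. This is the generic single-individual idea, not the method proposed in the paper; the function names and read counts are hypothetical.

```python
from math import comb


def allelic_ratio(ref_reads, alt_reads):
    """Fraction of reads carrying the reference allele at a heterozygous site."""
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("no reads cover this site")
    return ref_reads / total


def binomial_two_sided_p(ref_reads, alt_reads, p_null=0.5):
    """Exact two-sided binomial p-value against the null allelic ratio p_null.

    Sums the probabilities of all outcomes that are no more likely
    than the observed count under the null.
    """
    n = ref_reads + alt_reads
    pmf = [comb(n, k) * p_null**k * (1 - p_null) ** (n - k) for k in range(n + 1)]
    observed = pmf[ref_reads]
    # Small relative tolerance guards against floating-point ties.
    p_value = sum(p for p in pmf if p <= observed * (1 + 1e-9))
    return min(1.0, p_value)


# Made-up counts at one heterozygous site.
print(allelic_ratio(30, 10))
print(binomial_two_sided_p(30, 10))
```

Because both alleles are measured in the same sample, the null ratio of 0.5 needs no external control, which is the internal-control property noted above.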


Summary

Introduction

Genome-wide association studies (GWAS) have been successful in identifying candidate loci for complex human diseases and traits [1, 2]. The association peaks from GWAS typically identify a handful of candidate genes, but it is often unclear whether these candidates are expressed in relevant tissues and cell types. A commonly used approach to understanding the functional roles of GWAS-identified genetic variants is expression quantitative trait loci (eQTL) analysis [4, 5]. However, eQTL analysis lacks explicit information on cis- versus trans-acting effects, which makes it difficult to link an association directly to the underlying mechanism, and its requirement of a relatively large sample size makes it impractical for studies that involve difficult-to-collect tissues [9].

