Abstract

Allele expression (AE) analysis robustly measures cis-regulatory effects. Here, we present and demonstrate the utility of a vast AE resource generated from the GTEx v8 release, containing 15,253 samples spanning 54 human tissues for a total of 431 million measurements of AE at the SNP level and 153 million measurements at the haplotype level. In addition, we develop an extension of our tool phASER that allows effect sizes of cis-regulatory variants to be estimated using haplotype-level AE data. This AE resource is the largest to date, and we are able to make haplotype-level data publicly available. We anticipate that the availability of this resource will enable future studies of regulatory variation across human tissues.

Highlights

  • Allelic expression (AE, also known as allele-specific expression or ASE) analysis is a powerful technique that can be used to measure the expression of gene alleles relative to one another within single individuals

  • We developed an addition to phASER, called phASER-POP, which makes it easy to generate population-scale, haplotype-level allelic expression (AE) data and calculate effect sizes for regulatory variants. Both single nucleotide polymorphism (SNP)-level and haplotype-level AE data were generated for each Genotype-Tissue Expression (GTEx) sample using current best practices, both with and without WASP filtering [8] to reduce the mapping bias that is sometimes present in AE analysis, resulting in four data types per sample (Additional file 1: Fig. S1; “Data generation and availability” in the “Methods” section)

  • To demonstrate the ability of these data to robustly capture cis-regulatory effects, and to benchmark the four data types relative to one another, we used allelic fold change to estimate eQTL effect sizes from AE data across the 49 tissues where eQTLs were mapped, and compared them to the effect sizes derived from eQTL mapping [7]

Summary

Introduction

Allelic expression (AE, also known as allele-specific expression or ASE) analysis is a powerful technique that can be used to measure the expression of gene alleles relative to one another within single individuals. AE analysis uses RNA-seq reads that overlap heterozygous single nucleotide polymorphisms (SNPs), where the SNP can be used to assign the read to an allele. These heterozygous SNPs capture the cumulative effects of cis-regulatory variation acting on each allele. The magnitude of the resulting allelic imbalance can be quantified by allelic fold change (aFC) [1], and the statistical significance of the imbalance can be evaluated using binomial-based statistics to account for the count-based nature of the data [4]. In some cases, these effects can be caused by the SNPs being used to measure AE themselves.

Castel et al. Genome Biology (2020) 21:234
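As a rough illustration of the two quantities mentioned in the Introduction, the sketch below computes an allelic fold change as the log2 ratio of read counts for the two alleles at a heterozygous SNP, and a two-sided exact binomial p-value against a null of balanced expression. The function names, the pseudocount, and the null proportion of 0.5 are assumptions made for this example; they are not taken from phASER or from the GTEx pipeline, which may handle these details differently.

```python
import math


def allelic_fold_change(ref_count, alt_count, pseudocount=1.0):
    """log2 ratio of alternative to reference allele read counts.

    The pseudocount is an assumption of this sketch; it only keeps the
    ratio finite when one allele has zero reads.
    """
    return math.log2((alt_count + pseudocount) / (ref_count + pseudocount))


def binomial_two_sided_p(ref_count, alt_count, p_null=0.5):
    """Two-sided exact binomial p-value for allelic imbalance.

    Sums the probability of every outcome that is no more likely than
    the observed one under the null proportion p_null (assumed 0.5).
    Intended for small read counts only.
    """
    n = ref_count + alt_count
    if n == 0:
        return 1.0
    pmf = [
        math.comb(n, k) * p_null**k * (1 - p_null) ** (n - k)
        for k in range(n + 1)
    ]
    observed = pmf[alt_count]
    # Small relative tolerance so ties are not lost to rounding.
    p_value = sum(x for x in pmf if x <= observed * (1 + 1e-7))
    return min(1.0, p_value)


if __name__ == "__main__":
    ref, alt = 9, 19  # hypothetical read counts at one heterozygous SNP
    print("aFC (log2):", allelic_fold_change(ref, alt))
    print("binomial p:", binomial_two_sided_p(ref, alt))
```

This pure-Python test enumerates every possible outcome, so it is only practical for small counts. For real data, a library routine such as `scipy.stats.binomtest` would be a more robust choice.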
