Abstract

Genome-wide mapping of gene expression quantitative trait loci (eQTLs) has focused on single-nucleotide polymorphisms (SNPs) and has helped interpret findings from disease-mapping studies. The functional effect of structural variants, especially short insertions and deletions (indels), has not been well investigated. Here we impute 1,380,133 indels, based on the latest 1,000 Genomes Project panel, into three eQTL data sets from multiple tissues. Imputation of indels increases power by 9.9% and identifies indel-specific eQTLs for 325 genes. We find that introns and the vicinities of UTRs are more enriched for indel eQTLs, and that 3.6% (single-tissue) to 9.2% (multi-tissue) of previously identified eSNPs were taggers of eindels. Functional analyses identify epigenetic marks, gene ontology categories and disease GWAS loci affected by SNP and indel eQTLs showing tissue-consistent or tissue-specific effects. This study provides new insights into the underlying genetic architecture of gene expression across tissues and a new resource for interpreting the function of structural variants associated with diseases and traits.

Highlights

  • Genome-wide mapping of gene expression quantitative trait loci has focused on single-nucleotide polymorphisms and has helped interpret findings from disease-mapping studies

  • After quality control on genotypes and expression, 376,877 single-nucleotide polymorphisms (SNPs) from the lymphoblastoid cell line (LCL) expression data set, 687,364 from the peripheral blood mononuclear cell (PBMC) expression data set and 433,964 from the skin expression data set, as well as 51,190 gene expression probe sets, remain for downstream analysis

  • Comparing the red and the blue bars, we found that 3.62% of previously identified SNP eQTLs in LCL were likely to be tagging the indel expression quantitative trait locus (eQTL) of the same gene in LCL


Summary

Introduction

Genome-wide mapping of gene expression quantitative trait loci (eQTLs) has focused on single-nucleotide polymorphisms (SNPs) and has helped interpret findings from disease-mapping studies. A recent study based on 179 sequenced samples from the 1,000G has shown that indels are generally subject to stronger purifying selection than SNPs and that they are enriched in associations with gene expression[8]. Imputation of these newly identified genetic variants into existing GWAS may help identify novel disease loci not discovered by previous genotyping platforms and imputation. Large-scale gene expression data, which provide complex traits spanning the full spectrum of heritability and genetic architecture, are ideal for evaluating the power of association studies that use imputation of the newly identified indels. This information will be useful to the research community in showing what should be expected from the imputation of indels, and it can guide the design of genotyping platforms for next-generation association studies.

