Abstract
Expression quantitative trait locus (eQTL) mapping and linkage disequilibrium (LD) analysis have been widely employed to interpret the findings of genome-wide association studies (GWAS). Using deep sequencing data from 423 lymphoblastoid cell lines (LCLs) from six global populations together with microarray expression data, we performed eQTL analysis, identified more than 228 K SNP cis-eQTLs and 21 K indel cis-eQTLs, and generated an LCL cis-eQTL database. We demonstrate that the percentages of population-shared and population-specific cis-eQTLs are comparable, whereas indel cis-eQTLs in the population-specific subset contribute more to gene expression variation than those in the population-shared subset. We found that cis-eQTLs, especially the population-shared cis-eQTLs, are significantly enriched toward the transcription start site. Moreover, GWAS SNPs cataloged by the National Human Genome Research Institute are enriched for LCL cis-eQTLs. Specifically, 32.8% of GWAS SNPs are LCL cis-eQTLs, among which 12.5% can be tagged by indel cis-eQTLs, suggesting a fundamental contribution of indel cis-eQTLs to GWAS association signals. To search for functional indels and SNPs tagging GWAS SNPs, we developed a pipeline, Post-GWAS Explorer for Functional Indels and SNPs (PExFInS), which integrates LD analysis, functional annotation from public databases, and cis-eQTL mapping with our LCL cis-eQTL database and other published cis-eQTL datasets.
Highlights
More than ten thousand single nucleotide polymorphisms (SNPs) have been identified as associated with complex traits and human diseases in genome-wide association studies (GWAS) in the past decade[1].
The cis-expression quantitative trait locus (cis-eQTL) analysis was performed in 423 lymphoblastoid cell lines (LCLs), including 73 CEU (Utah residents with northern and western European ancestry), 77 CHB (Han Chinese in Beijing, China), 72 JPT (Japanese in Tokyo, Japan), 80 LWK (Luhya in Webuye, Kenya), 42 MEX (Mexican ancestry in Los Angeles) and 79 YRI (Yoruba in Ibadan, Nigeria). cis-eQTLs were mapped by correlating the expression of 14,010 unique autosomal RefSeq genes with the genotypes of 1 KG phase 1 release variants.
The causal variant(s) contributing to a disease cannot be directly inferred from the associated SNPs. eQTL mapping has become a powerful tool to interpret GWAS findings and facilitate the identification of functional variants or causal genes.
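The idea of mapping cis-eQTLs by correlating expression with genotype can be sketched as follows. This is a minimal illustration only: the text does not state which correlation statistic was used, so the Pearson correlation here is an assumption, and the function name `pearson_r` and the toy data are invented for the example.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences.

    Assumption: Pearson is used purely for illustration; the source only
    says that expression was correlated with genotype.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Toy data (invented): alternate-allele counts (0, 1 or 2) for one variant
# and the expression level of one nearby gene in the same six samples.
dosage = [0, 0, 1, 1, 2, 2]
expression = [1.0, 1.2, 2.1, 1.9, 3.0, 3.2]

r = pearson_r(dosage, expression)
print(f"genotype-expression correlation r = {r:.3f}")
```

In a real analysis the statistic would be computed for every variant-gene pair within a chosen cis window and corrected for multiple testing; those steps are omitted here.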
Summary
More than ten thousand single nucleotide polymorphisms (SNPs) have been identified as associated with complex traits and human diseases in genome-wide association studies (GWAS) in the past decade[1]. The Ensembl project has generated an expanding wealth of information including, but not limited to, gene structure, genetic variations and their consequences, as well as functional genomic data. These comprehensive databases provide the most abundant resource for functionally interpreting genetic variation in the human genome. A number of tools, such as SNAP[4] and LocusZoom[5], can generate LD plots for GWAS SNPs and the SNPs in high LD with them. However, the LD pattern between SNPs and structural variants, including small insertions/deletions (< 50 bp) and large insertions/deletions (> 1 kb) (both referred to as indels hereafter), has not been extensively examined. The SNP eQTLs in LCLs can be revealed at a higher resolution.
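As an illustration of what LD between a SNP and an indel means in practice, the standard r² statistic for two biallelic variants can be computed from phased haplotypes. This is a hedged sketch, not the pipeline's implementation: the function name `ld_r2` and the toy haplotypes are hypothetical, and the text does not say which LD measure or software was used.

```python
def ld_r2(hap_a, hap_b):
    """r^2 between two biallelic variants from phased haplotypes.

    Each list holds one 0/1 allele per haplotype (1 = alternate allele,
    e.g. the SNP's alternate base or the presence of an indel).
    Uses D = p_AB - p_A * p_B and r^2 = D^2 / (p_A(1-p_A) p_B(1-p_B)).
    """
    n = len(hap_a)
    if n != len(hap_b) or n == 0:
        raise ValueError("need two non-empty haplotype lists of equal length")
    p_a = sum(hap_a) / n
    p_b = sum(hap_b) / n
    p_ab = sum(1 for a, b in zip(hap_a, hap_b) if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when a variant is monomorphic")
    return d * d / denom


# Toy haplotypes (invented): a SNP and an indel carried on the same haplotypes.
snp = [0, 1, 0, 1, 1, 0, 0, 1]
indel = [0, 1, 0, 1, 1, 0, 0, 1]
print(f"r^2 = {ld_r2(snp, indel):.2f}")
```

An indel with high r² to a GWAS SNP would be said to tag that SNP; the threshold for "high" is a choice the analyst has to make and is not specified in the text above.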