Trait correlated expression combined with eQTL and ASE analyses identified novel candidate genes affecting intramuscular fat
Background: Intramuscular fat (IMF) content is a determining factor for meat taste. The Luchuan pig is a fat-type local breed in southern China that is famous for its desirable meat quality due to high IMF; however, the crossbred offspring of Luchuan sows and Duroc boars displayed within-population variation in meat quality, and the reason remains unknown. Results: In the present study, we identified 212 IMF-correlated genes (FDR ≤ 0.01) using correlation analysis between gene expression level and IMF content. The IMF-correlated genes were significantly enriched in the processes of lipid metabolism and mitochondrial energy metabolism, as well as the AMPK/PPAR signaling pathway. From the IMF-correlated genes, we identified 99 genes associated with expression quantitative trait locus (eQTL) or allele-specific expression (ASE) signals, including 21 genes identified by both cis-eQTL and ASE analyses and 12 genes identified by trans-eQTL analysis. Genome-wide association study (GWAS) of IMF identified a significant QTL on SSC14 (p-value = 2.51E−7), and the nearest IMF-correlated gene SFXN4 (r = 0.28, FDR = 4.00E−4) was proposed as the candidate gene. Furthermore, we highlighted another three novel IMF candidate genes, namely AGT, EMG1, and PCTP, by integrated analysis of GWAS, eQTL, and IMF-gene correlation analysis. Conclusions: The AMPK/PPAR signaling pathway together with the processes of lipid and mitochondrial energy metabolism plays a vital role in regulating porcine IMF content. Trait correlated expression combined with eQTL and ASE analysis highlighted a priority list of genes, which compensated for the shortcoming of GWAS, thereby accelerating the mining of causal genes of IMF.
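The trait-correlated expression step described in this abstract can be illustrated with a small sketch. It assumes a Pearson correlation, which the abstract does not specify (it only reports r values), and the gene names and numbers below are hypothetical toy data, not values from the study.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (>= 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical toy data: IMF content (%) in five animals and the expression
# of two made-up genes in the same animals (assumption, not study data).
imf = [1.8, 2.4, 3.1, 3.9, 4.6]
expression = {
    "geneA": [5.0, 6.1, 6.9, 8.2, 9.0],
    "geneB": [7.5, 7.1, 6.0, 5.2, 4.9],
}

# One correlation per gene; in a real analysis each r would also get a
# p-value and a multiple-testing correction before calling it IMF-correlated.
correlations = {gene: pearson_r(values, imf) for gene, values in expression.items()}
```

In this toy example `geneA` is positively and `geneB` negatively correlated with IMF; a rank-based coefficient could be substituted if the expression values are not approximately normal.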
- Research Article
49
- 10.1186/s12711-020-00579-x
- Oct 9, 2020
- Genetics Selection Evolution
Background: Genetic analysis of gene expression level is a promising approach for characterizing candidate genes that are involved in complex economic traits such as meat quality. In the present study, we conducted expression quantitative trait loci (eQTL) and allele-specific expression (ASE) analyses based on RNA-sequencing (RNAseq) data from the longissimus muscle of 189 Duroc × Luchuan crossed pigs in order to identify some candidate genes for meat quality traits. Results: Using a genome-wide association study based on a mixed linear model, we identified 7192 cis-eQTL corresponding to 2098 cis-genes (p ≤ 1.33e-3, FDR ≤ 0.05) and 6400 trans-eQTL corresponding to 863 trans-genes (p ≤ 1.13e-6, FDR ≤ 0.05). ASE analysis using RNAseq SNPs identified 9815 significant ASE-SNPs in 2253 unique genes. Integrative analysis between the cis-eQTL and ASE target genes identified 540 common genes, including 33 genes with expression levels that were correlated with at least one meat quality trait. Among these 540 common genes, 63 have been reported previously as candidate genes for meat quality traits, such as PHKG1 (q-value = 1.67e-6 for the leading SNP in the cis-eQTL analysis), NUDT7 (q-value = 5.67e-13), FADS2 (q-value = 8.44e-5), and DGAT2 (q-value = 1.24e-3). Conclusions: The present study confirmed several previously published candidate genes and identified some novel candidate genes for meat quality traits via eQTL and ASE analyses, which will be useful to prioritize candidate genes in further studies.
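The abstract above reports p-value cutoffs alongside FDR ≤ 0.05 and q-values, but does not say which FDR procedure was used. As a minimal sketch, assuming the common Benjamini-Hochberg procedure, raw p-values can be converted to adjusted values as follows; the p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values
    # never increase as the raw p-value decreases (monotonicity).
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical raw p-values for five SNP-gene tests (assumption)
p_values = [0.001, 0.008, 0.039, 0.041, 0.6]
q_values = benjamini_hochberg(p_values)
significant = [q <= 0.05 for q in q_values]
```

With these toy inputs only the first two tests pass FDR ≤ 0.05, even though four raw p-values are below 0.05, which is why a raw p-value cutoff and an FDR cutoff are reported together.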
- Research Article
- 10.1158/1538-7445.am2019-1584
- Jul 1, 2019
- Cancer Research
Background: Genome-wide association studies (GWAS) have identified over 45 susceptibility loci for lung cancer; many studies, including those from our own group, have focused on low-frequency and rare coding variants using fine mapping and exome sequencing. This strategy, however, has met with limited success, as about 90% of GWAS hits are noncoding and act primarily through altering transcriptional regulation in an allele-specific manner. RNA-Seq based allele-specific expression (ASE) analysis affords an innovative approach to study preferential expression of an allele in direct relationship to its genotype, providing information on cis-regulatory effects for the expression of putative genes. However, there are currently no lung cancer studies that have rigorously evaluated ASE variation in lung tumor and adjacent tissues. Methods: Leveraging The Cancer Genome Atlas (TCGA) resource, we performed transcriptome-wide ASE analysis using existing RNA-Seq datasets of paired tumor and adjacent tissues from 54 lung adenocarcinoma patients. We first quantified the RNA read counts of Referent and Alternate alleles of heterozygous variants, then evaluated the allelic imbalance on a per-sample basis using a beta-binomial test, and explored the differential ASE between tumor and adjacent tissues using a paired Wilcoxon test. Functional regulatory consequences were generated from the Ensembl Variant Effect Predictor. Results: We identified a total of 208 significant ASEs, including 35 tissue-specific (only in tumor or only in adjacent), 28 shared, and 145 differential variants. Of the 208 candidates, 41 were from the human leukocyte antigen (HLA) locus (primarily DQA2, DQB1, DRB1, H and J) and 26 were from the immunoglobulin (IG) superfamily (primarily IGH, IGL, IGK and F11R). 
About 80% of candidates were noncoding (mostly in 5’ and 3’ untranslated regions) and had regulatory features (21 promoter, seven enhancer, seven open chromatin region, two inducing nonsense-mediated mRNA decay, one CTCF-binding site, and one transcription factor binding site). Other top genes included MDM2, APOL1, and CTSB. Pathway analyses revealed 27 genes involved in the immune response pathway and 12 genes involved in the HLA antigen processing and presentation pathway. Conclusion: This study is the first transcriptome-wide ASE analysis in lung adenocarcinoma. The key somatic cis-regulatory ASE variants identified from this study, especially immunogenic allelic variations from HLA and IG genes, could be used for identifying high-risk individuals for targeted lung cancer checkpoint blockade and related immunotherapies. Citation Format: Yanhong Liu, Spiridon Tsavachidis, Farrah Kheradmand, Margaret R. Spitz, Chris Amos. Transcriptome analysis links immune genes allelic expression imbalances to lung cancer [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2019; 2019 Mar 29-Apr 3; Atlanta, GA. Philadelphia (PA): AACR; Cancer Res 2019;79(13 Suppl):Abstract nr 1584.
- Research Article
19
- 10.1038/s41598-023-27591-7
- Jan 11, 2023
- Scientific Reports
Allele-specific expression (ASE) analysis detects the relative abundance of alleles at heterozygous loci as a proxy for cis-regulatory variation, which affects the personal transcriptome and proteome. This study describes the development and application of an ASE analysis pipeline on a unique cohort of 87 well phenotyped and RNA sequenced patients from the Maastricht Cardiomyopathy Registry with dilated cardiomyopathy (DCM), a complex genetic disorder with a remaining gap in explained heritability. Regulatory processes for which ASE is a proxy might explain this gap. We found an overrepresentation of known DCM-associated genes among the significant results across the cohort. In addition, we were able to find genes of interest that have not been associated with DCM through conventional methods such as genome-wide association or differential gene expression studies. The pipeline offers RNA sequencing data processing, individual and population level ASE analyses as well as group comparisons and several intuitive visualizations such as Manhattan plots and protein–protein interaction networks. With this pipeline, we found evidence supporting the case that cis-regulatory variation contributes to the phenotypic heterogeneity of DCM. Additionally, our results highlight that ASE analysis offers an additional layer to conventional genomic and transcriptomic analyses for candidate gene identification and biological insight.
- Research Article
1
- 10.1101/2024.08.13.607784
- Jan 15, 2025
- bioRxiv
Single-cell RNA-seq (scRNA-seq) is emerging as a powerful tool for understanding gene function across diverse cells. Recently, this has included the use of allele-specific expression (ASE) analysis to better understand how variation in the human genome affects RNA expression at the single-cell level. We reasoned that, because intronic reads are more prevalent in single-nucleus RNA-seq (snRNA-seq) and introns are under weaker purifying selection and thus enriched for genetic variants, snRNA-seq should facilitate single-cell analysis of ASE. Here we demonstrate how experimental and computational choices can improve the results of allelic imbalance analysis. We explore how experimental choices, such as RNA source, read length, sequencing depth, and genotyping, impact the power of ASE-based methods. We developed a new suite of computational tools to process and analyze scRNA-seq and snRNA-seq for ASE. As hypothesized, we extracted more ASE information from reads in intronic regions than those in exonic regions and show how read length can be set to increase power. Additionally, hybrid selection improved our power to detect allelic imbalance in genes of interest. We also explored methods to recover allele-specific isoform expression levels from both long- and short-read snRNA-seq. To further investigate ASE in the context of human disease, we applied our methods to a Parkinson’s disease cohort of 94 individuals and show that ASE analysis had more power than eQTL analysis to identify significant SNP/gene pairs in our direct comparison of the two methods. Overall, we provide an end-to-end experimental and computational approach for future studies.
- Research Article
39
- 10.1186/s12864-017-4354-6
- Dec 1, 2017
- BMC Genomics
Background: Efforts to improve sustainability in livestock production systems have focused on two objectives: investigating the genetic control of immune function as it pertains to robustness and disease resistance, and finding predictive markers for use in breeding programs. In this context, the peripheral blood transcriptome represents an important source of biological information about an individual’s health and immunological status, and has been proposed for use as an intermediate phenotype to measure immune capacity. The objective of this work was to study the genetic architecture of variation in gene expression in the blood of healthy young pigs using two approaches: an expression genome-wide association study (eGWAS) and allele-specific expression (ASE) analysis. Results: The blood transcriptomes of 60-day-old Large White pigs were analyzed by expression microarrays for eGWAS (242 animals) and by RNA-Seq for ASE analysis (38 animals). Using eGWAS, the expression levels of 1901 genes were found to be associated with expression quantitative trait loci (eQTLs). We recovered 2839 local and 1752 distant associations (Single Nucleotide Polymorphism or SNP located less or more than 1 Mb from expression probe, respectively). ASE analyses confirmed the extensive cis-regulation of gene transcription in blood, and revealed allelic imbalance in 2286 SNPs, which affected 763 genes. eQTLs and ASE-genes were widely distributed on all chromosomes. By analyzing mutually overlapping eGWAS results, we were able to describe putative regulatory networks, which were further refined using ASE data. At the functional level, genes with genetically controlled expression that were detected by eGWAS and/or ASE analyses were significantly enriched in biological processes related to RNA processing and immune function. 
Indeed, numerous distant and local regulatory relationships were detected within the major histocompatibility complex region on chromosome 7, revealing ASE for most class I and II genes. Conclusions: This study represents, to the best of our knowledge, the first genome-wide map of the genetic control of gene expression in porcine peripheral blood. These results represent an interesting resource for the identification of genetic markers and blood biomarkers associated with variations in immunity traits in pigs, as well as any other complex traits for which blood is an appropriate surrogate tissue.
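The local/distant distinction used in this abstract (SNP less or more than 1 Mb from the expression probe) can be sketched as a simple classifier. Two details are assumptions not stated in the abstract: a SNP on a different chromosome from the probe is treated as distant, and a distance of exactly 1 Mb is treated as distant. The function name and coordinates are hypothetical.

```python
ONE_MB = 1_000_000

def classify_association(snp_chrom, snp_pos, probe_chrom, probe_pos, window=ONE_MB):
    """Label a SNP-probe association as 'local' or 'distant'."""
    # Assumption: associations across chromosomes are always distant
    if snp_chrom != probe_chrom:
        return "distant"
    # Assumption: exactly `window` base pairs apart counts as distant
    return "local" if abs(snp_pos - probe_pos) < window else "distant"

# Hypothetical coordinates (base pairs) for three associations
examples = [
    classify_association("7", 25_000_000, "7", 25_400_000),  # 0.4 Mb apart
    classify_association("7", 1_000_000, "7", 5_000_000),    # 4 Mb apart
    classify_association("7", 25_000_000, "12", 25_000_000), # other chromosome
]
```

Local associations defined this way are often interpreted as candidate cis effects, but distance alone does not prove cis regulation, which is one reason ASE evidence is used alongside eGWAS.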
- Research Article
- 10.1093/jas/skz122.035
- Jul 29, 2019
- Journal of Animal Science
Advancements in sequencing technology, improvements in genome annotation, and development of quantitative genetic models have been instrumental to the significant genetic gains achieved in pork production. Several quantitative trait loci (QTL) have been identified for growth, meat quality and carcass composition (GMC) phenotypes; however, the biological mechanisms underlying most QTL remain unknown. Functional genomic analysis can reveal insights on the genetic architecture of complex traits, and transcriptomic profiling of skeletal muscle during the conversion of muscle to meat can identify critical regulators of GMC phenotypes. Gene transcripts obtained with RNA-seq of longissimus muscle from 168 pigs were used to estimate gene expression variation subject to genetic control by mapping expression QTL (eQTL) and allele-specific expression (ASE). A total of 334 eQTL were mapped (FDR≤0.01) and joint association of eQTL with phenotypic QTL (pQTL) segregating in our population revealed 16 genes significantly associated with 21 pQTL for GMC phenotypes. ASE analysis facilitates the identification of cis-acting regulation of transcript abundance. We tested for ASE in 69,502 coding SNP (cSNP) and a total of 18,234 cSNP with significant ASE were identified (FDR≤0.01). A gene-wise conditional analysis fitting all ASE cSNP per gene for each phenotype identified 60 genes associated with GMC phenotypes. A comparison of eQTL with ASE cSNP showed an overlap of 136 genes. The Pearson correlation of peak eQTL SNP with ASE cSNP was significant for 51% of these genes. The ASE analysis showed more precision in the identification of cis-acting effects than the genome-wide eQTL analysis; however, both approaches provide valuable information on the regulation of transcript abundance. For instance, we observed 24 genes associated with distant eQTL (trans effects) and exhibiting ASE. 
This study provides new information on the complex regulation of the pig longissimus muscle transcriptome and associations with measurable differences in economically important phenotypic traits.
- Research Article
- 10.21769/bioprotoc.4832
- Jan 1, 2023
- Bio-protocol
Many single nucleotide polymorphisms (SNPs) identified by genome-wide association studies exert their effects on disease risk as expression quantitative trait loci (eQTL) via allele-specific expression (ASE). While databases for probing eQTLs in tissues from normal individuals exist, one may wish to ascertain eQTLs or ASE in specific tissues or disease states not characterized in these databases. Here, we present a protocol to assess ASE of two possible target genes (GPNMB and KLHL7) of a known genome-wide association study (GWAS) Parkinson's disease (PD) risk locus in postmortem human brain tissue from PD and neurologically normal individuals. This was done using a sequence of RNA isolation, cDNA library generation, enrichment for transcripts of interest using customizable cDNA capture probes, paired-end RNA sequencing, and subsequent analysis. This method provides increased sensitivity relative to traditional bulk RNAseq-based approaches and a blueprint that can be extended to the study of other genes, tissues, and disease states. Key features • Analysis of GPNMB allele-specific expression (ASE) in brain lysates from cognitively normal controls (NC) and Parkinson's disease (PD) individuals. • Builds on the ASE protocol of Mayba et al. (2014) and extends application from cells to human tissue. • Increased sensitivity by enrichment for desired transcript via RNA CaptureSeq (Mercer et al., 2014). • Optimized for human brain lysates from cingulate gyrus, caudate nucleus, and cerebellum.
- Research Article
40
- 10.1371/journal.pone.0052260
- Dec 26, 2012
- PLoS ONE
A large number of genome-wide association studies have been performed during the past five years to identify associations between SNPs and human complex diseases and traits. The assignment of a functional role for the identified disease-associated SNP is not straightforward. Genome-wide expression quantitative trait locus (eQTL) analysis is frequently used as the initial step to define a function, while allele-specific gene expression (ASE) analysis has not yet gained widespread use in disease mapping studies. We compared the power to identify cis-acting regulatory SNPs (cis-rSNPs) by genome-wide ASE analysis with that of traditional eQTL mapping. Our study included 395 healthy blood donors for whom global gene expression profiles in circulating monocytes were determined by Illumina BeadArrays. ASE was assessed in a subset of these monocytes from 188 donors by quantitative genotyping of mRNA using a genome-wide panel of SNP markers. The performance of the two methods for detecting cis-rSNPs was evaluated by comparing associations between SNP genotypes and gene expression levels in sample sets of varying size. We found that up to 8-fold more samples are required for eQTL mapping to reach the same statistical power as that obtained by ASE analysis for the same rSNPs. The performance of ASE is insensitive to SNPs with low minor allele frequencies and detects a larger number of significantly associated rSNPs using the same sample size as eQTL mapping. An unequivocal conclusion from our comparison is that ASE analysis is more sensitive for detecting cis-rSNPs than standard eQTL mapping. Our study shows the potential of ASE mapping in tissue samples and primary cells which are difficult to obtain in large numbers.
- Discussion
8
- 10.1002/cac2.12317
- Jun 1, 2022
- Cancer Communications
Nasopharyngeal carcinoma (NPC) is a common malignancy in East and Southeast Asia, especially in South China. The etiology of NPC has been linked to genetic susceptibility, Epstein-Barr virus (EBV) infection, and environmental factors. Accumulated evidence including multiple genome-wide association studies (GWASs) has revealed robust genetic predisposition of NPC. However, GWAS-identified genetic variants collectively account for only 8.2% of NPC heritability [1]. The underlying inherited predisposition is largely undetermined. The strongest genetic signal for NPC consistently hits the human leukocyte antigen (HLA) region on 6p21 [2]. However, the highly polymorphic nature and complicated long-range linkage disequilibrium (LD) in the HLA region particularly obscure the causal variants driving the association. In addition, most genetic variants are located in introns or intergenic regions. The causal genes mediating genetic effects on NPC risk have rarely been ascertained by GWAS alone. Recently, transcriptome-wide association study (TWAS) has been proposed as an attractive approach to identify novel gene-trait associations and prioritize causal genes for complex traits [3]. By integrating GWAS and gene expression data, TWAS can effectively and economically assess associations between genetically predicted gene expression levels and disease risks in large populations. Hence, using the cis-regulated expression in addition to genetic variants to explore NPC susceptibility genes could be promising and reasonable for mechanistic and functional inference. Nevertheless, neither were public data of nasopharyngeal tissue available, nor had a TWAS for NPC been conducted. Herein, we integrated genome and transcriptome data of 89 nasopharyngeal tumor tissues and investigated the associations between predicted gene expression levels and NPC risk using multicenter GWAS data involving 4506 NPC cases and 5384 cancer-free subjects (defined as controls) from South China. 
Given the close relationship between EBV infection and NPC, a cis-regulated expression weight matrix from EBV-transformed lymphocytes (n = 117) in the GTEx project was used for further evaluation. Study populations and detailed methodology are described in the Supplementary file of methods. We predicted the expression levels of 2505 and 2411 genes in the GWAS population by constructing the models for the prediction of gene expression in nasopharyngeal tissues (NP models) and EBV-transformed lymphocytes (lymphocyte models), respectively (Supplementary Table S1), and 377 genes overlapped (Supplementary Figure S1). Thirty-three genes were associated with NPC at a Bonferroni-corrected threshold, and all were located in the HLA region (Figure 1A). Among them, 11 of 13 previously reported genes were replicated. Our results were consistent with the studies focusing on the HLA region in South China [4, 5], where most of the reported genes available in TWAS were replicated. The predicted expression levels of ZFP57 (NP models), MICA (both models), and HLA-C (lymphocyte models) were significantly higher in cases than in controls, while the expression levels of MOG, HCG27, HLA-DQB1, HLA-H, HLA-U (NP models), HLA-F (both models), HLA-A, and HLA-DRB1 (lymphocyte models) were lower in cases than in controls. The two overlapping genes showed similar associations with NPC (HLA-F: Z score = -10.28 and -8.95; MICA: Z score = 7.82 and 6.60, for NP and lymphocyte models, respectively) (Supplementary Table S2). Interestingly, half of the previously reported genes belonged to HLA class I. Most of them showed lower levels of predicted expression in cases than in controls, possibly because EBV transcripts in NPC tumors were involved in the inhibition of HLA class I gene expression [6]. 
It is rational to assume that the low expression levels of these genes may affect the anti-EBV immune response in presenting peptides to cytotoxic T cells, facilitating immune evasion of tumor cells or EBV-mediated oncogenic action.
Figure 1. TWAS-identified susceptibility genes and pathways for NPC. (A) Manhattan plot of TWAS in NP models and lymphocyte models. The blue lines represent the Bonferroni-corrected significance threshold. The red dots above or below the blue line represent the genes that passed the Bonferroni threshold in the association analysis. The genes with green labels have been reported to be associated with NPC by previous genome-wide or candidate pathway association studies. The genes with black labels were newly identified as NPC susceptibility genes by our study. Genes on different chromosomes are shown as light and dark grey dots. (B) Expression quantitative trait locus analysis for the seven putative causal genes in the expression data of 89 nasopharyngeal tissue samples. The Kruskal-Wallis test was used to compare medians among three genotypes for most of the variants. For the expression of MICD, HCG27 and HLA-DOB, homozygote groups with a sample size less than 5 were excluded, and the P values were recalculated using only the wild-type and heterozygous groups. (C) GO pathway enrichment analysis of NPC. (D) KEGG pathway enrichment analysis of NPC. "Gene Ratio" refers to the percentage of total significant genes in the given pathway. All 354 significant genes (P < 0.05) in TWAS were used in the enrichment analysis. Abbreviations: TWAS, Transcriptome-wide association analysis; NPC, Nasopharyngeal carcinoma; GO, Gene Ontology; KEGG, Kyoto Encyclopedia of Genes and Genomes.
Although the significant signals consistently hit the HLA region, 22 additional genes not previously reported were identified in TWAS. 
Among them, the predicted expression levels of 9 genes were significantly higher in NPC cases than in controls, including HLA-DOB, HCG4B, RPL23AP1, HLA-J in NP models and HCG4, CCHCR1, STK19, C4B, IFITM4P in lymphocyte models, while 13 other genes showed significantly lower expression levels in cases than in controls, including HCP5, ZSCAN23, HCG4P11, HCG4P7, MICD, MICB-DT, SNHG32 in NP models and NOTCH4, C4A, HCG22, POU5F1, MICE, HLA-S in lymphocyte models (Figure 1A). We performed conditional analyses to determine whether the associations between predicted gene expression levels and NPC were influenced by the GWAS signals. After conditioning on the respective GWAS index SNP, the associations for HLA-DOB, NOTCH4, ZSCAN23, STK19, C4B, HLA-J, HLA-S, and MICB-DT remained significant. After conditioning on all previously reported SNPs, NOTCH4, HCG4, HCG22, POU5F1, HCG4B, HCG4P11, MICB-DT, STK19 and IFITM4P remained significant. It indicated that their associations were partially independent of the GWAS signals (Supplementary Table S3). Due to the complicated structure with high LD and co-expression networks in the HLA region, we conducted fine-mapping analyses to prioritize the causal genes. Using posterior inclusion probability (PIP) analysis, we prioritized 7 causal genes: MICA, HLA-DQB1, HLA-DOB, ZSCAN23, HCG27, MICD, and HLA-U. HLA-DOB, ZSCAN23, and MICD were newly identified as NPC susceptibility genes (Supplementary Table S4). Furthermore, we conducted expression quantitative trait locus (eQTL) analyses to identify whether the genetic variants could influence the expression levels of these genes. We found that individuals with relevant risk SNPs (the GWAS index SNPs) exhibited higher expression of HLA-DQB1, MICA, MICD and HLA-U, or lower expression levels of ZSCAN23, HCG27, and HLA-DOB. These results indicated that the risk alleles affected the expression levels of the causal genes (Figure 1B). 
Two HLA class II genes (HLA-DQB1 and HLA-DOB) were prioritized as causal genes. Both genes were associated with other virus-associated cancers, such as cervical cancer [7]. A comprehensive TWAS exploring genetic susceptibility for antiviral immune response using 7924 subjects from the UK Biobank cohort revealed that the genetic determinants for EBV infection were predominantly located on HLA class II genes. The most significant signals associated with the antibody level of BamHI Z EBV replication activator (ZEBRA) hit HLA-DQB1 [8]. HLA-DOB may impact viral clearance capacity and persistent infection of hepatitis B virus (HBV) and hepatitis C virus (HCV) [9]. Since EBV reactivation with elevated EBV DNA load or antibodies was observed at the preclinical phase of NPC, we hypothesized that HLA class II genes, especially HLA-DQB1 and HLA-DOB, participate in the early stage of NPC tumorigenesis by influencing EBV infection. Besides, some identified pseudogenes, such as IFITM4P [10], may function by regulating their parental genes. However, their biological mechanisms remain unclear, and further research is needed. Gene Ontology (GO) enrichment analysis confirmed that TWAS-identified genes (354 genes with P < 0.05) were enriched in the pathways of cell-mediated immune response, antigen processing and presentation (Figure 1C). Similarly, the top pathways annotated with the Kyoto Encyclopedia of Genes and Genomes (KEGG) database focused on infection of herpes simplex virus type 1, human T-cell leukemia virus type 1, EBV, and autoimmune disorders such as graft-versus-host disease (Figure 1D). In summary, using a TWAS approach, we corroborated the central role of HLA genes in NPC susceptibility. Apart from HLA class I genes, we propose critical roles of HLA class II genes and other nonclassical HLA genes. Seven genes, including HLA-DQB1 and HLA-DOB, were prioritized as causal genes. 
Recent evidence indicated that these genes are pivotal in the metastable equilibrium between host and virus. Our findings provide additional evidence for a better understanding of the genetic etiology of NPC and clues to further advance this field. We thank the staffs from Sun Yat-sen University Cancer Center biorepository. We thank all the study participants and research staff who recruited participants and collected samples in this study. This study was funded by the National Key Research and Development Program of China (2021YFC2500400), the Basic and Applied Basic Research Foundation of Guangdong Province, China (2021B1515420007), Sino-Sweden Joint Research Programme (81861138006), the Science and Technology Planning Project of Guangzhou, China (201804020094), the Special Support Program for High-level Professionals on Scientific and Technological Innovation of Guangdong Province, China (2014TX01R201), National Natural Science Foundation of China (81973131, 81903395, 81803319, 82003520), National Science Fund for Distinguished Young Scholars of China (81325018). The authors have no potential conflicts of interest to declare. The Institutional Review Board of Sun Yat-sen University Cancer Center approved this study. Informed consent was obtained from all study participants. WHJ and YQH devised the project and the main conceptual ideas; YQH, WQX, DHL, and TMW wrote the original draft; DHL and TMW performed the computational analyses; TMW, DWY, CMD, and WLZ contributed to implementation of data processing and analyses; DWY, CMD, YL, WLZ, RWX, LL, HD, XT, YW, TZ, XZL, PFZ, XHZ, SDZ, YZH, MT, YZ, YC and JBZ contributed to the sample preparation; TMW and WLZ contributed to the RNA-seq quantification and quality control pipeline; ETC, ZZ, GH, SMC, QL, LF, YS, MLL, HOA, WY, and THL contributed to the interpretation of the results; YQH, WQX, and TMW revised and wrote the final version of the manuscript; verified the analytical methods; WHJ supervised the project. 
All authors read and approved the final manuscript. Methods and materials are available in the supplementary file. The datasets generated and used during the current study are available at Research Data Deposit (RDD) public platform (www.researchdata.org.cn) with the approval RDD number of RDDB2021406340.
- Research Article
141
- 10.1101/gr.209759.116
- Oct 19, 2016
- Genome Research
Gene-by-environment (GxE) interactions determine common disease risk factors and biomedically relevant complex traits. However, quantifying how the environment modulates genetic effects on human quantitative phenotypes presents unique challenges. Environmental covariates are complex and difficult to measure and control at the organismal level, as found in GWAS and epidemiological studies. An alternative approach focuses on the cellular environment using in vitro treatments as a proxy for the organismal environment. These cellular environments simplify the organism-level environmental exposures to provide a tractable influence on subcellular phenotypes, such as gene expression. Expression quantitative trait loci (eQTL) mapping studies identified GxE interactions in response to drug treatment and pathogen exposure. However, eQTL mapping approaches are infeasible for large-scale analysis of multiple cellular environments. Recently, allele-specific expression (ASE) analysis emerged as a powerful tool to identify GxE interactions in gene expression patterns by exploiting naturally occurring environmental exposures. Here we characterized genetic effects on the transcriptional response to 50 treatments in five cell types. We discovered 1455 genes with ASE (FDR < 10%) and 215 genes with GxE interactions. We demonstrated a major role for GxE interactions in complex traits. Genes with a transcriptional response to environmental perturbations showed sevenfold higher odds of being found in GWAS. Additionally, 105 genes that indicated GxE interactions (49%) were identified by GWAS as associated with complex traits. Examples include the GIPR–caffeine interaction and obesity, and the LAMP3–selenium interaction and Parkinson disease. Our results demonstrate that comprehensive catalogs of GxE interactions are indispensable to thoroughly annotate genes and bridge epidemiological and genome-wide association studies.
- Research Article
26
- 10.1093/gbe/evx080
- May 1, 2017
- Genome Biology and Evolution
Polymorphism in cis-regulatory sequences can lead to different levels of expression for the two alleles of a gene, providing a starting point for the evolution of gene expression. Little is known about the genome-wide abundance of genetic variation in gene regulation in natural populations but analysis of allele-specific expression (ASE) provides a means for investigating such variation. We performed RNA-seq of multiple tissues from population samples of two closely related flycatcher species and developed a Bayesian algorithm that maximizes data usage by borrowing information from the whole data set and combines several SNPs per transcript to detect ASE. Of 2,576 transcripts analyzed in collared flycatcher, ASE was detected in 185 (7.2%) and a similar frequency was seen in the pied flycatcher. Transcripts with statistically significant ASE commonly showed the major allele in >90% of the reads, reflecting that power was highest when expression was heavily biased toward one of the alleles. This would suggest that the observed frequencies of ASE likely are underestimates. The proportion of ASE transcripts varied among tissues, being lowest in testis and highest in muscle. Individuals often showed ASE of particular transcripts in more than one tissue (73.4%), consistent with a genetic basis for regulation of gene expression. The results suggest that genetic variation in regulatory sequences commonly affects gene expression in natural populations and that it provides a seedbed for phenotypic evolution via divergence in gene expression.
- Research Article
9
- 10.1038/s41598-021-83459-8
- Feb 17, 2021
- Scientific Reports
Differential abundance of allelic transcripts in a diploid organism, commonly referred to as allele-specific expression (ASE), is a biologically significant phenomenon and can be examined using single nucleotide polymorphisms (SNPs) from RNA-seq. Quantifying ASE helps to identify and understand cis-regulatory mechanisms that influence gene expression, and thereby assists in identifying causal mutations. This study examines ASE in breast muscle, abdominal fat, and liver of commercial broiler chickens using variants called from a large subset of the samples (n = 68). ASE analysis was performed using a custom tool, the VCF ASE Detection Tool (VADT), which detects ASE of biallelic SNPs using a binomial test. On average, ~174,000 SNPs in each tissue passed our filtering criteria and were considered informative, of which ~24,000 (~14%) showed ASE. Of all ASE SNPs, only 3.7% exhibited ASE in all three tissues, with ~83% showing ASE specific to a single tissue. When ASE genes (genes containing ASE SNPs) were compared between tissues, the overlap among all three tissues increased to 20.1%. Our results indicate that ASE genes show tissue-specific enrichment patterns, but all three tissues showed enrichment for pathways involved in translation.
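The binomial test mentioned in this abstract can be illustrated with a minimal sketch. This is not the VADT implementation: the function name `ase_binomial_p`, the null allelic ratio of 0.5, and the read counts are assumptions made for illustration only. The idea is that, at a heterozygous biallelic SNP with no allelic imbalance, reads should split evenly between the reference and alternate alleles, and a two-sided exact binomial test measures how surprising the observed split is.

```python
from math import exp, lgamma, log


def ase_binomial_p(ref_reads: int, alt_reads: int, p_null: float = 0.5) -> float:
    """Two-sided exact binomial p-value for allelic imbalance at one SNP.

    Hypothetical helper: p_null is the expected fraction of reference reads
    under no ASE (assumed 0.5 here) and must lie strictly between 0 and 1.
    """
    n = ref_reads + alt_reads
    if n == 0:
        return 1.0  # no informative reads, so no evidence of imbalance

    def pmf(k: int) -> float:
        # Binomial probability computed in log space to stay stable at high depth.
        log_coef = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
        return exp(log_coef + k * log(p_null) + (n - k) * log(1.0 - p_null))

    observed = pmf(ref_reads)
    # Sum the probability of every outcome at most as likely as the observed one.
    total = sum(pk for pk in (pmf(k) for k in range(n + 1))
                if pk <= observed * (1.0 + 1e-7))
    return min(1.0, total)


# A balanced SNP versus a strongly imbalanced one (made-up read counts).
print(ase_binomial_p(50, 50))
print(ase_binomial_p(90, 10))
```

In practice the per-SNP p-values would still need multiple-testing correction across all informative SNPs, and a null ratio other than 0.5 may be preferable where reference mapping bias is a concern.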
- Conference Article
- 10.3920/978-90-8686-940-4_496
- Dec 31, 2022
Allele-specific expression (ASE) analysis improves the understanding of the cis-regulation of transcription. Herein, we used imputed SNPs along with RNA-Seq data from the Longissimus thoracis muscle of 190 Nelore steers to identify functional cis-regulatory variants from ASE analysis. Using a binomial test, we identified 38,177 SNPs in ASE regions (ASE SNPs; FDR ≤ 0.05). We then searched for aseQTLs (SNPs potentially regulating the ASE) by comparing their heterozygosity to the measured allelic ratio under a Wilcoxon rank-sum test. We identified 21,543 aseQTLs potentially regulating a total of 430 ASE SNPs (FDR ≤ 0.05). Based on a linear model, ASE SNPs and aseQTLs were associated with transcript abundance. We identified 3,333 SNPs acting as cis-eQTLs (FDR ≤ 0.05). Results were integrated with previous ASE, functional regions, and meat quality-related differentially expressed genes data. This study described novel SNPs potentially regulating the transcription of genes that may affect beef traits.
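The FDR ≤ 0.05 thresholds quoted in this abstract imply a multiple-testing correction over many per-SNP tests. The abstract does not say which procedure was used, so the sketch below assumes the widely used Benjamini-Hochberg adjustment; the function name `benjamini_hochberg` and the example p-values are hypothetical.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumed procedure for illustration; the study's actual FDR method
    is not stated in the abstract.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up p-values for four SNPs; keep those with adjusted value <= 0.05.
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
significant = [i for i, q in enumerate(adj) if q <= 0.05]
print(adj, significant)
```

SNPs whose adjusted value falls at or below the chosen threshold would be reported as significant at that FDR level.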
- Research Article
1
- 10.1186/s13287-025-04657-z
- Sep 25, 2025
- Stem cell research & therapy
Tissue engineering technology has limited application in bone tissue regeneration because the mechanism remains unclear. SUV39H1 is a well-characterized histone methyltransferase, but its specific role in bone regeneration of dental pulp stem cells (DPSCs) remains unclear. Mitochondrial energy metabolism plays a regulatory role in osteogenesis, with lipid metabolites serving as critical substrates to fuel this process. FASN has been established as a key regulator of fatty acid metabolism. Therefore, we speculate that SUV39H1 influences the osteogenic differentiation of DPSCs through the mediation of FASN. However, it is still unclear how the expression of SUV39H1 is regulated. Alkaline phosphatase activity and alizarin red staining were used to detect the osteogenic differentiation of DPSCs. Real-time reverse transcription polymerase chain reaction (RT-PCR) and Western blot were performed to detect gene expression levels. Cranioparietal bone replantation in rats and subcutaneous replantation in nude mice were used to confirm bone tissue regeneration. The Seahorse Cell Mitochondria Stress Test was used to detect the oxygen consumption rate. Co-immunoprecipitation and GST pull-down confirmed the protein complex. Lipid metabolism sequencing was used to detect the lipid metabolites. Software-based prediction tools were used to analyze gene conservation and interaction networks. A dual-luciferase reporter gene assay was used to detect SUV39H1 regulation by miRNA. SUV39H1 promoted osteogenic differentiation and bone regeneration in DPSCs. Our results further demonstrated that SUV39H1 enhanced the osteogenic differentiation of DPSCs by promoting lipid metabolism and subsequent mitochondrial energy metabolism. Upon exploring the mechanism by which SUV39H1 regulates lipid metabolism and mitochondrial function, SUV39H1 was found to bind to non-histone and methylated FASN. Simultaneously, FASN was degraded by ubiquitination after SUV39H1 bound to FASN.
Thus, SUV39H1 was speculated to methylate FASN and subsequently recruit a ubiquitination enzyme targeting FASN for degradation. This process modulated lipid and mitochondrial energy metabolism to facilitate the bone regeneration of DPSCs. Regarding the mechanism regulating SUV39H1 expression, miR-4788 was found to bind the 3' UTR of SUV39H1 and silence its expression. Overall, SUV39H1 facilitated the osteogenic differentiation of DPSCs by modulating lipid metabolism and affected mitochondrial energy metabolism through FASN via non-histone methylation and ubiquitination mechanisms. The expression of SUV39H1 was regulated by miR-4788.
- Research Article
- 10.1161/atvb.37.suppl_1.200
- May 1, 2017
- Arteriosclerosis, Thrombosis, and Vascular Biology
Single nucleotide polymorphisms (SNPs) in the human 8q24 locus have repeatedly been associated by genome-wide association studies (GWAS) with multiple human metabolic traits such as plasma lipids and coronary artery disease. These SNPs lie in a non-coding region ~30 kb downstream of the gene Tribbles-1 (TRIB1). While a large body of in vivo evidence from Trib1 gain- and loss-of-function mouse models strongly supports TRIB1 as the gene of interest at this locus, there has been no demonstrated association between SNPs in the GWAS region and the expression of the neighboring TRIB1 gene to date. To address this, we performed RNA-seq and genome-wide genotyping on 42 human cadaveric liver samples, and next performed allele-specific expression (ASE) analysis of the TRIB1 locus in 23 samples that harbored heterozygous coding SNPs in the TRIB1 gene, allowing for allelic discrimination. While subjects homozygous for the major allele of the lead GWAS SNP (rs2954029) had even allelic expression (p = 0.29 by non-parametric t-test), samples heterozygous for the minor allele at rs2954029 exhibited imbalanced allelic expression (p < 0.0005). This finding suggests that SNPs in the TRIB1 locus do affect TRIB1 gene expression; however, it remains unclear which SNP(s) is causal. To that end, we used ENCODE data from primary hepatocytes and HepG2 cells to identify 10 genomic regions of interest that contained SNPs with significant GWAS p-values and epigenetic markers consistent with enhancer activity. Two regions (R2, R9) exhibited strong enhancer activity when cloned in front of a minimal promoter driving a luciferase reporter (pGL4.23), increasing luciferase activity 2- to 5-fold as compared to empty vector (p < 0.05). Introduction of the minor alleles at three separate candidate SNPs in R2 reduced enhancer activity. In summary, we show that common variation in the 8q24 GWAS locus does affect TRIB1 gene expression, as measured by ASE in human livers.
We also identified multiple enhancer elements that exhibit reduced activity when the minor alleles of GWAS SNPs are introduced into them. We are currently pursuing modification of the endogenous locus in HepG2 cells via CRISPR/Cas genome editing to further elucidate the roles of both these enhancer elements and the SNPs contained in them.