Abstract

Systems genetics holds the promise to decipher complex traits by interpreting their associated SNPs through gene regulatory networks derived from comprehensive multi-omics data of cell types, tissues, and organs. Here, we propose SpecVar to integrate paired chromatin accessibility and gene expression data into a context-specific regulatory network atlas and regulatory categories, conduct heritability enrichment analysis with genome-wide association studies (GWAS) summary statistics, identify relevant tissues, and estimate relevance correlation to depict common genetic factors acting in the shared regulatory networks between traits. Our method improves power over existing approaches by associating SNPs with context-specific regulatory elements to assess heritability enrichments and by explicitly prioritizing gene regulations underlying relevant tissues. Ablation studies, independent data validation, and comparison experiments with existing methods on GWAS of six phenotypes show that SpecVar can improve heritability enrichment, accurately detect relevant tissues, and reveal causal regulations. Furthermore, SpecVar correlates the relevance patterns for pairs of phenotypes and better reveals shared SNP-associated regulations of phenotypes than existing methods. Studying GWAS of 206 phenotypes in UK Biobank demonstrates that SpecVar leverages the context-specific regulatory network atlas to prioritize phenotypes’ relevant tissues and shared heritability for biological and therapeutic insights. SpecVar provides a powerful way to interpret SNPs via context-specific regulatory networks and is available at https://github.com/AMSSwanglab/SpecVar, copy archived at swh:1:rev:cf27438d3f8245c34c357ec5f077528e6befe829.
Editor's evaluation

In this article, the authors develop a method to identify potentially causal tissues and cell types for complex diseases by performing heritability enrichment estimation using information from gene regulatory networks. This article is of significant interest to geneticists and biologists interested in unraveling the molecular basis of disease. The key claims of the article are well supported by the data. The work has the potential to inform our understanding of the genetics of complex diseases. https://doi.org/10.7554/eLife.82535.sa0

Introduction

Genome-wide association studies (GWAS) have had great success in identifying thousands of genetic variants significantly associated with a variety of human complex phenotypes. Interpretation of those genetic variants holds the key to biological mechanism discovery and personalized medicine practice. However, this task is hindered by a genetic architecture in which heritability is distributed across SNPs throughout the genome, in linkage disequilibrium with one another, that cumulatively affect complex traits. By quantifying the contribution of true polygenic signal while accounting for linkage disequilibrium, LD Score regression (LDSC) provides a widely appreciated method to estimate heritability (Bulik-Sullivan et al., 2015b) and genetic correlation (Bulik-Sullivan et al., 2015a) from GWAS summary statistics. Another obstacle to genetic variant interpretation is that SNPs contribute to phenotype through gene regulatory networks in certain cellular contexts, that is, causal tissues or cell types. Those tissues are characterized by different types of epigenetic data, which give the active regions of the genome that interact with transcription factors (TFs) to regulate gene expression. Stratified LDSC extends LDSC and can estimate the partitioned heritability enrichment in the functional categories (Finucane et al., 2015).
The categories can be nonspecific genome annotations (such as coding, UTR, promoter, and intronic regions) and context-specific regulatory regions called from chromatin data of different cell types, such as DNase-I hypersensitive sites from DNase-seq data, accessible peaks from ATAC-seq data, histone marker, or TF binding sites from ChIP-seq data (AAP and SAP). Using expression data, the functional categories can be alternatively constructed by the 100 kb windows around the transcribed regions of specifically expressed genes (SEGs) (Finucane et al., 2018). Essentially, these strategies summarize the high-dimensional SNP signals from the whole genome into partitioned heritability enrichments of functional categories and successfully identify relevant cellular tissues for many phenotypes (Finucane et al., 2015). The rapid increase of multi-modal data resources, especially matched gene expression, chromatin states, and TF binding sites (i.e., measured on the same sample), offers an exciting opportunity to construct better functional categories for estimating context-specific heritability enrichment. One efficient way is to integrate large-scale epigenomic and transcriptomic data spanning diverse human contexts to infer regulatory networks (Duren et al., 2017). Those regulatory networks provide rich context-specific information and usually comprise TFs, regulatory elements (REs), and target genes (TGs). Recently, we developed the PECA2 model to infer regulatory network from paired expression and chromatin accessibility data (Duren et al., 2017; Duren et al., 2020). The inferred regulatory networks have been used to identify the master regulators in stem cell differentiation (Li et al., 2019) and interpret conserved regions for the nonmodel organisms (Xin et al., 2020). Noncoding genetic variants can be interpreted in the regulatory networks on how they cooperatively affect complex traits through gene regulation in certain tissues or cell types. 
For example, the regulatory network of cranial neural crest cells (CNCC) has been used to elucidate how genetic variants affect human facial morphology (Feng et al., 2021). Regulatory networks have also helped identify two kinds of cell types relevant to COVID-19 severity (Feng et al., 2022b). RSS-NET utilizes gene regulatory networks of multiple contexts and shows better tissue enrichment estimation by decomposing the total effect of an SNP through TF-TG regulations (Zhu et al., 2021) and HiChIP RE-TG regulations (Ma et al., 2022). The phenotype-associated SNPs are revealed to be functional in a tissue- or cell-type-specific manner (Westra and Franke, 2014). The advances in constructing regulatory networks and interpreting genetic variants with regulatory networks motivate us to (1) assemble a more comprehensive context-specific regulatory network atlas by using paired expression and accessibility data across diverse cellular contexts; (2) build context-specific regulatory categories by focusing on each RE’s specificity compared with other contexts; and (3) systematically identify enriched tissues or cell types, relevance correlation, and the underlying SNP-associated regulations. Specifically, we propose SpecVar to first leverage the publicly available paired expression and chromatin accessibility data in ENCODE and ROADMAP to systematically construct context-specific regulatory networks of 77 human contexts, which cover major cell types and germ layer lineages. This atlas serves as a valuable resource for interpreting genetic variants in multicellular contexts. SpecVar then constructs regulatory categories in the genome with this atlas, which can significantly improve the heritability enrichment. Based on the heritability enrichment and p-value in our regulatory categories, SpecVar defines the relevance score to give the context-specific representation of the GWAS.
In this article, we use six well-studied phenotypes, large-scale facial morphology, and UKBB phenotypes to show that, for a single phenotype, the relevance score of SpecVar can identify relevant tissues more efficiently; and for multiple phenotypes, SpecVar can reveal relevance correlation by common relevant tissues and the underlying shared SNP-associated regulations. Compared to the existing methods, SpecVar shows novelty in three aspects: (1) SpecVar integrates paired gene expression and chromatin accessibility data, which are two types of easily accessible data with rich information, into regulatory networks. The gene expression and chromatin accessibility in the regulatory network complement each other to reveal the high-quality active regulatory elements and genes for certain tissues or cell types to interpret genetic variants; (2) SpecVar highlights the comparison with other contexts by specificity to narrow down the regulatory molecules; and (3) SpecVar is more interpretable because it can explain the relevance to tissues by SNP-associated regulatory networks and interpret phenotype correlation through common relevant tissues and shared SNP-associated regulatory networks. These results show that SpecVar serves as a promising tool for post-GWAS analysis.

Results

Overview of the SpecVar method

SpecVar assembled a context-specific regulatory network atlas and built the context-specific representation (relevance score and SNP-associated regulatory network) of GWAS summary statistics based on heritability enrichment. Figure 1 summarizes the major steps of SpecVar to construct the context-specific regulatory network atlas and regulatory categories, calculate heritability enrichment and SNP-associated regulatory network, and investigate interpretable relevant tissues and relevance correlation.

Figure 1. Overview of SpecVar. (a) SpecVar constructs an atlas of context-specific regulatory networks and regulatory categories.
Then SpecVar represents genome-wide association studies (GWAS) summary statistics into relevance scores and SNP-associated regulatory subnetworks. (b) For a single phenotype, SpecVar can use relevance score and SNP-associated regulatory subnetworks to identify and interpret relevant tissues. (c) For multiple phenotypes, based on relevance score, SpecVar can reveal relevance correlation, common relevant tissues, and shared regulations. We first constructed regulatory networks of M (M = 77 in this work) contexts. Each network is represented by a set of relations between TF and RE and between RE and TG. The M contexts included samples from all three germ layers, such as ‘frontal cortex’ (ectoderm), ‘fetal thymus’ (mesoderm), and ‘body of pancreas’ (endoderm), which ensured the wide coverage and system-level enrichment (Figure 1—figure supplement 1). The context-specific regulatory networks were extracted based on the specificity of REs compared with other contexts, considering the hierarchical relationship of M contexts (Materials and methods, Supplementary file 1a). The REs in the ith context-specific regulatory network were pooled to form a regulatory category Ci in the genome, which restricted the annotation to context-specific REs associated with active binding TFs and nearby regulated TGs (Figure 1a). Our atlas led to M regulatory categories, C1,C2,…,CM of SpecVar. Given GWAS summary statistics, the M regulatory categories allowed partitioned heritability enrichment analysis by stratified LDSC. 
For a phenotype, stratified LDSC modeled genome-wide polygenic signals, partitioned SNPs into categories with different contributions for heritability, and considered SNP’s linkage disequilibrium with the following polygenic model:

$$E[\chi_j^2] = N \sum_i \tau_i l_{j,i} + N a + 1 \qquad (1)$$

Here, $\chi_j^2$ is the marginal association of SNP $j$ from GWAS summary statistics; $N$ is the sample size; $l_{j,i} = \sum_{k \in C_i} r_{jk}^2$ is the LD score of SNP $j$ in the $i$th regulatory category $C_i$, where $r_{jk}$ is the genotype correlation between SNP $j$ and SNP $k$ in population; $a$ measures the contribution of confounding biases; and $\tau_i$ represents the heritability enrichment of SNPs in $C_i$. Stratified LDSC estimated the p-value $p_i$ for the heritability enrichment $\tau_i$ by block Jackknife (Finucane et al., 2015). We defined the relevance score ($R_i$) of this phenotype to the $i$th context (Figure 1a) as follows by combining the heritability enrichment and statistical significance (p-value):

$$R_i = \tau_i \cdot (-\log p_i) \qquad (2)$$

The relevance score (R score) provided a decision trade-off between the heritability enrichment and p-value resulting from a hypothesis test. It offered a robust means to rank and select relevant tissues for a given phenotype (Xiao et al., 2014). Meanwhile, SpecVar associated SNPs with context-specific regulatory networks for biological interpretation. We defined the association score (A score) to prioritize the REs by combining its regulatory strength and association significance with the phenotype (averaged -log(p-value) of SNPs located near the RE and downweighted by their LD scores and distances to this RE). We extracted the REs with significant A scores (FDR≤0.05), as well as their directly linked upstream TFs, downstream TGs, and associated SNPs, to form the SNP-associated regulatory subnetwork (Figure 1a, Materials and methods). Given GWAS summary statistics of a phenotype, SpecVar obtained M SNP-associated regulatory subnetworks, $G_1, G_2, \ldots, G_M$, allowing us to interpret relevant tissues by SNP’s regulation mechanism.
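Equations 1 and 2 can be illustrated with a minimal Python sketch. It is not the SpecVar implementation: the function names, the example numbers, and in particular the base-10 logarithm are assumptions made for illustration, since the text above writes only "log" without stating a base.

```python
import math

def expected_chi2(n, tau, ld_scores, a):
    """Equation 1: E[chi2_j] = N * sum_i(tau_i * l_{j,i}) + N*a + 1.

    tau and ld_scores are per-category lists for a single SNP j.
    """
    return n * sum(t * l for t, l in zip(tau, ld_scores)) + n * a + 1.0

def relevance_score(tau_i, p_i, log_base=10.0):
    """Equation 2: R_i = tau_i * (-log p_i).

    The log base is an assumption; it is not stated in this section.
    """
    return tau_i * -math.log(p_i, log_base)

def rank_contexts(enrichment, pvalues):
    """Return (context, R score) pairs sorted from most to least relevant."""
    scores = {c: relevance_score(enrichment[c], pvalues[c]) for c in enrichment}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical enrichments and p-values for three contexts.
enrichment = {"liver": 100.0, "cortex": 20.0, "CNCC": 5.0}
pvalues = {"liver": 0.01, "cortex": 0.001, "CNCC": 0.5}
ranking = rank_contexts(enrichment, pvalues)
```

In this toy input, a context with a large enrichment but a weak p-value is ranked below one with both a sizeable enrichment and strong significance, which is the trade-off the R score is meant to capture.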
The relevance score to diverse human contexts and SNP-associated regulatory networks allowed SpecVar to perform post-GWAS analysis. For a single phenotype, the R scores indicated the relevance of this phenotype to M contexts, which can be used to identify relevant tissues. Then in the relevant tissues, we can investigate the SNP-associated regulatory subnetwork to interpret the relevance (Figure 1b, Materials and methods). For multiple phenotypes, we can correlate the R score vectors in multiple contexts to define relevance correlation (Finucane et al., 2018). The relevance correlation might give insights into the association of phenotypes since SpecVar can further interpret the relevance correlation between two phenotypes by common relevant tissues and the shared SNP-associated regulatory subnetwork (Figure 1c, Materials and methods).

Context-specific regulatory networks improve heritability enrichment

We first designed experiments to show that context-specific regulatory networks could improve heritability enrichment. We collected GWAS summary statistics of six phenotypes, including two lipid phenotypes (Willer et al., 2013): low-density lipoprotein (LDL) and total cholesterol; two human intelligential phenotypes (Lee et al., 2018): educational attainment and cognitive performance; and two craniofacial bone phenotypes: brain shape (Naqvi et al., 2021) and facial morphology (Xiong et al., 2019). We used these six phenotypes as a benchmark since their relevant tissues have been previously studied and partially known: lipid phenotypes are associated with the liver for its key role in lipid metabolism (Nguyen et al., 2008); human intelligential phenotypes are associated with brain tissues (Goriounova and Mansvelder, 2019); facial morphology and brain shape have shared heritability in cranial neural crest cells (Naqvi et al., 2021).
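The relevance correlation described in the overview, where the R score vectors of two phenotypes are correlated across the same M contexts, can be sketched as follows. This is an illustrative sketch only: the choice of Pearson correlation, the `threshold` used to call a context relevant, and all example values are assumptions not taken from the text.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length R score vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def common_relevant_contexts(r_a, r_b, contexts, threshold):
    """Contexts whose R score exceeds a (hypothetical) threshold in both phenotypes."""
    return [c for c, a, b in zip(contexts, r_a, r_b) if a > threshold and b > threshold]

# Hypothetical R score vectors for two phenotypes over four contexts.
contexts = ["CNCC", "liver", "frontal cortex", "fibroblast"]
r_trait_a = [500.0, 10.0, 30.0, 20.0]
r_trait_b = [130.0, 5.0, 12.0, 125.0]
corr = pearson(r_trait_a, r_trait_b)
shared = common_relevant_contexts(r_trait_a, r_trait_b, contexts, threshold=100.0)
```

A rank-based correlation could be substituted for `pearson` if the R scores are heavy-tailed; the overall structure of the comparison stays the same.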
We compared our context-specific regulatory networks with four alternative methods of functional categories: all regulatory elements (ARE), all accessible peaks (AAP), specifically accessible peaks (SAP) (Finucane et al., 2015), and specifically expressed genes (Finucane et al., 2018) (SEG) (Materials and methods). First, we showed that SpecVar could achieve higher heritability enrichment in the relevant tissues than other methods (Supplementary file 1b). For LDL, SpecVar obtained a heritability enrichment of 678.91 in the ‘right lobe of liver’, while ARE, SAP, AAP, and SEG gave heritability enrichment of 113.34, –42.09, 50.95, and 4.47, respectively. We conducted Welch’s t-test to assess the significance of the difference between SpecVar and other methods and found that the heritability enrichment of SpecVar was significantly higher than ARE (p=6.9×10−4), SAP (p=1.4×10−4), AAP (p=3.4×10−4), and SEG (p=2.1×10−4) (Figure 2a). For total cholesterol, SpecVar also gave significantly higher heritability enrichment in ‘right lobe of liver’ than ARE (p=5.7×10−4), SAP (p=4.4×10−5), AAP (p=1.6×10−4), and SEG (p=7.7×10−5) (Figure 2b). For educational attainment and cognitive performance, they were relevant to brain tissues: ‘frontal cortex’, ‘cerebellum’, ‘caudate nucleus’, ‘Ammon’s horn’, and ‘putamen’. SpecVar obtained the highest averaged heritability enrichment in brain tissues among these methods (Figure 2—figure supplement 1a and b). In the ‘frontal cortex’, SpecVar had significantly higher heritability enrichment than ARE (p=1.2×10−5), SAP (p=2.0×10−6), AAP (p=3.0×10−6), and SEG (p=3.0×10−6) for educational attainment (Figure 2c). And for cognitive performance in ‘frontal cortex’, SpecVar also had significantly higher heritability enrichment than ARE (p=9.0×10−6), SAP (p=2.0×10−6), AAP (p=1.0×10−6), and SEG (p=1.0×10−6) (Figure 2d). 
For brain shape, SpecVar obtained a significantly higher heritability enrichment in its relevant context ‘CNCC’ than the other four methods (ARE p=5.9×10−4, SAP p=6.7×10−4, AAP p=7.5×10−4, and SEG p=8.1×10−5, Figure 2e). For facial morphology, SpecVar also gave a much higher heritability enrichment in ‘CNCC’ than the other four methods (ARE p=9.0×10−6, SAP p=1.0×10−6, AAP p=7.0×10−6, and SEG p=1.0×10−6, Figure 2f). Second, beyond the known relevant tissues, these complex traits may be relevant to other contexts. So, for every method, we ranked the heritability enrichment to get the top 10 contexts and used these top contexts’ heritability enrichment to compare the ability of these five methods to explain heritability. SpecVar also showed the best performance of heritability enrichment among the five methods (Figure 2g). Taking brain shape as an example, SpecVar achieved significantly higher heritability enrichment (averaged heritability enrichment 96.13) than ARE (averaged heritability enrichment 26.77, t-test p=3.4×10−3), SAP (42.92, p=1.9×10−2), AAP (20.34, p=1.8×10−3), and SEG (2.25, p=3.1×10−4). We found that specificity could significantly improve the heritability enrichment. Among the five methods we compared, SpecVar and SAP are based on the specificity of ARE and AAP, respectively. SpecVar showed significantly higher heritability enrichment than ARE, and SAP showed significantly higher heritability enrichment than AAP (Figure 2g). For brain shape, SpecVar obtained averaged heritability enrichment of 96.31 of the top 10 contexts, which was significantly higher than ARE (averaged heritability enrichment 26.77, p=3.4×10−3); SAP obtained averaged heritability enrichment of 42.92, which was significantly higher than AAP’s averaged heritability enrichment of 20.34 (p=2.7×10−3). The other five phenotypes showed a similar improvement (Figure 2g).
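The method comparisons above rely on Welch’s t-test, which compares two means without assuming equal variances. A minimal sketch of the statistic and its Welch–Satterthwaite degrees of freedom follows; the sample values are hypothetical, and how the enrichment samples were generated for each method is not specified in this section.

```python
import math
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Return (t, df) for Welch's unequal-variance t-test."""
    na, nb = len(sample_a), len(sample_b)
    # Squared standard errors of the two sample means.
    va = variance(sample_a) / na
    vb = variance(sample_b) / nb
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df

# Hypothetical heritability enrichment samples for two methods.
t_stat, dof = welch_t([10.0, 12.0, 14.0], [1.0, 2.0, 3.0])
```

The p-value is then obtained from the t distribution with `dof` degrees of freedom, for example with a statistics library; that step is left out here to keep the sketch dependency-free.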
Figure 2. Comparison of heritability enrichment between SpecVar and four alternate methods: all regulatory elements (ARE), all accessible peaks (AAP), specifically accessible peaks (SAP), and specifically expressed genes (SEG). (a) The heritability enrichment of low-density lipoprotein (LDL) in the ‘right lobe of liver’. (b) The heritability enrichment of total cholesterol in the ‘right lobe of liver’. (c) The heritability enrichment of educational attainment in the ‘frontal cortex’. (d) The heritability enrichment of cognitive performance in the ‘frontal cortex’. (e) The heritability enrichment of brain shape in cranial neural crest cell (CNCC). (f) The heritability enrichment of facial morphology in ‘CNCC’. The sample size of error bars for (a-f) is 200. (g) Boxplot of top 10 contexts’ heritability enrichment of six phenotypes for five methods.

To explore the heritability enrichment improvement of SpecVar, we conducted ablation analysis to study the contribution of two important parts of SpecVar: (1) regulatory network by integrating gene expression and chromatin accessibility data, and (2) specificity by comparing with other contexts. Figure 2—figure supplement 1c shows the relationship and difference of the five methods: SEG is the combination of gene expression and specificity; AAP is only from chromatin accessibility; SAP is the combination of chromatin accessibility and specificity; ARE integrates gene expression and chromatin accessibility; and SpecVar considers integration of gene expression, chromatin accessibility, and specificity. To analyze the effect of the regulatory network in heritability enrichment, we compared SpecVar with SAP and showed the effect of gene expression data. We compared SpecVar with SEG and showed the effect of chromatin accessibility data. We compared SpecVar with ARE and showed the contribution of specificity (Figure 2—figure supplement 1d).
To quantify the effect of each component, we calculated the fold change of different methods’ heritability enrichments. We found that chromatin accessibility, which was part of the regulatory network, showed the highest effect in improving heritability enrichments for all six phenotypes. This is consistent with the fact that most genetic variants are located in the noncoding regulatory regions (Claussnitzer et al., 2015; Kumar et al., 2012; Smemo et al., 2014) and chromatin accessibility gives the direct functional evidence for genetic variants. The specificity in SpecVar also contributed at least a four-fold improvement in heritability enrichment (Figure 2—figure supplement 1e). In summary, the experiment on six phenotypes’ GWAS summary statistics showed that SpecVar achieved the best performance in explaining the heritability of phenotypes. These results demonstrated the power of integrating expression and chromatin accessibility data and considering contexts’ specificity.

SpecVar can accurately reveal relevant tissues for phenotypes

After establishing that SpecVar could use the context-specific regulatory networks to improve heritability enrichment, we next showed that for a given phenotype, SpecVar could use R scores to identify relevant tissues more accurately than other methods. In this experiment, we also used the above six phenotypes with their known relevant tissues as a benchmark and first compared SpecVar with the other two specificity-based methods: SAP and SEG (Materials and methods). For two lipid phenotypes, SpecVar revealed that both LDL and total cholesterol were most relevant to the ‘right lobe of liver’ (Figure 3a and b, Table 1), which was consistent with the existing reports that the liver plays a central role in lipid metabolism, serving as the center for lipoprotein uptake, formation, and export to the circulation (Jha et al., 2018; Nguyen et al., 2008).
SpecVar found that LDL and total cholesterol were also significantly relevant to the ‘fetal adrenal gland’; the adrenal cortex has been reported to play an important role in lipid metabolism (Boyd et al., 1983). However, SAP and SEG failed to prioritize liver tissue as the significant relevant tissue. For LDL, SAP identified the ‘frontal cortex’ to be the most relevant tissue. SEG identified the most relevant tissue to be ‘HepG2’, which is a human hepatoma cell line, but the relevance score was relatively low (Figure 3a, Supplementary file 1c). For total cholesterol, SAP identified the ‘fetal adrenal gland’ and SEG obtained ‘HepG2’ as the most relevant tissues (Figure 3b, Supplementary file 1c).

Figure 3. Comparison of identifying proper relevant tissues between SpecVar and two other LD Score regression (LDSC)-based methods: specifically accessible peaks (SAP) and specifically expressed genes (SEG). The top five relevant tissues ranked by the relevance score of SpecVar, SAP, and SEG for (a) low-density lipoprotein (LDL), (b) total cholesterol, (c) educational attainment, (d) cognitive performance, (e) brain shape, and (f) facial morphology. The sample size of error bars for (a-f) is 100.

Table 1. The total sample size, number of significant SNP associations, and SpecVar-identified relevant tissues of six phenotypes. For each relevant tissue, we have two numbers in the bracket: the first is the R score and the second is its false discovery rate (FDR) q-value.
Trait | Sample size | SNP association | Relevant tissues (R score and its FDR q-value)
Low-density lipoprotein | 173,082 | 3077 | Right lobe of liver (722.74, 1.2e-3), frontal cortex (146.54, 3.3e-4), gastrocnemius medialis (128.02, 4.7e-4), fetal adrenal gland (123.52, 9.5e-4)
Total cholesterol | 187,365 | 4169 | Right lobe of liver (714.75, 1.0e-2), fetal adrenal gland (216.74, 4.3e-17), H7-hESC (130.73, 2.1e-3)
Educational attainment | 1,070,751 | 30,519 | Frontal cortex (167.23, 3.7e-7)
Cognitive performance | 257,841 | 13,732 | Frontal cortex (216.62, 7.0e-28), Ammon’s horn (107.25, 2.3e-10)
Brain shape | 19,644 | 38,630 | CNCC (512.56, 2.7e-44), trophoblast (144.15, 3.8e-9)
Facial morphology | 10,115 | 495 | CNCC (134.95, 8.0e-10), fibroblast (128.81, 3.8e-26)

CNCC, cranial neural crest cell.

For two human intelligential phenotypes, SpecVar prioritized the ‘frontal cortex’ to be the most relevant tissue for both educational attainment and cognitive performance (Figure 3c and d, Table 1). ‘Frontal cortex’ is the cerebral cortex covering the front part of the frontal lobe and is implicated in planning complex cognitive behavior, personality expression, decision-making, and moderating social behavior (Gabrieli et al., 1998; Yang and Raine, 2009). There were five tissues (‘frontal cortex’, ‘Ammon’s horn’, ‘cerebellum’, ‘putamen’, and ‘caudate nucleus’) from the brain in our atlas and they were significantly higher ranked by SpecVar’s relevance score than nonbrain tissues for educational attainment (Wilcoxon rank-sum test, p=6.1×10−7, Figure 3c) and cognitive performance (p=8.0×10−6, Figure 3d). In comparison, for educational attainment, SAP prioritized brain tissues to be higher ranked than nonbrain tissues, but with a less significant p-value (p=2.3×10−3, Figure 3c, Supplementary file 1c). SEG could not rank brain tissues to be higher than nonbrain tissues (p=6.4×10−1, Figure 3c, Supplementary file 1c).
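The brain versus nonbrain comparison uses a Wilcoxon rank-sum test on relevance scores. A minimal sketch using the normal approximation is given below; it omits the tie correction to the variance, reports a one-sided p-value (whether the reported tests were one- or two-sided is not stated here), and uses hypothetical scores.

```python
import math

def rank_sum_test(group_a, group_b):
    """Wilcoxon rank-sum test (normal approximation, no tie correction).

    Returns (rank sum of group_a, z, one-sided p for group_a ranking higher).
    """
    pooled = sorted([(v, 0) for v in group_a] + [(v, 1) for v in group_b])
    rank_sum_a = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values share the average of ranks i+1 .. j.
        avg_rank = (i + 1 + j) / 2.0
        rank_sum_a += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    na, nb = len(group_a), len(group_b)
    mu = na * (na + nb + 1) / 2.0
    sigma = math.sqrt(na * nb * (na + nb + 1) / 12.0)
    z = (rank_sum_a - mu) / sigma
    p_one_sided = 0.5 * math.erfc(z / math.sqrt(2.0))  # P(Z >= z)
    return rank_sum_a, z, p_one_sided

# Hypothetical R scores: three brain tissues versus four nonbrain tissues.
w, z, p = rank_sum_test([5.0, 6.0, 7.0], [1.0, 2.0, 3.0, 4.0])
```

With only a handful of tissues per group an exact test would be preferable to the normal approximation; the approximation is used here only to keep the example short.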
For cognitive performance, SAP failed to rank brain tissues as the more relevant tissues (p=6.0×10−2, Figure 3d, Supplementary file 1c), and SEG identified brain tissues to be more relevant than nonbrain tissues but with a less significant p-value (p=3.2×10−3, Figure 3d, Supplementary file 1c). For both facial morphology and brain shape, SpecVar identified CNCC as the most relevant context (Figure 3e and f, Table 1). CNCC is a migratory cell population in early human craniofacial development that gives rise to the peripheral nervous system and many non-neural tissues such as smooth muscle cells, pigment cells of the skin, and craniofacial bones, which makes it much more related to facial morphology and brain shape than the other 76 contexts (Cordero et al., 2011; Barlow et al., 2008). Facial morphology and brain shape were also revealed to share heritability in CNCC (Naqvi et al., 2021). But the other two methods failed to identify CNCC as the most relevant context. For brain shape, SAP identified ‘H1-hESC’ and SEG identified ‘tibial nerve’ to be the most relevant tissue (Figure 3e, Supplementary file 1c). For facial morphology, SAP and SEG identified ‘foreskin’ and ‘sigmoid colon’ to be the most relevant tissues, respectively (Figure 3f, Supplementary file 1c).

We next compared SpecVar with other relevant tissue identification methods that were not based on LDSC. First, we compared SpecVar to CoCoNet, which was based on gene co-expression networks. CoCoNet is built with 38 tissues’ co-expression networks from GTEx, and we applied it to our six phenotypes. CoCoNet identified ‘Breast’ as the most relevant tissue for LDL (Figure 3—figure supplement 1a) and ‘Brain_other’ as the most relevant tissue for total cholesterol (Figure 3—figure supplement 1b). ‘Breast’ was the most relevant tissue for educational attainment (Figure 3—figure supplement 1c), and ‘Stomach’ was the most relevant tissue for cognitive performance (Figure 3—figure supplement 1d).
Since there is no CNCC sample in GTEx, CoCoNet revealed ‘Prostate’ as the most relevant tissue for brain shape and facial morphology. These results seemed less reasonable than those of SpecVar because CoCoNet did not identify liver tissues for LDL and total cholesterol and did not reveal brain tissues for educational attainment and cognitive performance. We also compared SpecVar with RolyPoly, which was a non-network-based method for discovering relevant tissues. We fitted the RolyPoly model with gene expression profiles of our 77 human contexts and applied it to GWAS of LDL, total cholesterol, educational attainment, cognitive performance, and facial morphology. RolyPoly prioritized the ‘HepG2’ cell line as the most relevant tissue for LDL and total cholesterol (Figure 3—figure supplement 2a and b). It is also reasonable for ‘HepG2’ to be relevant to lipid phenotypes because it is a nontumorigenic cell line with high proliferation rates and epithelial-like morphology that performs many differentiated hepatic functions. For educational attainment, RolyPoly did not identify the five brain tissues as the top-ranked tissues and only included ‘fetal spinal cord’ in the top five relevant tissues (Figure 3—figure supplement 2c). For cognitive performance, there were no brain tissues in the top five tissues (Figure 3—figure supplement 2d). And for facial morphology, RolyPoly failed to identify ‘CNCC’ as a relevant tissue (Figure 3—figure supplement 2e). The comparison with two non-LDSC-based methods again showed the superiority of SpecVar in identifying proper relevant tissues. After identifying the relevant tissues, SpecVar could further interpret the relevance by extracting SNP-associated regulatory subnetwork (Materials and
