Abstract

Background: Identification of functional non-coding variants and their mechanistic interpretation is a major challenge of modern genomics, especially for precision medicine. Transcription factor (TF) binding profiles and epigenomic landscapes in reference samples allow functional annotation of the genome, but do not provide ready answers regarding the effects of non-coding variants on phenotypes. A promising computational approach is to build models that predict TF-DNA binding from sequence, and use such models to score a variant's impact on TF binding strength. Here, we asked if this mechanistic approach to variant interpretation can be combined with information on genotype-phenotype associations to discover transcription factors regulating phenotypic variation among individuals.

Results: We developed a statistical approach that integrates phenotype, genotype, gene expression, TF ChIP-seq, and Hi-C chromatin interaction data to answer this question. Using drug sensitivity of lymphoblastoid cell lines as the phenotype of interest, we tested if non-coding variants statistically linked to the phenotype are enriched for strong predicted impact on DNA binding strength of a TF and thus identified TFs regulating individual differences in the phenotype. Our approach relies on a new method for predicting variant impact on TF-DNA binding that uses a combination of biophysical modeling and machine learning. We report statistical and literature-based support for many of the TFs discovered here as regulators of drug response variation. We show that the use of mechanistically driven variant impact predictors can identify TF-drug associations that would otherwise be missed. We examined in depth one reported association, that of the transcription factor ELF1 with the drug doxorubicin, and identified several genes that may mediate this regulatory relationship.

Conclusion: Our work represents initial steps in utilizing predictions of variant impact on TF binding sites for discovery of regulatory mechanisms underlying phenotypic variation. Future advances on this topic will be greatly beneficial to the reconstruction of phenotype-associated gene regulatory networks.

Highlights

  • Identification of functional non-coding variants and their mechanistic interpretation is a major challenge of modern genomics, especially for precision medicine

  • We compared a representative of leading k-mer-based methods with an advanced motif-based method to determine their relative merits in predicting transcription factor (TF) binding strengths and their changes due to single-nucleotide polymorphisms (SNPs)

  • We evaluated the above methods for the TF binding site (TFBS)-SNP impact prediction task, by asking if the SNPs with strongest effects on predicted TF binding, called “binding-change SNPs,” are enriched for allele-specific binding sites (ASB), defined as sites where ChIP-seq read counts are significantly different between alleles [22]
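
The enrichment question in the last highlight can be illustrated with a one-sided hypergeometric test. The sketch below is only an assumption about how such a test could be set up, not the evaluation procedure used in the paper; the function name, the SNP identifiers, and the toy counts are all hypothetical.

```python
from math import comb


def asb_enrichment_pvalue(all_snps, asb_snps, binding_change_snps):
    """Upper-tail hypergeometric p-value for the overlap between
    binding-change SNPs and allele-specific binding (ASB) SNPs.

    all_snps            -- background set of tested SNP ids
    asb_snps            -- SNPs labelled as ASB (restricted to background)
    binding_change_snps -- SNPs with the strongest predicted binding change
    """
    background = set(all_snps)
    asb = set(asb_snps) & background
    selected = set(binding_change_snps) & background
    overlap = len(asb & selected)

    n_total, n_asb, n_sel = len(background), len(asb), len(selected)
    # P(X >= overlap) when n_sel SNPs are drawn without replacement
    # from a background that contains n_asb ASB SNPs.
    tail = sum(
        comb(n_asb, k) * comb(n_total - n_asb, n_sel - k)
        for k in range(overlap, min(n_asb, n_sel) + 1)
    )
    return tail / comb(n_total, n_sel)


# Toy example with made-up SNP ids: 10 SNPs, 5 are ASB,
# and all 5 binding-change SNPs happen to be ASB.
snps = [f"rs{i}" for i in range(10)]
p_value = asb_enrichment_pvalue(snps, snps[:5], snps[:5])
print(p_value)
```

A small p-value here would suggest that binding-change SNPs overlap ASB sites more often than expected by chance. A real analysis would also need to consider multiple testing across TFs and the choice of background SNP set.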

Summary

Introduction

Xie et al. BMC Biology (2019) 17:62

Identification of functional non-coding variants and their mechanistic interpretation is a major challenge of modern genomics, especially for precision medicine. Transcription factor (TF) binding profiles and epigenomic landscapes in reference samples allow functional annotation of the genome, but do not provide ready answers regarding the effects of non-coding variants on phenotypes. [...] from nearby non-functional SNPs. For example, if we have prior knowledge of a relevant transcription factor (TF), the presence of a variant within a TF binding site (TFBS) may add to our confidence in the variant's regulatory potential; the assumption here is that such a variant influences the TF's binding to that site and, through it, the gene regulatory impact of the TF. Zhang et al. [5] adopted such a strategy: they combined a method for predicting changes in TF binding with multi-omics data to identify a SNP that alters the binding strength of the TF GATA3 and thereby modulates breast cancer susceptibility.
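
To make the idea of scoring a variant's effect on TF binding concrete, here is a minimal sketch using a plain position weight matrix (PWM) log-odds score. This is a deliberately simple stand-in and not the biophysical and machine-learning method described in the paper; the motif `TOY_PWM`, the uniform background, the example sequence, and all function names are hypothetical, and only the forward strand is scanned.

```python
import math

BACKGROUND = 0.25  # assumed uniform base frequency

# Hypothetical 4-bp motif (columns are per-position base probabilities).
TOY_PWM = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
]


def window_score(pwm, window):
    """Log-odds score of one motif-length window against the background."""
    return sum(math.log2(col[base] / BACKGROUND) for col, base in zip(pwm, window))


def best_score_overlapping(pwm, seq, pos):
    """Best score among forward-strand windows that cover position pos."""
    width = len(pwm)
    if len(seq) < width:
        raise ValueError("sequence is shorter than the motif")
    first = max(0, pos - width + 1)
    last = min(pos, len(seq) - width)
    return max(window_score(pwm, seq[s:s + width]) for s in range(first, last + 1))


def binding_change(pwm, ref_seq, pos, alt_base):
    """Predicted score change (alt minus ref) for a SNP at 0-based pos."""
    alt_seq = ref_seq[:pos] + alt_base + ref_seq[pos + 1:]
    return best_score_overlapping(pwm, alt_seq, pos) - best_score_overlapping(pwm, ref_seq, pos)


# Toy SNP: G>T at index 3 disrupts the AGGA match in the reference.
delta = binding_change(TOY_PWM, "TTAGGATT", 3, "T")
print(delta)
```

A negative value indicates that the alternate allele is predicted to weaken binding under this toy model. SNPs with the largest absolute changes would be the natural candidates for a "binding-change" set.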
