Abstract
The vast majority of the genetic variants (mainly SNPs) associated with human traits and diseases map to the noncoding part of the genome and are enriched in its regulatory compartment, suggesting that many causal variants may affect gene expression. The leading mechanism of action of these SNPs is altered transcription factor binding, through the creation or disruption of transcription factor binding sites (TFBSs) or a change in the affinity of these regulatory proteins for their cognate sites. In this review, we first focus on the history of the discovery of regulatory SNPs (rSNPs) and give a systematized description of the existing methodological approaches to their study. Then, we briefly review recent comprehensive examples of rSNPs that have been traced from the discovery of a change in the TFBS sequence caused by a nucleotide substitution, to the identification of its effect on target gene expression and, eventually, on the phenotype. We also describe state-of-the-art genome-wide approaches to the identification of regulatory variants, including both approaches that make molecular sense of genome-wide association studies (GWAS) and alternative approaches whose primary goal is to determine the functionality of genetic variants. Among these approaches, special attention is paid to expression quantitative trait locus (eQTL) analysis and the search for allele-specific events in RNA-seq data (allele-specific expression, ASE) as well as in ChIP-seq, DNase-seq, and ATAC-seq data (allele-specific binding, ASB).
Highlights
A central goal of human genetics is to understand how genetic variation leads to phenotypic differences and complex diseases
The corresponding information encoded in regulatory DNA is actuated via the combinatorial binding of sequence-specific transcription factors (TFs) to regulatory regions
Although many SNPs with such properties have been discovered so far, their mass search in genomes remains challenging. This is mainly due to the tissue, developmental, and environmental specificity of the effects of regulatory SNPs (rSNPs), which is a direct consequence of the corresponding specificity of the cis-regulatory elements that harbor them [32,45,124]
Summary
A central goal of human genetics is to understand how genetic variation leads to phenotypic differences and complex diseases. The regulatory regions of the genome represent clusters of binding sites for sequence-specific transcription factors (TFs). By binding to their specific sites on DNA (transcription factor binding sites, TFBSs), TFs directly interpret the regulatory part of the genome, performing the first step in deciphering the DNA sequence [13,14,15]. Regulatory SNPs (rSNPs), that is, genetic variants within TFBSs that alter expression, play a central role in the phenotypic variation in complex traits, including the risk of developing a disease. Expression quantitative trait locus (eQTL) mapping and the identification of allele-specific expression (ASE) events through analysis of RNA-seq data (the largest available genome-wide dataset) are the major methods for finding such variants. We briefly review the history of rSNP discovery, systematize and discuss the methods used in studies of individual rSNPs, illustrate the narrative with case studies of several of the best-characterized rSNPs associated with different pathologies, and summarize recently published data on genome-wide approaches to the discovery and study of rSNPs.
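To make the notion of an allele-specific event more concrete, the following is a minimal sketch of how allelic imbalance at a single heterozygous SNP could be tested from read counts. It is an illustration under stated assumptions, not the method of any particular study discussed in the review: the function names (`binom_two_sided_p`, `allelic_imbalance`), the null hypothesis of an equal 50:50 allelic ratio, and the 0.05 threshold are all hypothetical choices made for the example. Real pipelines typically also have to handle reference mapping bias, overdispersion, and multiple testing, none of which is modeled here.

```python
from math import comb


def binom_two_sided_p(k: int, n: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value.

    Sums the probabilities of all outcomes that are no more likely
    than the observed count k. Intended for modest read depths only
    (floats under/overflow for n in the range of about a thousand).
    """
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = pmf[k]
    # Small relative tolerance so that ties are not lost to rounding.
    total = sum(q for q in pmf if q <= observed * (1 + 1e-9))
    return min(1.0, total)


def allelic_imbalance(ref_reads: int, alt_reads: int, alpha: float = 0.05) -> dict:
    """Test one heterozygous SNP for allelic imbalance.

    Assumption (hypothetical, for illustration): under the null,
    reads are drawn from the two alleles with equal probability.
    """
    n = ref_reads + alt_reads
    if n == 0:
        raise ValueError("no reads cover this SNP")
    p_value = binom_two_sided_p(ref_reads, n)
    return {
        "ref_fraction": ref_reads / n,
        "p_value": p_value,
        "imbalanced": p_value < alpha,
    }


if __name__ == "__main__":
    # Hypothetical counts, not data from the review.
    print(allelic_imbalance(ref_reads=32, alt_reads=8))
```

The same test applies in principle to RNA-seq reads (ASE) and to ChIP-seq, DNase-seq, or ATAC-seq reads (ASB); only the source of the counts differs.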