Abstract

The vast majority of the genetic variants (mainly SNPs) associated with human traits and diseases map to the noncoding part of the genome and are enriched in its regulatory compartment, suggesting that many causal variants may affect gene expression. The leading mechanism of action of these SNPs is the alteration of transcription factor binding, via the creation or disruption of transcription factor binding sites (TFBSs) or a change in the affinity of these regulatory proteins for their cognate sites. In this review, we first focus on the history of the discovery of regulatory SNPs (rSNPs) and give a systematized description of the existing methodological approaches to their study. We then briefly review recent comprehensive examples of rSNPs, studied from the discovery of changes in the TFBS sequence caused by a nucleotide substitution, through identification of its effect on target gene expression, and eventually to phenotype. We also describe state-of-the-art genome-wide approaches to the identification of regulatory variants, including both making molecular sense of genome-wide association studies (GWAS) and alternative approaches whose primary goal is to determine the functionality of genetic variants. Among these approaches, special attention is paid to expression quantitative trait loci (eQTL) analysis and the search for allele-specific events in RNA-seq (ASE events) as well as in ChIP-seq, DNase-seq, and ATAC-seq (ASB events) data.

Highlights

  • A central goal of human genetics is to understand how genetic variation leads to phenotypic differences and complex diseases

  • The corresponding information encoded in regulatory DNA is actuated via the combinatorial binding of sequence-specific transcription factors (TFs) to regulatory regions

  • Although many SNPs with such properties have been discovered so far, their mass search in genomes remains challenging. This is mainly due to the tissue, developmental, and environmental specificity of the effects of regulatory SNPs (rSNPs), which is a direct consequence of the corresponding specificity of the cis-regulatory elements that harbor them [32,45,124]

Summary

Introduction

A central goal of human genetics is to understand how genetic variation leads to phenotypic differences and complex diseases. The regulatory regions of the genome represent clusters of binding sites for sequence-specific transcription factors (TFs). By binding to their specific sites on DNA (transcription factor binding sites, TFBSs), TFs directly interpret the regulatory part of the genome, performing the first step in deciphering the DNA sequence [13,14,15]. Regulatory SNPs (rSNPs), that is, genetic variants within TFBSs that alter expression, play a central role in the phenotypic variation in complex traits, including the risk of developing a disease. Expression quantitative trait locus (eQTL) mapping and the identification of allele-specific expression (ASE) events through analysis of RNA-seq data (the largest available genome-wide dataset) are the major relevant methods. We briefly review the history of rSNP discovery, systematize and discuss the methods used in studies of individual rSNPs, illustrate the narrative with case studies of several of the best-characterized rSNPs associated with different pathologies, and summarize recently published data on genome-wide approaches to the discovery and study of rSNPs.
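
As a rough illustration of what identifying an ASE event can involve, one simple approach is to count RNA-seq reads carrying each allele at a heterozygous SNP and test whether the counts deviate from the balanced 1:1 ratio expected without allelic imbalance. The sketch below uses an exact two-sided binomial test; it is an assumption made for illustration, not necessarily the procedure used in the studies reviewed here, and the function name, parameters, and read counts are all hypothetical.

```python
from math import comb


def binomial_two_sided_p(ref_reads: int, alt_reads: int, p_ref: float = 0.5) -> float:
    """Exact two-sided binomial p-value for allelic imbalance at one SNP.

    ref_reads / alt_reads are hypothetical counts of RNA-seq reads carrying
    the reference and alternative allele; p_ref is the expected fraction of
    reference reads when expression is balanced (0.5 by assumption).
    """
    n = ref_reads + alt_reads
    if n == 0:
        return 1.0  # no reads cover the SNP, so there is no evidence either way
    # Probability of every possible reference-read count under the null.
    pmf = [comb(n, k) * p_ref**k * (1 - p_ref) ** (n - k) for k in range(n + 1)]
    observed = pmf[ref_reads]
    # Sum the probabilities of all outcomes no more likely than the observed
    # one; the small tolerance guards against floating-point ties.
    return min(1.0, sum(p for p in pmf if p <= observed * (1 + 1e-9)))


# Hypothetical example: 30 reference reads versus 10 alternative reads.
print(binomial_two_sided_p(30, 10))
```

In practice such a test would be applied to many SNPs at once, so the resulting p-values would need multiple-testing correction, and a reference-allele mapping bias may justify a value of `p_ref` other than 0.5.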

Brief History of rSNP Discovery
Modern Array of Methods for Studying Individual rSNPs
Method
Recent Comprehensive Examples
Making Molecular Sense of GWAS
Findings
Conclusions