Abstract

Background Recent studies suggest that non-coding variants may play an important role in disease etiology, including in psychiatric disorders. However, prioritizing these less-understood variants and selecting the right candidates for further investigation remains a central challenge in the field. Current tools for annotating non-coding genetic variants provide a general indicator of deleteriousness but lack, for example, the tissue-specific context that could better illuminate their role in a particular disease.
Methods In this work, we propose a new machine learning-based approach that relies on tissue-specific data to estimate variant impact on brain tissues. By integrating information from several genome-scale databases, including GTEx and RoadMap Epigenomics, we derive tissue-related features. Using this data representation, we train a random forest model to discriminate variants with prior evidence of brain relevance from variants unlikely to affect the brain. The resulting model predictions, which we call the Brain Relevance Score (BRS), estimate how related a genomic position is to the brain.
Results After computing BRS for every nucleotide position in the human genome, we validate it on genomic regions known to be related to psychiatric disorders, such as the 16p11.2 region. In this region, we identify a short list of candidate variants close to known genes involved in brain disease and mental disorders. We then use BRS as a filter in combination with a state-of-the-art deleteriousness score (e.g., CADD) and report higher sensitivity in detecting brain-related damaging variants in the Simons Simplex Collection data for autism spectrum disorder, compared with the 1000 Genomes control data set.
Discussion Although we report high performance in detecting both coding and non-coding variants related to the human brain, ongoing work in our lab involves benchmarking more competitive learning approaches and integrating additional brain databases into the model. The learning framework demonstrated here could readily be applied to other psychiatric disorders.
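
The filtering step described in the Results can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the `Variant` record, the `filter_brain_damaging` helper, the two cutoffs, the idea that BRS lies in [0, 1], and the placeholder coordinates and scores. It only shows the general idea of keeping variants that pass both a brain-relevance filter and a deleteriousness cutoff.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    chrom: str
    pos: int
    brs: float   # Brain Relevance Score; assumed here to lie in [0, 1]
    cadd: float  # deleteriousness score (e.g., CADD); scale assumed for illustration


def filter_brain_damaging(variants, brs_min=0.5, cadd_min=15.0):
    """Keep variants passing both the BRS filter and the deleteriousness cutoff.

    Both thresholds are hypothetical defaults, not values reported by the authors.
    """
    return [v for v in variants if v.brs >= brs_min and v.cadd >= cadd_min]


# Made-up variants with placeholder coordinates and scores (illustration only).
variants = [
    Variant("chr16", 29_700_000, brs=0.91, cadd=22.4),  # passes both cutoffs
    Variant("chr16", 29_850_000, brs=0.12, cadd=25.0),  # damaging, low brain relevance
    Variant("chr16", 30_100_000, brs=0.77, cadd=3.1),   # brain-relevant, likely benign
]

selected = filter_brain_damaging(variants)
print([(v.chrom, v.pos) for v in selected])
```

In this sketch, BRS acts as a tissue-specific gate applied alongside a general deleteriousness score; in practice the cutoffs would need to be chosen on validation data.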

