In Silico Gene Prioritization by Integrating Multiple Data Sources

Yixuan Chen, Robert Shields, Jing Li, Yingyao Zhou, Robert C Elston, Sumit K Chanda, Wenhui Wang, Mike B Gravenor

doi:10.1371/journal.pone.0021137

Abstract

Identifying disease genes is crucial to understanding disease pathogenesis and to improving disease diagnosis and treatment. In recent years, many researchers have proposed approaches that prioritize candidate genes by considering the relationships between candidate genes and known disease genes, as reflected in other data sources. In this paper, we propose an expandable framework for gene prioritization that can integrate multiple heterogeneous data sources by taking advantage of a unified graphic representation. Gene-gene relationships and gene-disease relationships are then defined based on the overall topology of each network using a diffusion kernel measure. These relationship measures are in turn normalized to derive an overall measure across all networks, which is used to rank all candidate genes. Based on the informativeness of the available data sources with respect to each specific disease, we also propose an adaptive threshold score to select a small subset of candidate genes for further validation studies. We performed a large-scale cross-validation analysis on 110 disease families using three data sources. The results show that our approach consistently outperforms two other state-of-the-art programs. A case study using Parkinson disease (PD) identified four candidate genes (UBB, SEPT5, GPR37 and TH) that ranked higher than our adaptive threshold, all of which are involved in the PD pathway. In particular, a very recent study observed a deletion of TH in a patient with PD, which supports the importance of the TH gene in PD pathogenesis. A web tool has been implemented to assist scientists in their genetic studies.
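
The ranking idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the value of the diffusion parameter `beta`, the z-score normalization, and the summing of normalized scores across networks are all assumptions made for illustration, since this excerpt does not give the exact formulas. The sketch computes a diffusion kernel K = exp(-beta * L) from each network's graph Laplacian L, scores every candidate by its kernel similarity to the known disease genes, normalizes the scores within each network, and adds them up.

```python
import math


def laplacian(n, edges):
    """Graph Laplacian L = D - A of an undirected, unweighted graph on n nodes."""
    lap = [[0.0] * n for _ in range(n)]
    for i, j in edges:
        lap[i][j] -= 1.0
        lap[j][i] -= 1.0
        lap[i][i] += 1.0
        lap[j][j] += 1.0
    return lap


def diffusion_kernel(lap, beta=0.5, terms=40):
    """Approximate K = exp(-beta * L) with a truncated Taylor series.

    Adequate for small toy graphs only; a real network would call for a
    dedicated matrix-exponential routine.
    """
    n = len(lap)
    kernel = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in kernel]  # holds (-beta * L)^k / k!
    for k in range(1, terms + 1):
        term = [
            [sum(term[i][m] * lap[m][j] for m in range(n)) * (-beta / k)
             for j in range(n)]
            for i in range(n)
        ]
        for i in range(n):
            for j in range(n):
                kernel[i][j] += term[i][j]
    return kernel


def zscores(values):
    """Standardize scores so that different networks are comparable (assumed scheme)."""
    mean = sum(values) / len(values)
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [(v - mean) / sd if sd > 0 else 0.0 for v in values]


def rank_candidates(n, networks, seeds, candidates, beta=0.5):
    """Rank candidates by summed, normalized kernel similarity to the seed genes."""
    combined = {c: 0.0 for c in candidates}
    for edges in networks:
        kernel = diffusion_kernel(laplacian(n, edges), beta)
        raw = [sum(kernel[c][s] for s in seeds) for c in candidates]
        for c, z in zip(candidates, zscores(raw)):
            combined[c] += z
    return sorted(candidates, key=lambda c: combined[c], reverse=True)


# Toy example: genes 0..4 form a chain, and gene 0 is the known disease gene.
chain = [(0, 1), (1, 2), (2, 3), (3, 4)]
print(rank_candidates(5, [chain], seeds=[0], candidates=[4, 3, 1]))
```

In the toy chain, candidates that sit closer to the known disease gene receive higher scores. Because the kernel depends on all paths between two nodes rather than only on direct neighbors, the score reflects the overall topology of the network, which is the property the abstract emphasizes.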

Highlights

Dissecting genetic architectures of human diseases is a fundamental task in human genetics and has profound implications in biomedical research.
We identified four candidate genes (UBB, septin 5 (SEPT5), G protein-coupled receptor 37 (GPR37) and tyrosine hydroxylase (TH)) that ranked higher than our adaptive threshold, all of which are involved in the Parkinson disease (PD) pathway.
We have proposed a candidate gene prioritization approach that can integrate multiple data sources by taking advantage of a unified graphic representation of information.

Summary

Introduction

Dissecting genetic architectures of human diseases is a fundamental task in human genetics and has profound implications in biomedical research. Great challenges exist because many common diseases are caused by multiple disease genes with small to moderate effects. Even diseases that show Mendelian inheritance may involve multiple genes due to heterogeneity. Researchers have increasingly realized that there are many levels of control along the paths from genotypes to phenotypes, resulting in a weaker relationship between genotypes and phenotypes [1] that may or may not be captured using traditional linkage or association approaches. Linkage analysis can usually identify only chromosomal intervals that may contain up to hundreds of candidate genes, owing to the limited number of crossovers in sampled families. Genome-wide association studies may return many regions that show moderate to high signals. Experimental validation of so many candidate genes is usually beyond the ability of individual researchers, owing to prohibitively high costs in terms of both funding and time.

Journal: PLoS ONE
Publication Date: Jun 24, 2011
Citations: 96
License type: CC BY 4.0
