Abstract

Many cellular activities are organized as networks, and genes cluster into co-expressed groups when they share the same or closely related biological functions or are co-regulated. In this study, we assumed that a strong candidate disease gene is more likely to lie close to gene groups whose members are all coordinately differentially expressed than to individual differentially expressed genes. Based on this assumption, we developed a novel disease gene prioritization method, GroupRank, which integrates gene co-expression and differential expression information derived from microarray data with a PPI network. GroupRank ranks a candidate gene highly if it is differentially expressed between disease and control samples or lies close to differentially co-expressed groups in the PPI network. We tested our method on data sets of lung, kidney, leukemia and breast cancer. The results showed that GroupRank prioritizes disease genes efficiently, with a significantly improved AUC value compared with the previous method, which does not consider co-expressed gene groups in the PPI network. Moreover, functional analyses of the major contributing gene group in the prioritization of kidney cancer genes verified that GroupRank not only ranks disease genes efficiently but can also help identify and understand possible mechanisms in important physiological and pathological processes of disease.

Highlights

  • Detecting associations between diseases and genes remains a major challenge, although many candidate disease genes have been reported through genetic studies such as linkage analysis [1] and association studies [2]

  • Among the available data sources, rapidly accumulating protein-protein interaction (PPI) data are a valuable resource for gene prioritization, because genes related to a specific biological function or a similar disease phenotype tend to be highly connected in the PPI network [5]

  • To prioritize disease genes more precisely and robustly, we proposed a new algorithm, GroupRank, which ranks disease genes by integrating the PPI network with gene groups clustered by coordinated differential expression

Summary

Introduction

Detecting associations between diseases and genes remains a major challenge, although many candidate disease genes have been reported through genetic studies such as linkage analysis [1] and association studies [2]. Endeavour is a well-developed tool that ranks candidates against the profile of a training set of genes known to be involved in a biological process or disease of interest, combining 20 data sources such as functional annotations, expression data, regulatory information, literature, pathways, interactions, sequence, and disease probabilities [3,4]. Using interactions compiled from HPRD [9], BIND [10], BioGrid [11], IntAct [12] and DIP [13], GeneWanderer ranks candidate genes with a global network distance measure, applying random walk analysis to define similarity to known disease genes in protein-protein interaction networks.
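
The network-based ranking idea described above can be illustrated with a random walk with restart on a toy interaction network. The following is a minimal sketch only, not the implementation of GeneWanderer or GroupRank: the gene names, the four-node network, and the restart probability of 0.75 are all assumptions chosen for illustration. Each step moves the walker to a random neighbour with probability 1 - r and returns it to a known disease gene with probability r; candidates are then ranked by their steady-state visiting probability.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.75, tol=1e-8, max_iter=1000):
    """Return steady-state visiting probabilities for every node.

    adjacency: dict mapping node -> set of neighbours (undirected graph,
               assumed here to contain no isolated nodes)
    seeds:     known disease genes; the walk restarts uniformly from these
    restart:   probability of jumping back to a seed at each step (assumed value)
    """
    nodes = sorted(adjacency)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # Restart term: a fraction of the mass always returns to the seeds.
        nxt = {n: restart * p0[n] for n in nodes}
        # Walk term: each node spreads its remaining mass evenly to neighbours.
        for n in nodes:
            degree = len(adjacency[n])
            if degree == 0:
                continue
            share = (1.0 - restart) * p[n] / degree
            for m in adjacency[n]:
                nxt[m] += share
        change = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if change < tol:
            break
    return p


# Hypothetical toy network: a chain SEED - G1 - G2 - G3.
toy_ppi = {
    "SEED": {"G1"},
    "G1": {"SEED", "G2"},
    "G2": {"G1", "G3"},
    "G3": {"G2"},
}
known_disease_genes = {"SEED"}

scores = random_walk_with_restart(toy_ppi, known_disease_genes)
candidates = [g for g in toy_ppi if g not in known_disease_genes]
ranking = sorted(candidates, key=lambda g: scores[g], reverse=True)
print(ranking)
```

In this toy chain, candidates closer to the known disease gene receive higher scores, which is the sense in which a global network distance measure ranks candidates by proximity to known disease genes.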
