FocusHeuristics – expression-data-driven network optimization and disease gene prediction

Mathias Ernst, Mohamed Hamed, Yang Du, Sina Sender, Hugo Murua Escobar, Stephan Struckmann, Karlhans Endlich, Steffen Möller, Christian Junghanß, Georg Fuellen, Gregor Warsow, Lisa-Madeleine Sklarz, Nicole Endlich

doi:10.1038/srep42638

Open Access

https://doi.org/10.1038/srep42638

Journal: Scientific Reports
Publication Date: Feb 16, 2017
Citations: 17
License type: open-access

Affiliation: University of Rostock, University of Greifswald

Abstract

Identifying genes that contribute to disease phenotypes remains a challenge for bioinformatics. Static knowledge of biological networks is often combined with the dynamics observed in gene expression levels during disease development to find markers for diagnostics and therapy, as well as putative disease-modulatory drug targets and drugs. Current methods range from those focused on expression levels (Limma) to those concentrating on network characteristics (PageRank, HITS/Authority Score) and those using both (DeMAND, Local Radiality). We present an integrative approach (the FocusHeuristics) that is thoroughly evaluated using public expression data and molecular disease characteristics provided by DisGeNet. The FocusHeuristics combines three scores: the log fold change and two further scores based on the sum and difference of the log fold changes of genes/proteins linked in a network. A gene is kept when one of the scores to which it contributes is above a threshold. The FocusHeuristics is both a predictor of gene-disease association and a bioinformatics method for reducing biological networks to their disease-relevant parts by highlighting the dynamics observed in expression data. Measured by AUC, the FocusHeuristics is slightly but significantly better than other methods at identifying disease-associated genes, and it delivers mechanistic explanations for its choice of genes.
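The filtering rule described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract only states that two link scores are based on the sum and the difference of log fold changes, so the use of absolute values, the function name `focus_sketch`, the threshold values, and the toy data below are all assumptions made for illustration.

```python
# Sketch only: score definitions and thresholds are assumptions, not the paper's exact formulas.

def focus_sketch(lfc, links, t_lfc=1.0, t_sum=1.5, t_diff=1.5):
    """Keep genes and links whose scores exceed (hypothetical) thresholds.

    lfc   -- dict mapping gene name to log fold change between two conditions
    links -- iterable of (gene_a, gene_b) pairs from an interaction network
    """
    # Score 1: the log fold change of the gene itself.
    kept_genes = {g for g, value in lfc.items() if abs(value) > t_lfc}
    kept_links = []
    for a, b in links:
        sum_score = abs(lfc[a] + lfc[b])   # assumed form of the sum-based link score
        diff_score = abs(lfc[a] - lfc[b])  # assumed form of the difference-based link score
        if sum_score > t_sum or diff_score > t_diff:
            kept_links.append((a, b))
            # A gene is kept if any score it contributes to passes a threshold.
            kept_genes.update((a, b))
    return kept_genes, kept_links


# Toy data (invented): A and B change moderately in the same direction, C strongly in the other.
toy_lfc = {"A": 0.9, "B": 0.8, "C": -1.4, "D": 0.1, "E": 0.2}
toy_links = [("A", "B"), ("B", "C"), ("D", "E"), ("A", "D")]
genes, reduced_network = focus_sketch(toy_lfc, toy_links)
print(sorted(genes), reduced_network)
```

In this toy run, A and B fall below the single-gene threshold but are retained through the link they share, which is the kind of case a purely expression-level filter would miss. The retained links form the reduced network.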

Highlights

This paper compares the above-mentioned methods and further includes Limma[8] as a representative of a solely expression-level-based approach
Our FocusHeuristics employs three different scores using the differential expression data: the log fold change (LFC) scores the difference in gene expression between the two conditions, the differential link score (LSd)[6] scores links with different activity in the two conditions, and the interaction link score (LSi) scores links that are highly active in both conditions
For the specific case of acute megakaryoblastic leukemia (AMKL, also known as AML-M7), a rare subtype of acute leukemia associated with poor outcome and prognosis, we found that our method outperforms the other methods in terms of the area under the ROC curve (see Fig. 4)
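As a reminder of what the area under the ROC curve measures, the following sketch computes it from its rank interpretation: the probability that a randomly chosen disease-associated gene receives a higher score than a randomly chosen non-associated gene, with ties counted as one half. The function name `roc_auc`, the scores, and the labels are invented for illustration and do not reproduce the paper's evaluation.

```python
# Sketch only: toy scores and labels, not data from the paper.

def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Hypothetical gene scores and disease-association labels (1 = associated).
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.1]
toy_labels = [1, 0, 1, 0, 0]
auc = roc_auc(toy_scores, toy_labels)
print(round(auc, 3))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so small differences between methods are only meaningful when tested for significance.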

Summary

Introduction

This paper compares the above-mentioned methods and further includes Limma[8] as a representative of a solely expression-level-based approach. We also add some well-known network reduction methods that are based on topology measures for single genes, such as the node degree (number of interacting partners), the clustering coefficient (for a given node, the number of edges between its neighbours divided by the number of all such edges possible) and the betweenness (relative number of occurrences on shortest paths). The former two have recently been suggested as a source of disease gene markers[9]. We demonstrate the competitiveness of the FocusHeuristics and discuss the different contributions of gene expression and network data to predicting gene-disease association in general.
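The three topology measures defined above can be made concrete with a small, dependency-free sketch. The helper names and the four-node toy network are hypothetical, and betweenness is computed with Brandes' algorithm for unweighted, undirected graphs and normalised by the number of node pairs. This is one common convention, not necessarily the one used in the paper.

```python
# Sketch only: toy network and helper names are illustrative assumptions.
from collections import deque


def build_adjacency(edges):
    """Undirected adjacency sets from a list of (a, b) edges."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def clustering_coefficient(adj, node):
    """Edges among the node's neighbours divided by the number of possible such edges."""
    nbrs = list(adj[node])
    k = len(nbrs)
    if k < 2:
        return 0.0
    present = sum(1 for i in range(k) for j in range(i + 1, k) if nbrs[j] in adj[nbrs[i]])
    return present / (k * (k - 1) / 2)


def betweenness(adj):
    """Normalised betweenness centrality (Brandes' algorithm, unweighted graph)."""
    raw = dict.fromkeys(adj, 0.0)
    for s in adj:
        order, preds = [], {v: [] for v in adj}
        sigma = dict.fromkeys(adj, 0.0)   # number of shortest paths from s
        dist = dict.fromkeys(adj, -1)
        sigma[s], dist[s] = 1.0, 0
        queue = deque([s])
        while queue:                       # breadth-first search from s
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(adj, 0.0)
        for w in reversed(order):          # accumulate dependencies back towards s
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                raw[w] += delta[w]
    n = len(adj)
    scale = (n - 1) * (n - 2)              # ordered pairs excluding the node itself
    return {v: (b / scale if scale > 0 else 0.0) for v, b in raw.items()}


# Toy network: a triangle A-B-C with D attached to C.
toy_adj = build_adjacency([("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")])
degree_c = len(toy_adj["C"])               # node degree = number of interacting partners
cc_c = clustering_coefficient(toy_adj, "C")
bt = betweenness(toy_adj)
print(degree_c, cc_c, bt["C"])
```

In this toy network, C has the highest degree and is the only node lying on shortest paths between other nodes, while its clustering coefficient is lower than that of A or B because its neighbour D is not connected to the others.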

Objectives

Methods

Results

Conclusion
