Abstract

For many complex diseases, an earlier and more reliable diagnosis is considered a key prerequisite for developing more effective therapies to prevent or delay disease progression. Classical statistical learning approaches for specimen classification using omics data, however, often cannot provide diagnostic models with sufficient accuracy and robustness for heterogeneous diseases like cancers or neurodegenerative disorders. In recent years, new approaches for building multivariate biomarker models on omics data have been proposed, which exploit prior biological knowledge from molecular networks and cellular pathways to address these limitations. This survey provides an overview of these recent developments and compares pathway- and network-based specimen classification approaches in terms of their utility for improving model robustness, accuracy and biological interpretability. Different routes to translate omics-based multifactorial biomarker models into clinical diagnostic tests are discussed, and a previous study is presented as an example.

Highlights

  • In spite of the remarkable advances in biomedicine over recent decades, for a wide range of common, systemic and chronic diseases, precise molecular markers for early diagnosis are not yet available

  • Building multivariate biomarker models derived from high-throughput omics measurements, e.g. using DNA or protein

  • Normalized gene expression data is mapped onto a protein interaction network and discriminative subnetworks are identified via a greedy search procedure


Summary

Introduction

In spite of the remarkable advances in biomedicine over recent decades, for a wide range of common, systemic and chronic diseases, precise molecular markers for early diagnosis are not yet available. One of the first machine learning approaches for high-throughput data analysis guided by pathway knowledge was proposed by Guo et al. [27]. Their method classified microarray cancer samples by computing mean or median expression levels of the gene members in biological process modules from the Gene Ontology (GO) database as input for a decision tree classifier [36]. To extract discriminative features for diagnostic specimen classification from these networks, they proposed a link-based classification approach, comparing the activity status of gene regulatory interactions (called ‘links’) across different sample groups, and a degree-based classification method, comparing topological centrality measures [51] for the networks. When testing these approaches on data from different cancer case-control studies, high cross-validated accuracies were reported for both cell type and patient sample classification. Instead of constructing new regulatory networks, discriminative disease-associated network alterations can be identified by computationally mapping omics data onto in silico
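The pathway-level summarisation attributed above to Guo et al. [27] can be illustrated with a minimal sketch. It shows only the feature-construction step: gene-level expression values are collapsed into one mean or median value per gene set. The function name `pathway_activity`, the gene identifiers, the gene-set names and the expression values are all hypothetical and chosen for illustration. The handling of unmeasured genes is an assumption, and the decision tree classifier itself is not shown.

```python
from statistics import mean, median


def pathway_activity(expression, pathways, summary=mean):
    """Collapse gene-level expression into one value per pathway for a sample.

    expression: dict mapping gene identifier -> expression value (one sample)
    pathways:   dict mapping pathway name -> iterable of member genes
    summary:    aggregation function, e.g. statistics.mean or statistics.median

    Assumption (not from the source): genes missing from the sample are
    skipped, and pathways with no measured members are left out.
    """
    features = {}
    for name, genes in pathways.items():
        values = [expression[g] for g in genes if g in expression]
        if values:
            features[name] = summary(values)
    return features


# Hypothetical toy gene sets and samples, for illustration only.
pathways = {
    "cell_cycle": ["G1", "G2", "G3"],
    "apoptosis": ["G4", "G5"],
}
samples = {
    "tumour_1": {"G1": 8.0, "G2": 9.0, "G3": 10.0, "G4": 2.0, "G5": 4.0},
    "control_1": {"G1": 4.0, "G2": 5.0, "G3": 6.0, "G4": 5.0, "G5": 7.0},
}

# One row of pathway-level features per sample.
feature_table = {
    sample_id: pathway_activity(expr, pathways)
    for sample_id, expr in samples.items()
}
```

The resulting `feature_table` has one value per pathway and sample, so the number of input features equals the number of gene sets rather than the number of genes. Passing `summary=median` instead of the default gives the median-based variant mentioned in the text.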

