Knowledge-guided multi-scale independent component analysis for biomarker identification.

Li Chen, Zhen Zhang, Ie-Ming Shih, Jianhua Xuan, Robert Clarke, Chen Wang, Yue Wang, Eric Hoffman

doi:10.1186/1471-2105-9-416

https://doi.org/10.1186/1471-2105-9-416


Abstract

Background

Many statistical methods have been proposed to identify disease biomarkers from gene expression profiles. However, from gene expression profile data alone, statistical methods often fail to identify biologically meaningful biomarkers related to a specific disease under study. In this paper, we develop a novel strategy, namely knowledge-guided multi-scale independent component analysis (ICA), to first infer regulatory signals and then identify biologically relevant biomarkers from microarray data.

Results

Since gene expression levels reflect the joint effect of several underlying biological functions, disease-specific biomarkers may be involved in several distinct biological functions. To identify disease-specific biomarkers that provide unique mechanistic insights, a meta-data "knowledge gene pool" (KGP) is first constructed from multiple data sources to provide important information on the likely functions (such as gene ontology information) and regulatory events (such as promoter responsive elements) associated with potential genes of interest. The gene expression and biological meta-data associated with the members of the KGP can then be used to guide subsequent analysis. ICA is then applied to multi-scale gene clusters to reveal regulatory modes reflecting the underlying biological mechanisms. Finally, disease-specific biomarkers are extracted by their weighted connectivity scores associated with the extracted regulatory modes. A statistical significance test is used to evaluate the significance of transcription factor enrichment for the extracted gene set based on motif information. We applied the proposed method to yeast cell cycle microarray data and Rsf-1-induced ovarian cancer microarray data. The results show that our knowledge-guided ICA approach can extract biologically meaningful regulatory modes and outperforms several baseline methods for biomarker identification.

Conclusion

We have proposed a novel method, namely knowledge-guided multi-scale ICA, to identify disease-specific biomarkers. The goal is to infer knowledge-relevant regulatory signals and then identify corresponding biomarkers through a multi-scale strategy. The approach has been successfully applied to two expression profiling experiments to demonstrate its improved performance in extracting biologically meaningful and disease-related biomarkers. More importantly, the proposed approach shows promise for inferring novel biomarkers for ovarian cancer and extending current knowledge.
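The abstract does not say which significance test is used for transcription factor enrichment. As an illustration only, the sketch below assumes a one-sided hypergeometric test, a common choice for asking whether a motif is over-represented in an extracted gene set. The function name `hypergeom_enrichment_pvalue` and the toy counts are hypothetical and are not taken from the paper.

```python
from math import comb


def hypergeom_enrichment_pvalue(population, annotated, selected, overlap):
    """One-sided hypergeometric tail P(X >= overlap).

    population: total number of genes considered (assumed background)
    annotated:  genes in the background carrying the motif
    selected:   size of the extracted gene set
    overlap:    genes in the extracted set that carry the motif
    """
    upper = min(annotated, selected)
    total = comb(population, selected)
    # Sum the probability mass from the observed overlap up to the maximum.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Toy counts (hypothetical): 20 genes, 5 with the motif, 5 selected, 3 overlap.
p_value = hypergeom_enrichment_pvalue(20, 5, 5, 3)
print(round(p_value, 4))  # about 0.0726
```

A small p-value would suggest that the motif occurs in the extracted set more often than expected by chance. With many transcription factors tested, a multiple-testing correction would normally be applied as well.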

Highlights

Many statistical methods have been proposed to identify disease biomarkers from gene expression profiles
The yeast cell cycle data set consists of the expression of 6178 Open Reading Frames (ORFs) during the cell replication cycle in the budding yeast (Saccharomyces cerevisiae)
Since none of the genes are in the Knowledge gene pool (KGP), they were entered into an Ingenuity Pathways Analysis (IPA) where we found that all of these genes can be incorporated into a single hypothetical network (Fig. 15)

Summary

Introduction

Many statistical methods have been proposed to identify disease biomarkers from gene expression profiles. We develop a novel strategy, namely knowledge-guided multi-scale independent component analysis (ICA), to first infer regulatory signals and then identify biologically relevant biomarkers from microarray data. Under their broadest definition, biomarkers include any biological or chemical indicator of a specific underlying process. Conesa et al. proposed a two-step regression approach to sequentially identify differentially expressed genes from time-course microarray data under different conditions [4]. These and many related approaches do not incorporate knowledge of gene function, with respect to the phenotypes of interest, into their statistical models.
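To make the ICA step concrete, the following toy sketch separates two mixed signals by whitening the data and then searching for the rotation that maximizes squared excess kurtosis. It is a generic two-signal ICA illustration under stated assumptions, not the knowledge-guided multi-scale procedure of the paper. The helper names (`whiten_2d`, `ica_2d`), the synthetic uniform sources, and the mixing weights are all hypothetical.

```python
import math
import random


def excess_kurtosis(values):
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    fourth = sum((v - mean) ** 4 for v in values) / n
    return fourth / (var * var) - 3.0


def whiten_2d(x1, x2):
    """Center two signals and rescale them to identity sample covariance."""
    n = len(x1)
    m1, m2 = sum(x1) / n, sum(x2) / n
    c1 = [v - m1 for v in x1]
    c2 = [v - m2 for v in x2]
    a = sum(u * u for u in c1) / n
    c = sum(v * v for v in c2) / n
    b = sum(u * v for u, v in zip(c1, c2)) / n
    # Closed-form eigendecomposition of the 2x2 covariance [[a, b], [b, c]].
    phi = 0.5 * math.atan2(2.0 * b, a - c)
    half_gap = math.hypot(0.5 * (a - c), b)
    lam1 = 0.5 * (a + c) + half_gap
    lam2 = 0.5 * (a + c) - half_gap
    cp, sp = math.cos(phi), math.sin(phi)
    z1 = [(cp * u + sp * v) / math.sqrt(lam1) for u, v in zip(c1, c2)]
    z2 = [(-sp * u + cp * v) / math.sqrt(lam2) for u, v in zip(c1, c2)]
    return z1, z2


def rotate(z1, z2, theta):
    ct, st = math.cos(theta), math.sin(theta)
    y1 = [ct * u + st * v for u, v in zip(z1, z2)]
    y2 = [-st * u + ct * v for u, v in zip(z1, z2)]
    return y1, y2


def ica_2d(x1, x2, steps=90):
    """Grid-search the rotation of whitened data that is most non-Gaussian."""
    z1, z2 = whiten_2d(x1, x2)
    best = None
    for i in range(steps):
        theta = (math.pi / 2) * i / steps
        y1, y2 = rotate(z1, z2, theta)
        score = excess_kurtosis(y1) ** 2 + excess_kurtosis(y2) ** 2
        if best is None or score > best[0]:
            best = (score, y1, y2)
    return best[1], best[2]


def correlation(u, v):
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv)


# Synthetic example: two hidden "regulatory signals" over 1000 genes,
# observed only as two linear mixtures (all values are made up).
rng = random.Random(0)
n_genes = 1000
s1 = [rng.uniform(-1, 1) for _ in range(n_genes)]
s2 = [rng.uniform(-1, 1) for _ in range(n_genes)]
x1 = [1.0 * a + 0.5 * b for a, b in zip(s1, s2)]
x2 = [0.3 * a + 1.0 * b for a, b in zip(s1, s2)]
y1, y2 = ica_2d(x1, x2)
```

ICA recovers sources only up to sign, scale, and ordering, so recovered components are best compared with the hidden signals through absolute correlation, for example `abs(correlation(y1, s1))`. Real expression data would involve many more components and a library implementation rather than a grid search.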

Journal: BMC Bioinformatics	Publication Date: Oct 6, 2008
Citations: 24	License type: cc-by
