Abstract

Background: Biomarker discovery holds promise for advancing personalized medicine, as biomarkers can help match patients to optimal treatments and improve patient outcomes. However, serious concerns have been raised because very few molecular biomarkers or signatures discovered from high-dimensional array data can be successfully validated and applied to clinical use. We propose good practice guidelines as well as a novel tool for biomarker discovery, and use breast cancer prognosis as a case study to illustrate the proposed approach.

Results: We applied the proposed approach to a publicly available breast cancer prognosis dataset and identified small numbers of predictive markers for patient subpopulations stratified by clinical variables. Results from an independent cross-platform validation set show that our model compares favorably to other gene-signature-based and clinical-variable-based prognostic tools. About half of the discovered candidate markers can individually achieve very good performance, which further demonstrates the high quality of the feature selection. These candidate markers perform extremely well for young patients with estrogen receptor-positive, lymph node-negative early-stage breast cancers, suggesting that a distinct subset of these patients identified by these markers is actually at high risk of recurrence and may benefit from more aggressive treatment than current practice.

Conclusion: The results show that by following good practice guidelines, we can identify highly predictive genes in high-dimensional breast cancer array data. These predictive genes have been successfully validated using an independent cross-platform dataset.

Highlights

  • The goal of biomarker discovery from high-dimensional array data is to find an individual gene or a set of genes whose expression pattern can predict a certain phenotype or clinical outcome

  • As a case study, we use the van de Vijver data set [7] to discover prognostic biomarkers for various patient subsets stratified by clinical variables

  • The markers discovered from the lymph node negative patient cohort are subsequently evaluated using an independent cross-platform dataset: TRANSBIG [8]


Summary

Introduction

The goal of biomarker discovery from high-dimensional array data is to find an individual gene or a set of genes (or any other molecular variables) whose expression pattern can predict a certain phenotype or clinical outcome. During the past 15 years, numerous biomarkers and gene signatures have been published in the literature. Few of these biomarkers have been successfully validated and applied in a clinical setting, which has caused serious concern in the biomedical research community [1]. (1) Many published gene signatures cannot be validated independently. This is mainly due to flawed data analysis. Serious concerns have been raised because very few molecular biomarkers or signatures discovered from high-dimensional array data can be successfully validated and applied to clinical use. We propose good practice guidelines as well as a novel tool for biomarker discovery, and use breast cancer prognosis as a case study to illustrate the proposed approach.
