Abstract

Background: Validation of a novel gene expression signature in independent data sets is a critical step in the development of a clinically useful test for cancer patient risk-stratification. However, validation is often unconvincing because the size of the test set is typically small. To overcome this problem we used publicly available breast cancer gene expression data sets and a novel approach to data fusion, in order to validate a new breast tumor intrinsic list.

Results: A 105-tumor training set containing 26 sample pairs was used to derive a new breast tumor intrinsic gene list. This intrinsic list contained 1300 genes and a proliferation signature that was not present in previous breast intrinsic gene sets. We tested this list as a survival predictor on a data set of 311 tumors compiled from three independent microarray studies that were fused into a single data set using Distance Weighted Discrimination. When the new intrinsic gene set was used to hierarchically cluster this combined test set, tumors were grouped into LumA, LumB, Basal-like, HER2+/ER-, and Normal Breast-like tumor subtypes that we demonstrated in previous datasets. These subtypes were associated with significant differences in Relapse-Free and Overall Survival. Multivariate Cox analysis of the combined test set showed that the intrinsic subtype classifications added significant prognostic information that was independent of standard clinical predictors. From the combined test set, we developed an objective and unchanging classifier based upon five intrinsic subtype mean expression profiles (i.e. centroids), which is designed for single sample predictions (SSP). The SSP approach was applied to two additional independent data sets and consistently predicted survival in both systemically treated and untreated patient groups.

Conclusion: This study validates the "breast tumor intrinsic" subtype classification as an objective means of tumor classification that should be translated into a clinical assay for further retrospective and prospective validation. In addition, our method of combining existing data sets can be used to robustly validate the potential clinical value of any new gene expression profile.
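The centroid-based single sample predictor described in the abstract can be illustrated with a nearest-centroid sketch. This is a minimal illustration, not the authors' implementation: the abstract does not state which similarity measure the SSP uses, so Spearman rank correlation is an assumption here, and the function names, the four-gene profiles, and the two-subtype setup are hypothetical toy values (the published classifier uses five subtype centroids).

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den if den else 0.0


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks (assumed metric)."""
    return pearson(ranks(x), ranks(y))


def build_centroids(profiles, labels):
    """Average the training profiles of each subtype, gene by gene."""
    grouped = {}
    for profile, label in zip(profiles, labels):
        grouped.setdefault(label, []).append(profile)
    return {label: [mean(col) for col in zip(*rows)]
            for label, rows in grouped.items()}


def predict_subtype(sample, centroids):
    """Assign one sample to the subtype whose centroid it correlates with best."""
    scores = {label: spearman(sample, c) for label, c in centroids.items()}
    return max(scores, key=scores.get), scores


# Hypothetical toy data: 4 genes, 2 training tumors per subtype.
train_profiles = [
    [2.0, 1.5, -1.0, -2.0],
    [1.8, 1.1, -0.6, -1.9],
    [-1.5, -2.0, 1.7, 1.0],
    [-1.1, -1.8, 2.2, 0.9],
]
train_labels = ["LumA", "LumA", "Basal-like", "Basal-like"]

centroids = build_centroids(train_profiles, train_labels)
new_sample = [1.2, 0.9, -0.3, -1.4]
subtype, scores = predict_subtype(new_sample, centroids)
print(subtype)  # LumA
```

Because the centroids are fixed once they are computed, each new sample is classified on its own, without re-clustering the cohort, which is the sense in which the abstract calls the classifier "objective and unchanging".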

Highlights

  • Validation of a novel gene expression signature in independent data sets is a critical step in the development of a clinically useful test for cancer patient risk-stratification

  • Identification of the Intrinsic/UNC gene set: Our goals were to (1) create a new breast tumor intrinsic list, (2) validate this list on an independent data set to show the clinical significance of the "intrinsic" classifications, and (3) derive an objective "intrinsic subtype" classifier that could be used clinically.

  • Combined Test Set: A test set of 311 tumors and 4 normal breast samples, represented by 315 arrays and 2800 genes in common, was created by combining the data sets of Sorlie et al. (2001; 2003), van't Veer et al. (2002), and Sotiriou et al. (2003).
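The "genes in common" step in the last highlight can be sketched as a simple set intersection across studies. This is only an illustration of restricting several data sets to their shared genes; it does not perform the Distance Weighted Discrimination adjustment the authors used to fuse the studies, and the function names, gene identifiers, and expression values below are hypothetical toy inputs.

```python
def common_genes(*datasets):
    """Return the genes present in every data set, sorted for a stable row order."""
    shared = set(datasets[0])
    for dataset in datasets[1:]:
        shared &= set(dataset)
    return sorted(shared)


def restrict(dataset, genes):
    """Keep only the selected genes, in the given order."""
    return {gene: dataset[gene] for gene in genes}


# Hypothetical toy studies: gene identifier -> expression values per array.
study_a = {"ESR1": [1.2, -0.4], "ERBB2": [0.3, 2.1], "KRT5": [-1.0, 0.8]}
study_b = {"ESR1": [0.9], "KRT5": [1.5], "MKI67": [0.2]}
study_c = {"KRT5": [-0.7, 0.1, 0.4], "ESR1": [1.1, 1.3, -0.2]}

shared = common_genes(study_a, study_b, study_c)
combined = {gene: study_a[gene] + study_b[gene] + study_c[gene] for gene in shared}
print(shared)  # ['ESR1', 'KRT5']
```

Concatenating arrays this way only aligns the gene rows; systematic differences between studies still have to be removed before clustering, which is the role Distance Weighted Discrimination plays in the paper.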


Summary

Introduction

Validation of a novel gene expression signature in independent data sets is a critical step in the development of a clinically useful test for cancer patient risk-stratification. We illustrate the successful application of these principles by (1) deriving a new breast tumor intrinsic gene list that identifies the "intrinsic" biological features of breast tumors and (2) validating this predictor using a combined test set of 311 breast tumor samples compiled from the public domain. These analyses show that the breast tumor intrinsic subtypes are significant predictors of outcome when correcting for standard clinical parameters, and that common patterns of expression and outcome predictions can be identified in data sets generated by independent labs.
