Abstract

The PAM50 classifier is widely used for breast tumor intrinsic subtyping based on gene expression. Clinical subtyping, however, is based on immunohistochemistry (IHC) assays of 3–4 biomarkers. Subtype calls by these two methods do not completely match, even for comparable subtypes. Nevertheless, the estrogen receptor (ER)-balanced subset used for gene centering in PAM50 subtyping is selected based on clinical ER status. Here we present a new method called Principal Component Analysis-based iterative PAM50 subtyping (PCA-PAM50) to perform intrinsic subtyping in ER status-unbalanced cohorts. This method leverages PCA and iterative PAM50 calls to derive the gene expression-based ER status and a subsequent ER-balanced subset for gene centering. Applying PCA-PAM50 to three different breast cancer study cohorts, we observed improved consistency (by 6–9.3%) between intrinsic and clinical subtyping for all three cohorts. In particular, a more aggressive subset of luminal A (LA) tumors, as evidenced by higher MKI67 gene expression and worse patient survival outcomes, was reclassified as luminal B (LB), increasing the LB subtype consistency with IHC by 25–49%. In conclusion, we show that PCA-PAM50 enhances the consistency of breast cancer intrinsic and clinical subtyping by reclassifying an aggressive subset of LA tumors into LB. PCA-PAM50 code is available at ftp://ftp.wriwindber.org/.

Highlights

  • Breast cancer (BC) is one of the few tumor types with an established molecular classification and targeted treatment regimens that yield improved clinical outcomes[1,2,3,4]

  • Our method resulted in improved consistency between PAM50 calls and IHC subtypes compared to the conventional method for all three cohorts, and this improved consistency is attributable to re-classification of an aggressive subset of luminal A (LA) tumors as luminal B (LB)

  • In our in-house RNA-Seq dataset, the principal component analysis (PCA) map of cases overlaid with IHC subtypes indicated that principal component 1 (PC1) separated most, but not all, of the estrogen receptor (ER)-positive (LA + LB1 + LB2) and ER-negative (TN and Her2+) cases (Fig. 2A)


Summary

Introduction

Breast cancer (BC) is one of the few tumor types with an established molecular classification and targeted treatment regimens that yield improved clinical outcomes[1,2,3,4]. The PAM50 classifier works accurately if the original cohort/dataset is ER status-balanced. This is often not the case in genome-wide studies. There have been reports that IHC-defined ER status, which is based on protein expression, is not completely consistent with ER status defined by gene expression[22,23]. This inconsistency may impact the accuracy of the subsequent gene centering procedure, which aims to minimize technology-specific bias in the dynamic range of the expression profile. As a result, such inconsistency may contribute to the discrepancy between the IHC and PAM50 subtyping results. Our method resulted in improved consistency between PAM50 calls and IHC subtypes compared to the conventional method for all three cohorts, and this improved consistency is attributable to re-classification of an aggressive subset of LA tumors as LB.
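
To make the idea of ER-balanced gene centering concrete, the following is a minimal Python sketch and not the authors' released code. It assumes a samples-by-genes matrix of log-scale expression values, and it simplifies the procedure to a single pass: PC1 scores are split at zero to give a gene expression-based ER status, an equal number of ER-positive and ER-negative samples is drawn, and the gene-wise medians of that subset are subtracted from every sample. The function names, the zero threshold, and the use of an ESR1 column (esr1_idx) to orient the sign of PC1 are all illustrative assumptions; the iterative PAM50 calls of the actual method are not reproduced here.

```python
import numpy as np


def expression_er_status(expr, esr1_idx):
    """Return a boolean ER-positive call per sample from PC1 scores.

    expr: samples x genes array of log-scale expression (assumed layout).
    esr1_idx: column of the ESR1 gene, used only to orient PC1 (assumption).
    """
    centered = expr - expr.mean(axis=0)
    # Rows of vt are the principal axes of the mean-centered data.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    scores = centered @ vt[0]
    # The sign of a principal axis is arbitrary; flip it so that higher
    # scores go with higher ESR1 expression.
    if np.corrcoef(scores, expr[:, esr1_idx])[0, 1] < 0:
        scores = -scores
    # Splitting at zero is a simplification chosen for this sketch.
    return scores > 0


def er_balanced_medians(expr, er_positive, rng):
    """Gene-wise medians over a subset with equal ER+ and ER- counts."""
    pos = np.flatnonzero(er_positive)
    neg = np.flatnonzero(~er_positive)
    n = min(len(pos), len(neg))
    if n == 0:
        raise ValueError("need at least one sample in each ER group")
    subset = np.concatenate([
        rng.choice(pos, size=n, replace=False),
        rng.choice(neg, size=n, replace=False),
    ])
    return np.median(expr[subset], axis=0)


def center_genes(expr, medians):
    """Subtract the balanced-subset medians from every sample."""
    return expr - medians
```

In an ER-positive-dominated cohort, medians taken over all samples would sit close to the ER-positive expression level for ER-associated genes, whereas medians taken over the balanced subset fall between the two groups. That difference is the bias the balanced subset is meant to remove before subtype calls are made.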
