Background and Objective: Chemotherapy benefits many patients with breast cancer, but some patients do not respond to it. Pathologic complete response (pCR) is an indicator of good response to neoadjuvant chemotherapy (NAC). In this study, we aimed to develop a method for predicting pCR before NAC. Methods: We retrospectively collected 287 stage II-III breast cancer cases and assigned each to either a training set (N = 197) or a test set (N = 90). Fourteen candidate genes were selected from four public microarray data sets. After comparing algorithms and selecting the best-performing one, we built a prediction model from the expression of these 14 candidate genes and three reference genes, measured by TaqMan probe-based quantitative polymerase chain reaction. Results: The Naive Bayes algorithm had a higher predictive value than the random forest, support vector machine (SVM), and k-nearest neighbor (kNN) algorithms (P < 0.05). The prediction of this 17-gene model was strongly associated with pCR (odds ratio, 8.914; 95% confidence interval, 4.430–17.934; P < 0.001). Using this model, the enrolled patients were classified into sensitive (SE) and insensitive (INS) groups. The pCR rates differed significantly between the SE and INS groups (42.3% vs. 7.6%, P < 0.001). The sensitivity and specificity of the prediction model were 84.5% and 62.0%, respectively. Conclusions: As an alternative to whole transcriptome-based technologies, a gene expression panel of tens of essential genes, implemented in a machine learning model, has potential for predicting chemosensitivity in breast cancer.
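The reported statistics are all functions of a single 2x2 table of predicted group (SE or INS) against observed pCR. The sketch below shows how sensitivity, specificity, group-wise pCR rates, and an odds ratio with a Wald 95% confidence interval can be derived from such a table. The cell counts are not given in the abstract: they are an assumption, back-calculated so that the outputs match the reported percentages for 287 patients, and the function name `two_by_two_metrics` is hypothetical. The Wald interval is likewise an assumed method, since the abstract does not state how its interval was computed.

```python
import math


def two_by_two_metrics(tp, fp, fn, tn, z=1.96):
    """Summarise a 2x2 table of predicted group vs. observed pCR.

    tp: predicted SE,  achieved pCR
    fp: predicted SE,  no pCR
    fn: predicted INS, achieved pCR
    tn: predicted INS, no pCR
    """
    odds_ratio = (tp * tn) / (fp * fn)
    # Standard error of log(OR) for a Wald confidence interval.
    se_log_or = math.sqrt(1 / tp + 1 / fp + 1 / fn + 1 / tn)
    log_or = math.log(odds_ratio)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "pcr_rate_se": tp / (tp + fp),
        "pcr_rate_ins": fn / (fn + tn),
        "odds_ratio": odds_ratio,
        "ci_low": math.exp(log_or - z * se_log_or),
        "ci_high": math.exp(log_or + z * se_log_or),
    }


# ASSUMED counts (not reported in the abstract), chosen to be consistent
# with the reported percentages and the total of 287 patients.
metrics = two_by_two_metrics(tp=60, fp=82, fn=11, tn=134)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Under these assumed counts, the SE group contains 142 patients and the INS group 145, and the derived values agree with the sensitivity, specificity, pCR rates, and odds ratio quoted in the Results.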