Abstract

3031 Background: Mammographic screening has enabled early detection of breast cancer, both in the general population and in women at increased risk of breast cancer. However, mammography yields many false positive results, leading to unnecessary invasive diagnostic procedures, and has limited sensitivity, particularly in women with high breast density. Blood-based markers may improve breast cancer screening, but no marker has proven sufficiently sensitive and specific for this purpose thus far. The mRNA repertoire in blood platelets (tumor-educated platelets, TEPs) differs between patients with cancer and healthy controls. In this study, we aimed to train a classification algorithm on TEP mRNA profiles to distinguish patients with breast cancer from healthy controls.

Methods: Platelet mRNA was sequenced from 266 women with stage I-IV breast cancer and 214 asymptomatic female controls from six different hospitals. First, a particle-swarm optimized support vector machine (PSO-SVM) classifier was trained (Best et al., Nature Protocols, 2019). To this end, 71% of the dataset was randomly allocated to train the algorithm, while the remaining 29% was used for internal validation. Second, an alternative classifier was trained on the same samples as the PSO-SVM using elastic net (EN) regression. Reproducibility of classifier performance was evaluated in a single-center, independent, blinded set consisting of cases (n = 37) and age-matched controls (n = 36). Post-hoc analyses were performed to assess the influence of hospital of origin and other factors on TEP gene expression and classifier performance.

Results: Performance of both classifiers in the internal validation set was adequate, with an area under the curve (AUC) of 0.86 for the PSO-SVM and 0.87 for the EN classifier. A strong association was observed between case-control status and hospital of origin (Fisher’s exact test, p < 0.001). Performance in the single-center, independent set was poor, with an AUC of 0.57 and 0.60 for the PSO-SVM and EN, respectively. Post-hoc analyses indicated that 25% of the variance in gene expression was associated with hospital of origin and 6% with case-control status, whereas 69% remained unexplained. Gene expression related to platelet activity differed significantly between the two hospitals that contributed the most samples, and between cases and controls.

Conclusions: We were unable to validate two TEP RNA-based classifiers for breast cancer detection in a single-center, independent, blinded set, regardless of the algorithm employed. Gene expression was severely influenced by hospital of origin and other factors unrelated to case-control status, suggesting that the wet lab protocol is highly sensitive to within-protocol variations in execution. Therefore, we suggest that thorough revision of the protocol is necessary before TEP RNA-based classifiers can be reconsidered for breast cancer detection in the future.
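
The gap between internal and independent validation reported above can be illustrated with a toy simulation. The sketch below is not the authors' pipeline: the two hospitals, their case fractions, the single expression feature, and the size of the hospital shift are all hypothetical values chosen for illustration. The feature carries no case signal at all, only a hospital effect, yet a direction learned on a pooled 71% training split still scores well on the pooled 29% internal split and falls to chance on a single-center set.

```python
import random


def auc(case_scores, control_scores):
    """Probability that a random case outscores a random control (ties count 0.5)."""
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


def simulate_site(n, case_fraction, site_shift, rng):
    """Draw (feature, is_case) pairs for one site.

    The feature depends only on the site (site_shift), never on case status.
    """
    return [(rng.gauss(site_shift, 1.0), rng.random() < case_fraction)
            for _ in range(n)]


def mean(values):
    return sum(values) / len(values)


rng = random.Random(0)

# Hypothetical multi-center set: hospital A supplies mostly cases,
# hospital B mostly controls, and A has a shifted expression level.
pooled = simulate_site(300, 0.9, 2.0, rng) + simulate_site(300, 0.1, 0.0, rng)
rng.shuffle(pooled)

# 71% / 29% random split, mirroring the proportions in the abstract.
split = int(0.71 * len(pooled))
train, internal = pooled[:split], pooled[split:]

# Minimal "classifier": learn on the training split whether cases have
# higher or lower feature values, then score samples along that direction.
train_cases = [x for x, is_case in train if is_case]
train_controls = [x for x, is_case in train if not is_case]
direction = 1.0 if mean(train_cases) > mean(train_controls) else -1.0


def evaluate(samples):
    cases = [direction * x for x, is_case in samples if is_case]
    controls = [direction * x for x, is_case in samples if not is_case]
    return auc(cases, controls)


# Hypothetical single-center set: one site, balanced cases and controls.
external = simulate_site(200, 0.5, 0.0, rng)

internal_auc = evaluate(internal)
external_auc = evaluate(external)
print(f"internal AUC: {internal_auc:.2f}")
print(f"single-center AUC: {external_auc:.2f}")
```

Because the internal split is drawn from the same pooled hospitals as the training data, it inherits the same association between site and case status, so it cannot reveal that the score tracks the hospital rather than the disease. Only a set in which site is held constant exposes this.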

