A support vector machine-based ensemble algorithm for breast cancer diagnosis

Haifeng Wang, Bichen Zheng, Sang Won Yoon, Hoo Sang Ko

doi:10.1016/j.ejor.2017.12.001

Abstract

This research studies a support vector machine (SVM)-based ensemble learning algorithm for breast cancer diagnosis. Illness diagnosis plays a critical role in determining treatment strategies, which are closely tied to patient safety. Numerous classification models from the data mining domain have been adapted to breast cancer diagnosis based on patients’ historical medical records. However, the performance of each algorithm depends on its model configuration, such as input feature types and model parameters. To overcome the limited performance of individual models, this research applies an SVM-based ensemble learning algorithm to breast cancer diagnosis in order to reduce diagnosis variance and increase diagnosis accuracy. Twelve different SVMs are hybridized using the proposed Weighted Area Under the Receiver Operating Characteristic Curve Ensemble (WAUCE) approach. To evaluate the proposed model, the Wisconsin Breast Cancer, Wisconsin Diagnostic Breast Cancer, and U.S. National Cancer Institute’s Surveillance, Epidemiology, and End Results (SEER) program breast cancer datasets are studied. The experimental results show that the WAUCE model achieves higher accuracy with significantly lower variance for breast cancer diagnosis than five other ensemble mechanisms and two common ensemble models, i.e., adaptive boosting and bagging classification tree. Compared to the best single SVM model on the SEER dataset, the proposed WAUCE model reduces the variance by 97.89% and increases accuracy by 33.34%. In practice, the proposed methodology can be applied to other illness diagnoses, offering an alternative path toward a safer, more reliable, and more robust illness diagnosis process.
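
The idea of weighting ensemble members by their area under the ROC curve can be illustrated with a minimal sketch. The abstract does not give the exact WAUCE weighting rule, so the code below assumes the simplest variant: each base model's weight is its validation AUC divided by the sum of all AUCs, and the ensemble thresholds the weighted average of the models' predicted probabilities. The function names `auc`, `auc_weights`, and `weighted_vote` are hypothetical, and the base models are represented only by their score lists rather than by trained SVMs.

```python
from typing import List, Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def auc_weights(val_labels: Sequence[int],
                val_scores_per_model: Sequence[Sequence[float]]) -> List[float]:
    """Assumed rule: weight of each model = its validation AUC / sum of AUCs."""
    aucs = [auc(val_labels, scores) for scores in val_scores_per_model]
    total = sum(aucs)
    if total == 0:
        # Degenerate case: fall back to uniform weights.
        return [1.0 / len(aucs)] * len(aucs)
    return [a / total for a in aucs]


def weighted_vote(weights: Sequence[float],
                  test_scores_per_model: Sequence[Sequence[float]],
                  threshold: float = 0.5) -> List[int]:
    """Combine per-model probabilities (assumed in [0, 1]) and threshold them."""
    n_samples = len(test_scores_per_model[0])
    combined = [
        sum(w * scores[i] for w, scores in zip(weights, test_scores_per_model))
        for i in range(n_samples)
    ]
    return [1 if c >= threshold else 0 for c in combined]


# Toy validation data: model A ranks both positives above both negatives,
# model B ranks only half of the positive/negative pairs correctly.
val_labels = [1, 0, 1, 0]
model_a_val = [0.9, 0.2, 0.8, 0.1]
model_b_val = [0.6, 0.7, 0.4, 0.3]
weights = auc_weights(val_labels, [model_a_val, model_b_val])

# Two unseen samples scored by both models.
predictions = weighted_vote(weights, [[0.9, 0.1], [0.3, 0.6]])
```

In this toy case the better-ranking model receives twice the weight of the weaker one, so its opinion dominates when the two disagree. Other weighting schemes (for example, nonlinear functions of AUC) would fit the same structure by changing only `auc_weights`.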

Full Text