Efficient Breast Cancer Prediction Using Ensemble Machine Learning Models

Naveen Naveen, R K Sharma, Anil Ramachandran Nair

doi:10.1109/rteict46194.2019.9016968

Abstract

Breast cancer is the second most common cancer in the world. It occurs when the growth of breast tissue becomes uncontrolled. Breast cancer prediction and prognosis are major challenges for the medical community, and breast cancer is a prominent cause of death among women. Recurrence is one of the biggest fears of cancer patients and can affect their quality of life. The aim of this research is to predict breast cancer from cancer features with high accuracy. We use the Breast Cancer Coimbra dataset from the UCI (University of California, Irvine) repository [1], [5] to build efficient ensemble machine learning models. The major steps we follow are feature scaling, cross-validation, and training various ensemble machine learning models with the bagging technique. The decision tree and KNN models give the highest accuracy, 100%. The decision tree model reaches 100% accuracy with a 90:10 train-test split and 300 bagged trees. KNN reaches a maximum accuracy of 100% for k = 1 to 7, evaluated in seven loops, with 90% of the data used for training and 10% for testing; here k is the number of nearest neighbors. We evaluate the predictions using accuracy, the confusion matrix, and the classification report. Our aim is to build a highly accurate and efficient machine learning model so that, based on the prediction result, patients can begin treatment at an early stage.
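The pipeline described in the abstract (feature scaling, a 90:10 train-test split, bagging, and evaluation by accuracy and confusion matrix) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes min-max scaling as the scaling method, uses hypothetical synthetic data from `make_toy_data` in place of the Coimbra dataset, and uses 25 bags rather than 300 to keep the run short. All function names and default parameters are illustrative.

```python
import random
from collections import Counter


def min_max_scale(train, test):
    """Scale features to [0, 1] using ranges learned from the training rows only."""
    cols = list(zip(*train))
    lows = [min(c) for c in cols]
    spans = [(max(c) - min(c)) or 1.0 for c in cols]  # avoid division by zero

    def scale(rows):
        return [[(v - lo) / s for v, lo, s in zip(r, lows, spans)] for r in rows]

    return scale(train), scale(test)


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn_predict(train_X, train_y, x, k):
    """Majority vote among the k training rows closest to x."""
    pairs = sorted(zip(train_X, train_y), key=lambda p: sq_dist(p[0], x))
    return Counter(label for _, label in pairs[:k]).most_common(1)[0][0]


def bagged_knn_predict(train_X, train_y, test_X, k, n_bags, rng):
    """Bagging: one KNN per bootstrap resample, combined by majority vote."""
    n = len(train_X)
    votes = [Counter() for _ in test_X]
    for _ in range(n_bags):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        bag_X = [train_X[i] for i in idx]
        bag_y = [train_y[i] for i in idx]
        for vote, x in zip(votes, test_X):
            vote[knn_predict(bag_X, bag_y, x, k)] += 1
    return [vote.most_common(1)[0][0] for vote in votes]


def confusion_matrix(y_true, y_pred):
    """2x2 matrix: rows are true labels (0, 1), columns are predicted labels."""
    matrix = [[0, 0], [0, 0]]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def make_toy_data(n, rng):
    """Hypothetical stand-in data: two noisy clusters, 3 features, labels 0/1."""
    X = [[rng.gauss(2.0 * (i % 2), 1.0) for _ in range(3)] for i in range(n)]
    y = [i % 2 for i in range(n)]
    return X, y


def run_experiment(seed=0, n=120, k=3, n_bags=25):
    """Run the scale / split / bag / evaluate pipeline once; return (accuracy, matrix)."""
    rng = random.Random(seed)
    X, y = make_toy_data(n, rng)
    order = list(range(n))
    rng.shuffle(order)
    cut = n - n // 10  # 90% of rows for training, 10% for testing
    train_idx, test_idx = order[:cut], order[cut:]
    train_X, test_X = min_max_scale([X[i] for i in train_idx],
                                    [X[i] for i in test_idx])
    train_y = [y[i] for i in train_idx]
    test_y = [y[i] for i in test_idx]
    pred = bagged_knn_predict(train_X, train_y, test_X, k, n_bags, rng)
    accuracy = sum(t == p for t, p in zip(test_y, pred)) / len(test_y)
    return accuracy, confusion_matrix(test_y, pred)


if __name__ == "__main__":
    acc, cm = run_experiment()
    print("accuracy:", acc)
    print("confusion matrix:", cm)
```

With a 90:10 split of a small dataset, the test set holds only about a dozen rows, so the accuracy from a single split can change noticeably from one random split to another. Repeating `run_experiment` with different seeds gives a rough sense of that spread.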

Full Text