Abstract
Breast cancer is a major public health burden. It claims a large number of lives worldwide, and its incidence is rising at an alarming rate. Detecting the disease at an early stage is essential, yet traditional diagnostic methods are often error-prone and time-consuming. Machine learning models have proven effective in this application area, and many such approaches report strong results. This research aims to address the shortcomings of existing models and, by resolving the underlying technical issues, to deliver higher accuracy. It also aims to make patients' treatment more appropriate and cost-effective. The study uses the Wisconsin Diagnostic Breast Cancer (WDBC) dataset, which is publicly available from the UCI Machine Learning Repository. It evaluates several individual learners, namely Support Vector Machines (SVM), Logistic Regression (LR), Random Forest (RF), Naive Bayes (NB), K-Nearest Neighbours (K-NN), and Decision Tree (DT), together with an ensemble learner, Gradient Boosting (GB). These are combined with two feature selection techniques: Principal Component Analysis (PCA) and Recursive Feature Elimination (RFE). The experimental techniques discern subtle patterns within the dataset. The proposed model compares the performance of these configurations using specificity, sensitivity, and accuracy, and achieves an accuracy of 98%. The study highlights the model's potential as a significant tool in medical diagnostics.
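The three evaluation metrics named above can be illustrated with a minimal sketch. The function name `diagnostic_metrics`, the choice of label 1 as the positive (malignant) class, and the toy labels are assumptions made for illustration only; they are not the study's code or data.

```python
def diagnostic_metrics(y_true, y_pred, positive=1):
    """Compute sensitivity, specificity and accuracy for binary labels.

    `positive` marks the class treated as the disease class
    (assumed here to be 1 = malignant).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1  # malignant case correctly flagged
            else:
                fn += 1  # malignant case missed
        else:
            if pred == positive:
                fp += 1  # benign case wrongly flagged
            else:
                tn += 1  # benign case correctly cleared

    nan = float("nan")
    return {
        # Share of actual positives that were detected (recall).
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        # Share of actual negatives that were correctly cleared.
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
        # Share of all cases classified correctly.
        "accuracy": (tp + tn) / len(y_true) if len(y_true) else nan,
    }


# Toy labels, invented for illustration (not drawn from WDBC).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = diagnostic_metrics(y_true, y_pred)
print(metrics)
```

Reporting sensitivity and specificity alongside accuracy matters in a diagnostic setting because accuracy alone does not show whether the errors are missed malignant cases or false alarms on benign ones.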