Abstract

This study concerns predicting breast cancer survivability using data mining and compares three predictive modeling tools. Specifically, we used three popular data mining methods: two from machine learning (artificial neural networks and decision trees) and one from statistics (logistic regression). We aimed to choose the best model based on the performance of each model, the variables most influential in each model, and the important predictors the models have in common. We defined the three main modeling aims and uses by demonstrating the purpose of the modeling. Data mining makes it possible to characterize and describe the trends and patterns that reside in data. The data, taken from the SEER database, comprised 87 variables and 457,389 records before preparation, and 93 variables and 90,308 records in the final data set. We investigated more than three data mining techniques and ultimately focused on artificial neural networks, decision trees, and logistic regression, implemented in SAS Enterprise Miner 5.2, which in our view is a suitable system given the facilities it offers and the results it produced. We conducted several experiments with these algorithms and evaluated the resulting prediction models by comparing them with one another. We found that the neural network performed much better than the other two techniques. The model we chose therefore has the highest accuracy of those compared, and specialists in the breast cancer field can use it and rely on it.
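
As a minimal sketch of the comparison-based evaluation described above, the following Python snippet scores several classifiers against the same held-out labels and selects the most accurate one. It is not the SAS Enterprise Miner workflow used in the study: the function names (`confusion_counts`, `evaluate`, `best_model`), the model names, and the toy labels are all illustrative assumptions, and the numbers have no relation to the SEER results.

```python
from typing import Dict, List, Tuple


def confusion_counts(y_true: List[int], y_pred: List[int]) -> Tuple[int, int, int, int]:
    """Return (tp, tn, fp, fn) for binary labels, where 1 = survived (assumed coding)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def evaluate(y_true: List[int], y_pred: List[int]) -> Dict[str, float]:
    """Compute accuracy, sensitivity, and specificity for one model."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }


def best_model(y_true: List[int], predictions: Dict[str, List[int]]) -> str:
    """Return the name of the model with the highest accuracy."""
    scores = {name: evaluate(y_true, pred)["accuracy"] for name, pred in predictions.items()}
    return max(scores, key=scores.get)


# Toy held-out labels and predictions (hypothetical, not taken from the study).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
predictions = {
    "model_a": [1, 1, 1, 0, 0, 0, 1, 1],
    "model_b": [1, 1, 0, 0, 0, 1, 1, 0],
    "model_c": [1, 0, 0, 0, 1, 0, 1, 0],
}

for name, pred in predictions.items():
    print(name, evaluate(y_true, pred))
print("best by accuracy:", best_model(y_true, predictions))
```

Accuracy alone can mislead when the classes are imbalanced, which is why the sketch also reports sensitivity and specificity for each model.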
