Abstract
Introduction: Logistic regression is the classical statistical model used to predict a binary outcome variable. The model carries theoretical assumptions of independence of the predictor variables and linearity of their association with the outcome on the log-odds (logit) scale. Alternative models developed in the machine learning context, such as the Naïve Bayes model, which has similar assumptions, and the Bayesian Network (BN) model, can also be used for binary prediction. Aim: To compare the predictive performance of logistic regression, Naïve Bayes and BN models in predicting the recurrence of breast cancer. Materials and Methods: A dataset on the recurrence of breast cancer was obtained from the UCI Machine Learning Repository. The study was done on retrospective data from December 2021 to July 2022. The sample size was increased by bootstrapping with the logistic regression model. The dataset was split into training (70%) and testing (30%) datasets for internal validation. The effect estimates of the potential prognostic variables were obtained using a multiple logistic regression model. Naïve Bayes and BN models were also learnt from the training dataset. The indices of predictive accuracy were estimated for the models in both the training and testing datasets. Results: Degree of malignancy and side of the affected breast were found to be significant predictors of recurrence of breast cancer. The BN model had the lowest misclassification rate and the best sensitivity in comparison to the other models in spite of the imbalance in the outcome variable. Conclusion: The BN model performed best in comparison to the logistic regression model when the assumptions of the logistic regression model were violated and the proportions of the outcome were imbalanced.
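The two indices named in the Results, misclassification rate and sensitivity, can be illustrated with a minimal sketch. This is not the study's code: the function names and the toy labels below are hypothetical and chosen only to show why sensitivity is reported alongside the misclassification rate when the outcome is imbalanced.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary outcome."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def misclassification_rate(y_true, y_pred):
    """Proportion of all cases that were predicted incorrectly."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return (fp + fn) / (tp + fp + tn + fn)


def sensitivity(y_true, y_pred):
    """Proportion of true positives (recurrences) that were detected."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn) if (tp + fn) else float("nan")


# Hypothetical toy labels (NOT study data): 1 = recurrence, 0 = no recurrence.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]

# A trivial classifier that never predicts recurrence.
always_negative = [0] * len(y_true)

# A hypothetical model that detects two of the three recurrences.
model_pred = [1, 1, 0, 0, 0, 0, 0, 0, 1, 0]

for name, pred in [("always negative", always_negative), ("model", model_pred)]:
    print(name, misclassification_rate(y_true, pred), sensitivity(y_true, pred))
```

On these toy labels the trivial classifier misclassifies only 30% of cases yet has zero sensitivity, which is why a low misclassification rate on its own says little when recurrences are the minority class.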