Abstract

Breast cancer is a common and potentially life-threatening disease, and early, accurate diagnosis is crucial for effective treatment and improved patient outcomes. This study proposes combining the Light Gradient-Boosting Machine (LightGBM) algorithm, the Borderline Synthetic Minority Oversampling Technique (Borderline-SMOTE), and the Tree-Structured Parzen Estimator (TPE) for hyperparameter tuning to improve the effectiveness of a Machine Learning (ML) model for diagnosing breast cancer. A TPE-optimized Borderline-SMOTE LightGBM classifier was trained on the Wisconsin Diagnostic Breast Cancer (WDBC) dataset using 10-fold cross-validation, and its performance was compared with that of a baseline LightGBM model. The TPE-optimized Borderline-SMOTE LightGBM model showed a significant improvement over the baseline, achieving an average accuracy of 99.12%, specificity of 100%, precision of 100%, recall of 97.62%, F1-score of 98.80%, and a Matthews Correlation Coefficient of 98.12%. These results also compare favorably with those reported in previous studies. The study demonstrates that data augmentation and hyperparameter optimization can improve the performance of ML models for breast cancer diagnosis, which has significant implications for the medical field, where accurate and efficient diagnosis is critical.
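
As a reading aid, the six reported metrics can all be derived from the four cells of a binary confusion matrix. The sketch below is not taken from the study: the function name `binary_metrics` and the counts passed to it are hypothetical, chosen only because they happen to reproduce the reported percentages when malignant is treated as the positive class. The study's actual per-fold confusion matrices are not given in the abstract.

```python
from math import sqrt


def binary_metrics(tp, fn, fp, tn):
    """Return common binary classification metrics from confusion-matrix counts."""
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total
    specificity = tn / (tn + fp)   # true-negative rate
    precision = tp / (tp + fp)     # positive predictive value
    recall = tp / (tp + fn)        # sensitivity, true-positive rate
    f1 = 2 * precision * recall / (precision + recall)
    # Matthews Correlation Coefficient: ranges from -1 to 1,
    # conventionally set to 0 when any marginal total is zero.
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "specificity": specificity,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "mcc": mcc,
    }


# Hypothetical counts (an assumption for illustration, not data from the study):
# 42 malignant cases with 1 missed, 72 benign cases with no false alarms.
metrics = binary_metrics(tp=41, fn=1, fp=0, tn=72)
for name, value in metrics.items():
    print(f"{name}: {value * 100:.2f}%")
```

With these illustrative counts, a single false negative and no false positives explain why specificity and precision sit at 100% while recall, F1-score, and the Matthews Correlation Coefficient fall slightly below it.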

