Abstract

Countless women and men worldwide have lost their lives to breast cancer (BC). Although researchers from around the world have proposed various diagnostic methods for detecting this disease, there is still room for improvement in the accuracy and efficiency with which they can be used. A novel approach has been proposed for the early detection of BC by applying data mining techniques to the levels of prolactin (P), testosterone (T), cortisol (C), and human chorionic gonadotropin (HCG) in the blood and saliva of 20 women with histologically confirmed BC, 20 benign subjects, and 20 age-matched control women. In the proposed method, blood and saliva were used to categorize the severity of the BC into normal, benign, and malignant cases. Ten statistical features were collected to identify the severity of the BC using three different classification schemes—a decision tree (DT), a support vector machine (SVM), and k-nearest neighbors (KNN) were evaluated. Moreover, dimensionality reduction techniques using factor analysis (FA) and t-stochastic neighbor embedding (t-SNE) have been computed to obtain the best hyperparameters. The model has been validated using the k-fold cross-validation method in the proposed approach. Metrics for gauging a model’s effectiveness were applied. Dimensionality reduction approaches for salivary biomarkers enhanced the results, particularly with the DT, thereby increasing the classification accuracy from 66.67% to 93.3% and 90%, respectively, by utilizing t-SNE and FA. Furthermore, dimensionality reduction strategies for blood biomarkers enhanced the results, particularly with the DT, thereby increasing the classification accuracy from 60% to 80% and 93.3%, respectively, by utilizing FA and t-SNE. These findings point to t-SNE as a potentially useful feature selection for aiding in the identification of patients with BC, as it consistently improves the discrimination of benign, malignant, and control healthy subjects, thereby promising to aid in the improvement of breast tumour early detection.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call