An Interpretable Explainable Artificial Intelligence Framework for Urinary Cancer Survival Classification using Hybrid Feature Selection and Stacked Ensembles
Introduction: Urinary cancer continues to be a public health hazard throughout the world, indicating that new classification methods for the detection in its early phase are necessary. The Surveillance, Epidemiology, and End Results (SEER) urinary cancer dataset presents a unique challenge in part due to heterogeneous data, variation in clinical features, and missing data, which affects patient outcomes. Methods: This study presents a solution to find missing data using a Random Forest Regressor for numerical data and a Random Forest Classifier for categorical data. It then describes a hybrid analysis combining Analysis of Variance (ANOVA) and Recursive Feature Elimination (RFE) for feature selection in large datasets. Urinary cancer survival classification was performed using seven traditional machine learning models, including Logistic Regression, Naive Bayes, LightGBM, XGBoost, CatBoost, Random Forest, and Decision Tree. Results: The experimental results show that the proposed method demonstrates the highest performance when compared to traditional classifiers. The proposed stacked ensemble model of base layers, which include LightGBM, CatBoost, and TabNet with a meta-layer of Logistic Regression, achieved the highest accuracy of 0.9886 and a ROC score of 0.9889. Discussion: The existing techniques have limited predictive accuracy, poor handling of complex data, and a lack of interpretability needed for clinical decisions, and the proposed stacked ensemble model successfully overcomes these limitations by utilizing a hybrid method of feature selection and ensemble learning for greater robustness. Conclusion: To promote model transparency and explainability, we implement a hybrid Shapley Additive explanations (SHAP) - Local Interpretable Model-agnostic Explanations (LIME) explainer. The results demonstrated improvement in classification accuracy and contributed to understanding model decisions. 
Overall, the framework was effective in predicting and analyzing urinary cancer survival.
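The filter half of the hybrid feature selection described above can be sketched in a few lines. This is a minimal illustration of a one-way ANOVA F-score ranking only, not the authors' implementation; the feature names `tumor_size` and `noise`, the toy values, and the helper names are assumptions made for the example, and the RFE stage that the abstract pairs with ANOVA is not shown.

```python
from statistics import mean


def anova_f_score(values, labels):
    """One-way ANOVA F statistic for one numeric feature grouped by class label."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    grand_mean = mean(values)
    n, k = len(values), len(groups)
    # Between-group sum of squares: distance of each class mean from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())
    # Within-group sum of squares: spread of samples around their own class mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    if ss_within == 0:
        return float("inf")  # degenerate case: no spread inside any class
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def top_k_features(columns, labels, k):
    """Rank features by F score and keep the k highest (the filter step)."""
    scores = {name: anova_f_score(vals, labels) for name, vals in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k], scores


# Hypothetical toy data: one feature that separates the classes, one that does not.
labels = [0, 0, 0, 1, 1, 1]
columns = {
    "tumor_size": [1.0, 1.2, 0.9, 3.1, 2.9, 3.3],
    "noise": [0.5, 1.5, 1.0, 1.0, 0.5, 1.5],
}
selected, scores = top_k_features(columns, labels, k=1)
print(selected, scores)
```

In a hybrid scheme, the features that survive this filter would then be passed to a wrapper method such as RFE, which repeatedly refits a model and drops the weakest feature.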
- # Local Interpretable Model-agnostic Explanations
- # Traditional Machine Learning Models
- # Shapley Additive Explanations
- # Surveillance, Epidemiology, And End Results
- # Urinary Cancer
- # Hybrid Feature Selection
- # Recursive Feature Elimination
- # Improvement In Classification Accuracy
- # Stacked Ensemble
- # Random Forest
- Research Article
143
- 10.1038/s41598-023-35795-0
- Jun 2, 2023
- Scientific Reports
Nasopharyngeal cancer (NPC) has a unique histopathology compared with other head and neck cancers. Individual NPC patients may attain different outcomes. This study aims to build a prognostic system by combining a highly accurate machine learning (ML) model with explainable artificial intelligence to stratify NPC patients into low and high chance of survival groups. Explainability is provided using Local Interpretable Model Agnostic Explanations (LIME) and SHapley Additive exPlanations (SHAP) techniques. A total of 1094 NPC patients were retrieved from the Surveillance, Epidemiology, and End Results (SEER) database for model training and internal validation. We combined five different ML algorithms to form a uniquely stacked algorithm. The predictive performance of the stacked algorithm was compared with that of a state-of-the-art algorithm, extreme gradient boosting (XGBoost), in stratifying the NPC patients into chance of survival groups. We validated our model with temporal validation (n = 547) and geographic external validation (Helsinki University Hospital NPC cohort, n = 60). The developed stacked predictive ML model showed an accuracy of 85.9% while the XGBoost had 84.5% after the training and testing phases. This demonstrated that both XGBoost and the stacked model showed comparable performance. External geographic validation of the XGBoost model showed a c-index of 0.74, accuracy of 76.7%, and area under curve of 0.76. The SHAP technique revealed that age of the patient at diagnosis, T-stage, ethnicity, M-stage, marital status, and grade were among the prominent input variables in decreasing order of significance for the overall survival of NPC patients. LIME showed the degree of reliability of the prediction made by the model. In addition, both techniques showed how each feature contributed to the prediction made by the model. 
LIME and SHAP techniques provided personalized protective and risk factors for each NPC patient and unraveled some novel non-linear relationships between input features and survival chance. The examined ML approach showed the ability to predict the chance of overall survival of NPC patients. This is important for effective treatment planning care and informed clinical decisions. To enhance outcome results, including survival in NPC, ML may aid in planning individualized therapy for this patient population.
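To make the idea of per-feature contributions concrete, the following sketch computes exact Shapley values for a two-feature toy model by enumerating every coalition. It is an illustration under stated assumptions rather than the SHAP library's algorithm: absent features are replaced by a single baseline value (SHAP implementations typically average over a background dataset), and `toy_risk`, the coefficients, and the instance and baseline values are all hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley values; features outside a coalition take their baseline value."""
    n = len(instance)

    def value(coalition):
        x = [instance[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(x)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                # Marginal contribution of feature i when it joins coalition s.
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_risk(x):
    """Hypothetical risk score with an interaction term (not from any study)."""
    age, stage = x
    return 0.02 * age + 0.3 * stage + 0.01 * age * stage


instance = [60, 2]  # hypothetical patient: age 60, stage 2
baseline = [50, 1]  # hypothetical reference patient
phi = shapley_values(toy_risk, instance, baseline)
print(phi)
```

The contributions add up to the difference between the prediction for the instance and the prediction for the baseline, which is the additivity property that gives SHAP its name. Exact enumeration grows exponentially with the number of features, which is why practical tools rely on approximations or model-specific shortcuts.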
- Research Article
- 10.55041/ijsrem54806
- Dec 2, 2025
- International Journal of Scientific Research in Engineering and Management
This experimental research presents a comprehensive evaluation of the Pima Indians Diabetes Dataset using a Support Vector Machine (SVM) classifier for predictive analysis [1], combined with explainable artificial intelligence (XAI) techniques to interpret model decisions. The dataset, collected from actual field conditions, was preprocessed and analyzed to identify significant features influencing the prediction outcomes, following standard practices in real-world machine learning pipelines [2]. The SVM model was trained and optimized to achieve high classification performance, demonstrating its robustness in handling nonlinear patterns and complex data distributions [3]. To enhance transparency and interpretability, which are critical aspects in modern machine learning applications [4], two XAI frameworks, SHAP (SHapley Additive exPlanations) and LIME (Local Interpretable Model-Agnostic Explanations), were applied. SHAP was used to quantify global and local feature contributions, enabling a deeper understanding of how input variables impact the model's decision boundaries [5]. LIME provided localized, instance-level explanations that highlighted the key attributes driving individual predictions [6]. The combined use of SVM with SHAP and LIME not only improved model interpretability but also strengthened trust in the predictive logic, making the approach suitable for deployment in sensitive and decision-critical environments [7]. The results demonstrate that integrating XAI methods with traditional machine learning models can significantly enhance model transparency without compromising predictive performance, aligning with recent findings in the field [8]. This research could help those who want to understand and implement XAI in various domains. Key Words: Diabetes, Explainable AI (XAI), LIME, SHAPLEY
- Research Article
- 10.1016/j.ipha.2025.12.003
- Feb 1, 2026
- Intelligent Pharmacy
This study introduces CARE-Cirrhosis (Cirrhosis Ascites Risk Prediction and Explainability with Recommendation Engine), a unified methodological framework that systematically integrates predictive modeling, multi-level explainability, and a personalized recommendation engine into a single, deployable clinical decision-support architecture. Rather than applying interpretability tools in isolation, the framework embeds Explainable AI (XAI) methods like SHapley Additive exPlanations (SHAP), Local Interpretable Model agnostic Explanations (LIME), and counterfactual reasoning within an operational pipeline that transforms predictive outputs into transparent, actionable, patient-specific clinical guidance. Thus, it is advancing the methodological foundations of interpretable machine learning(ML) for biomedical applications. Data from the Mayo Clinic Primary Biliary Cirrhosis (PBC) cohort ( n = 418) were analyzed using Logistic Regression (LR), Random Forest (RF), and Extreme Gradient Boosting (XGB), evaluated under stratified 5-fold cross-validation(CV). Multi-level interpretability was achieved through XAI methods like global attribution (SHAP), local surrogate reasoning (LIME), and counterfactual analysis (DiCE). These layers were synthesized within a unified interpretability framework, linked to a rule-based recommendation engine for generating patient-specific, physiologically plausible “what-if” scenarios. The pipeline was implemented as a mobile application to demonstrate translational applicability and real-time deployment feasibility. All models demonstrated strong discriminative performance (AUROC 0.90–0.92). SHAP identified albumin, platelets, prothrombin, and edema as consistent global predictors, while counterfactual reasoning delineated clinically meaningful feature thresholds (probability 0.3–0.4). The interpretability synthesis enabled cross-validation of feature attributions across explanation paradigms, improving transparency and robustness. 
The integrated recommendation module generated individualized monitoring strategies and actionable insights. CARE-Cirrhosis establishes a generalizable methodological approach for unifying predictive modeling, explainability, and clinical recommendation within a single, deployable framework. By demonstrating a reproducible process for multi-level interpretability integration, it advances the methodological scope of biomedical informatics beyond model development toward transparent, interpretable, and actionable decision-support systems applicable across clinical domains. • Introduced CARE-Cirrhosis, a unified explainable AI (XAI) framework that integrates predictive modeling, multi-level interpretability XAI methods (SHAP, LIME, DiCE), and personalized recommendation into a single operational architecture. • Embedded interpretability within the model development workflow rather than as a post hoc step, achieving consistent global, local, and counterfactual transparency to enhance clinical trust. • Developed a hybrid recommendation engine that combines rule-based hepatology knowledge with model-driven feature gradients to generate physiologically plausible, patient-specific “what-if” guidance. • Achieved strong predictive performance (AUROC ≈ 0.90–0.92) across LR, RF, and XGB, with robustness confirmed through stratified cross-validation and synthetic external validation on perturbed datasets. • Deployed CARE-Cirrhosis as a mobile application integrating real-time risk scoring, interpretability visualization, and actionable recommendations into a clinician- and patient-friendly interface. • Designed the framework to be disease-agnostic and data schema–independent, enabling adaptation to other biomedical domains such as cardiology, oncology, and chronic disease management.
- Research Article
- 10.5815/ijieeb.2025.06.05
- Dec 8, 2025
- International Journal of Information Engineering and Electronic Business
Machine learning models that lack transparency can lead to biased conclusions and decisions in automated systems in various domains. To address this issue, explainable AI (XAI) frameworks such as Local Interpretable Model-Agnostic Explanations (LIME) and Shapley Additive Explanations (SHAP) have evolved by offering interpretable insights into machine learning model decisions. This study presents a thorough comparison of LIME and SHAP applied to a Random Forest model trained on a loan dataset, which achieved an accuracy of 85%, precision of 84%, recall of 97%, and an F1 score of 90%. This study's primary contributions are as follows: (1) using Shapley values, which represent the contribution of each feature, to show that SHAP provides deeper and more reliable feature attributions than LIME; (2) demonstrating that LIME lacks the sophisticated interpretability of SHAP, despite offering faster and more generalizable explanations across various model types; (3) quantitatively comparing computational efficiency, where LIME displays a faster runtime of 0.1486 seconds using 9.14 MB of memory compared to SHAP with a computational time of 0.3784 seconds using 1.2 MB of memory. By highlighting the trade-offs between LIME and SHAP in terms of interpretability, computational complexity, and application to various computer systems, this study contributes to the field of XAI. The outcome helps stakeholders better understand and trust AI-driven loan choices, which advances the development of transparent and responsible AI systems in finance.
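The reported metrics can be cross-checked with a short calculation. The sketch below recomputes the F1 score from the stated precision and recall and derives the runtime ratio from the two reported timings; only the figures quoted in the abstract are used, and the function name is an assumption for illustration.

```python
def f1_score(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Figures as quoted in the abstract above.
f1 = f1_score(0.84, 0.97)
runtime_ratio = 0.3784 / 0.1486  # SHAP time divided by LIME time

print(f"F1 = {f1:.4f}")
print(f"SHAP/LIME runtime ratio = {runtime_ratio:.2f}")
```

The recomputed F1 rounds to the 90% stated in the abstract, and the timings imply that SHAP took roughly two and a half times as long as LIME in that experiment.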
- Research Article
3
- 10.1080/00016489.2023.2301648
- Jan 26, 2024
- Acta Oto-Laryngologica
Background: The mortality rates of laryngeal squamous cell carcinoma (LSCC) have not significantly decreased in the last decades. Objectives: We primarily aimed to compare the predictive performance of DeepTables with the state-of-the-art machine learning (ML) algorithms (Voting ensemble, Stack ensemble, and XGBoost) to stratify patients with LSCC into chance of overall survival (OS). In addition, we complemented the developed model by providing interpretability using both global and local model-agnostic techniques. Methods: A total of 2792 patients in the Surveillance, Epidemiology, and End Results (SEER) database diagnosed with LSCC were reviewed. The global model-agnostic interpretability was examined using the SHapley Additive exPlanations (SHAP) technique. Likewise, individual interpretation of the prediction was made using Local Interpretable Model Agnostic Explanations (LIME). Results: The state-of-the-art ML ensemble algorithms outperformed DeepTables. Specifically, the examined ensemble algorithms showed comparable weighted area under the receiver operating characteristic curve values of 76.9, 76.8, and 76.1 with an accuracy of 71.2%, 70.2%, and 71.8%, respectively. The global method of interpretability (SHAP) demonstrated that the age of the patient at diagnosis, N-stage, T-stage, tumor grade, and marital status are among the prominent parameters. Conclusions: An ML model for OS prediction may serve as an ancillary tool for treatment planning of LSCC patients.
- Conference Article
- 10.1109/icacrs67045.2025.11324170
- Dec 10, 2025
Explainable Artificial Intelligence (XAI) based tools such as SHAP (SHapley Additive exPlanations) and LIME (Local Interpretable Model-agnostic Explanations) are extensively used in various detection and prediction approaches. These tools extract feature importance from the datasets and explain the contribution of the features towards the detection/prediction output both locally and globally. In the current study, a performance analysis is presented on the behaviour of LIME and SHAP explainability in Denial-of-Service (DoS) attack detection in the Internet of Things. There are numerous black-box models, including machine learning models, that show high detection accuracy in such cases, but most of the time the output is not interpretable by the security analyst. This drawback is overcome by applying LIME and SHAP interpretability to the output of the black-box model and analysing the importance of the attack dataset's features to detection accuracy. However, LIME and SHAP behave differently with respect to model interpretability. SHAP is powerful in global explanation, whereas LIME works efficiently on local interpretation. We show how these two tools perform at the same detection accuracy for DoS attacks using a machine learning model. A random forest classifier with high detection accuracy on a simulated DoS attack dataset is first selected, and SHAP and LIME are then executed on its output to achieve both local and global explainability. The comparison shows the strengths and weaknesses of SHAP and LIME in explaining the model's behaviour both locally and globally.
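The local side of this comparison can be illustrated with a stripped-down surrogate in the spirit of LIME: perturb the input around one instance, weight the perturbed samples by proximity, and fit a weighted linear model whose slope serves as the local explanation. The sketch below handles a single numeric feature with a closed-form weighted least-squares fit; it is a simplification under stated assumptions, not the LIME library, and `black_box`, the Gaussian perturbation, and the kernel width are hypothetical choices.

```python
import math
import random


def local_linear_surrogate(predict, x0, n_samples=500, scale=0.5,
                           kernel_width=0.5, seed=0):
    """Fit a proximity-weighted line to a black-box model around the point x0."""
    rng = random.Random(seed)
    # 1. Perturb the instance.
    xs = [x0 + rng.gauss(0.0, scale) for _ in range(n_samples)]
    # 2. Query the black box on the perturbed samples.
    ys = [predict(x) for x in xs]
    # 3. Weight each sample by its closeness to x0 (exponential kernel).
    ws = [math.exp(-((x - x0) ** 2) / kernel_width ** 2) for x in xs]
    # 4. Weighted least squares for one feature, in closed form.
    w_sum = sum(ws)
    x_bar = sum(w * x for w, x in zip(ws, xs)) / w_sum
    y_bar = sum(w * y for w, y in zip(ws, ys)) / w_sum
    cov = sum(w * (x - x_bar) * (y - y_bar) for w, x, y in zip(ws, xs, ys))
    var = sum(w * (x - x_bar) ** 2 for w, x in zip(ws, xs))
    slope = cov / var
    return slope, y_bar - slope * x_bar


def black_box(x):
    """Hypothetical nonlinear model standing in for a trained classifier score."""
    return x ** 2


slope_at_1, _ = local_linear_surrogate(black_box, 1.0)
slope_at_minus_2, _ = local_linear_surrogate(black_box, -2.0)
print(slope_at_1, slope_at_minus_2)
```

The same feature receives a positive local weight near one instance and a negative one near another, which is why a local explanation cannot simply be read as a global feature importance.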
- Research Article
34
- 10.1186/s12911-022-01817-6
- Mar 25, 2022
- BMC medical informatics and decision making
Background: Machine learning (ML) models are increasingly used to predict short-term outcomes in critically ill patients, but studies of long-term outcomes are sparse. We used an explainable ML approach to establish 30-day, 90-day and 1-year mortality prediction models in critically ill ventilated patients. Methods: We retrospectively included patients who were admitted to intensive care units during 2015–2018 at a tertiary hospital in central Taiwan and linked their records with the Taiwanese nationwide death registration data. Three ML models, including extreme gradient boosting (XGBoost), random forest (RF) and logistic regression (LR), were used to establish the mortality prediction models. Furthermore, we used feature importance, the Shapley Additive exPlanations (SHAP) plot, the partial dependence plot (PDP), and local interpretable model-agnostic explanations (LIME) to explain the established models. Results: We enrolled 6994 patients and found that the accuracy was similar among the three ML models, and the area under the curve values of XGBoost for predicting 30-day, 90-day and 1-year mortality were 0.858, 0.839 and 0.816, respectively. The calibration curve and decision curve analysis further demonstrated the accuracy and applicability of the models. The SHAP summary plot and PDP illustrated the discriminative points of the APACHE (Acute Physiology and Chronic Health Evaluation) II score, haemoglobin and albumin for predicting 1-year mortality. The application of LIME and SHAP force plots quantified the probability of 1-year mortality and the contribution of key features at the individual patient level. Conclusions: We used an explainable ML approach, mainly XGBoost, SHAP and LIME plots, to establish an explainable 1-year mortality prediction ML model in critically ill ventilated patients.
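The area under the curve values quoted above have a direct probabilistic reading: the AUC is the chance that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as half. The sketch below computes it by comparing every positive-negative pair; the labels and scores are hypothetical toy values, not data from the study.

```python
def auc(y_true, scores):
    """AUC as the fraction of positive-negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = died within the horizon) and predicted risks.
y_true = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.7, 0.6, 0.2, 0.4, 0.4]
result = auc(y_true, scores)
print(round(result, 4))
```

The pairwise form is quadratic in the number of samples, so it is meant for understanding the metric; rank-based formulas give the same value more efficiently on large cohorts.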
- Research Article
- 10.1186/s12893-026-03746-x
- May 13, 2026
- BMC surgery
This study aimed to identify risk factors for postoperative major complications after resection of primary liver cancer and to develop machine learning-based risk prediction models. We compared the predictive performance of multiple machine learning algorithms and evaluated the optimal model and its potential clinical utility. We retrospectively enrolled 2,389 patients who underwent resection of primary liver cancer at the First Affiliated Hospital of Xinjiang Medical University between January 2013 and December 2024. According to the Clavien-Dindo (CD) classification, patients with CD grade ≥ III were defined as having major postoperative complications (n = 447), while those with CD grade < III were classified as the non-complication group (n = 1,942). The dataset was divided into a training set (70%, n = 1,672) and a test set (30%, n = 717) using stratified sampling. Robust predictors were identified by taking the strict intersection of features selected by three methods: least absolute shrinkage and selection operator (LASSO) regression, XGBoost-based recursive feature elimination (RFE), and the random forest-based Boruta algorithm. Based on the selected features, seven machine learning models were developed: logistic regression (LR), support vector machine (SVM), decision tree (DT), random forest (RF), extremely randomized trees (ET), extreme gradient boosting (XGBoost), and light gradient boosting machine (LightGBM), with Bayesian optimization used for hyperparameter tuning. Model performance was comprehensively evaluated using the area under the receiver operating characteristic curve (AUC), sensitivity, specificity, calibration curves, Brier score, and decision curve analysis (DCA). The optimal model was further interpreted using SHapley Additive exPlanations (SHAP), local interpretable model-agnostic explanations (LIME), and partial dependence plots/individual conditional expectation (PDP/ICE). 
Eight key predictors were identified from the intersection of the three feature selection methods: surgical approach, alanine aminotransferase, intraoperative blood loss (IBL), liver stiffness measurement (LSM), prothrombin time (PT), total bilirubin, albumin (ALB), and intraoperative blood transfusion. Among the seven models, the RF model demonstrated the best overall performance in the test set, with an AUC of 0.843, accuracy of 0.851, specificity of 0.907, negative predictive value of 0.909, Brier score of 0.128, and F1 score of 0.602. SHAP analysis indicated that LSM, surgical approach, ALB, and IBL were the most influential predictors of major postoperative complications. DCA further showed that, across a wide range of threshold probabilities, RF-based risk stratification consistently provided greater net clinical benefit than either the treat-all or treat-none strategies. The RF model achieved the best predictive performance and can accurately estimate the risk of major postoperative complications after resection of primary liver cancer. This model may serve as a useful clinical decision-support tool for perioperative risk stratification and individualized patient management.
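Two steps from this abstract are simple enough to sketch directly: taking the strict intersection of the feature sets chosen by several selectors, and scoring probabilistic predictions with the Brier score. The per-method feature sets and the probabilities below are hypothetical placeholders chosen for illustration; the study does not report which features each individual method selected.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Hypothetical selections from three feature selection methods.
lasso_picks = {"albumin", "prothrombin_time", "blood_loss", "age"}
rfe_picks = {"albumin", "prothrombin_time", "blood_loss", "bmi"}
boruta_picks = {"albumin", "prothrombin_time", "blood_loss", "age", "bmi"}

# Strict intersection: keep only features that every method agrees on.
robust = lasso_picks & rfe_picks & boruta_picks

# Hypothetical outcomes (1 = major complication) and predicted risks.
y_true = [1, 0, 0, 1]
y_prob = [0.9, 0.2, 0.4, 0.6]
score = brier_score(y_true, y_prob)

print(sorted(robust), score)
```

A lower Brier score is better, and a model that always predicts 0.5 scores 0.25, which gives a rough reference point for values such as the 0.128 reported above. The strict intersection trades recall for stability: a feature that only one or two selectors favour is dropped.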
- Research Article
15
- 10.1186/s40001-024-01988-0
- Jul 25, 2024
- European Journal of Medical Research
Background: Tuberculosis spondylitis (TS), commonly known as Pott's disease, is a severe type of skeletal tuberculosis that typically requires surgical treatment. However, this treatment option has led to an increase in healthcare costs due to prolonged hospital stays (PLOS). Therefore, identifying risk factors associated with extended PLOS is necessary. In this research, we intended to develop an interpretable machine learning model that could predict extended PLOS and provide valuable insights for treatment, and to implement a web-based application. Methods: We obtained patient data from the spine surgery department at our hospital. Extended postoperative length of stay (PLOS) refers to a hospitalization duration equal to or exceeding the 75th percentile following spine surgery. To identify relevant variables, we employed several approaches, such as the least absolute shrinkage and selection operator (LASSO), recursive feature elimination (RFE) based on support vector machine classification (SVC), correlation analysis, and permutation importance value. Several models were implemented, and some of them were ensembled using soft voting techniques. Models were constructed using grid search with nested cross-validation. The performance of each algorithm was assessed through various metrics, including the AUC value (area under the curve of receiver operating characteristics) and the Brier Score. Model interpretation involved utilizing methods such as Shapley additive explanations (SHAP), the Gini Impurity Index, permutation importance, and local interpretable model-agnostic explanations (LIME). 
Furthermore, to facilitate the practical application of the model, a web-based interface was developed and deployed. Results: The study included a cohort of 580 patients, and 11 features (CRP, transfusions, infusion volume, blood loss, X-ray bone bridge, X-ray osteophyte, CT-vertebral destruction, CT-paravertebral abscess, MRI-paravertebral abscess, MRI-epidural abscess, postoperative drainage) were selected. Most of the classifiers performed well, and the XGBoost model had a higher AUC value (0.86) and a lower Brier Score (0.126). The XGBoost model was chosen as the optimal model. The results obtained from the calibration and decision curve analysis (DCA) plots demonstrate that XGBoost achieved promising performance. After conducting tenfold cross-validation, the XGBoost model demonstrated a mean AUC of 0.85 ± 0.09. SHAP and LIME were used to display the variables' contributions to the predicted value. The stacked bar plots indicated that infusion volume was the primary contributor, as determined by Gini, permutation importance (PFI), and the LIME algorithm. Conclusions: Our methods not only effectively predicted extended PLOS but also identified risk factors that can be utilized for future treatments. The XGBoost model developed in this study is easily accessible through the deployed web application and can aid in clinical research.
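The outcome definition in the Methods (a stay equal to or exceeding the 75th percentile) can be turned into binary labels in a few lines. The sketch below uses the standard library's `statistics.quantiles`; the stay lengths are hypothetical, and the choice of the "inclusive" interpolation method is an assumption, since the abstract does not say which percentile convention was used.

```python
from statistics import quantiles

# Hypothetical postoperative stays in days (not data from the study).
stays = [3, 5, 7, 8, 10, 12, 14, 21, 30]

# quantiles(..., n=4) returns the three quartile cut points; index 2 is the 75th percentile.
threshold = quantiles(stays, n=4, method="inclusive")[2]

# Label a stay as extended when it is equal to or above the threshold.
extended = [1 if days >= threshold else 0 for days in stays]

print(threshold, extended)
```

Different percentile conventions can shift the threshold slightly on small samples, so the convention should be fixed before the labels are used to train or compare models.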
- Research Article
16
- 10.47852/bonviewmedin52024744
- Mar 20, 2025
- Medinformatics
Liver disease is any condition that negatively affects the liver's function or structure, resulting in impaired liver function and various health complications. Abnormal conditions are rapidly increasing day by day. In this study, we used a dataset of key liver disease-related blood sample biomarkers to utilize various Machine learning (ML) techniques to enhance the accuracy of liver disease prediction. Specifically, we integrated the artificial neural network (ANN) model with five ML models: Stacked Generalization (Stacking), Bootstrap Aggregating (Bagging), Adaptive Boosting (AdaBoost), Gradient-Boosted Decision Tree (GBDT), and Support Vector Machine (SVM)—resulting in five distinct hybrid models: Stacking with ANN (SANN), Bagging with ANN, AdaBoost with ANN (ABANN), GBDT with ANN (GANN), and SVM with ANN (SVMANN). We tested all these hybrid models with feature selection techniques, including linear discriminant analysis (LDA), principal component analysis (PCA), recursive feature elimination (RFE), and also without feature selection. Through extensive testing, we found that these five hybrid models performed best when combined with LDA rather than PCA, RFE, or no feature selection. This discovery led us to create a max voting ensemble (MVE) of these LDA-optimized hybrid models. Remarkably, our prediction accuracy increased from 79.15% to 98.38% using the MVE. Furthermore, we employ Explainable Artificial Intelligence techniques such as Local Interpretable Model-agnostic Explanations, Shapley Additive Explanations, and Individual Conditional Expectations to analyze and enhance trust in the predictions. We also implemented 10-fold cross-validation to ensure the robustness and reliability of our results. This research underscores the significance of advancements in neural network systems and highlights the potential for hybrid models to improve predictive accuracy in liver disease diagnosis. 
Our findings pave the way for a new generation of computational technologies endowed with intelligence, ultimately contributing to better health outcomes and a deeper understanding of liver disease dynamics.
Received: 6 November 2024 | Revised: 11 February 2025 | Accepted: 3 March 2025
Conflicts of Interest: The authors declare that they have no conflicts of interest to this work.
Data Availability Statement: The data used in this study will be accessible upon request to the corresponding author.
Author Contribution Statement: Safiul Haque Chowdhury: Conceptualization, Methodology, Software, Validation, Formal analysis, Investigation, Resources, Data curation, Writing – original draft, Writing – review & editing, Visualization, Project administration. Mohammad Mamun: Formal analysis, Investigation, Resources, Data curation, Writing – original draft, Writing – review & editing, Visualization, Project administration. Md. Tanvir Ahmed Shaikat: Visualization, Project administration. Mohammed Ibrahim Hussain: Writing – review & editing, Supervision. MD. Sadiq Iqbal: Writing – review & editing, Visualization, Supervision. Muhammad Minoar Hossain: Writing – review & editing, Supervision.
- Research Article
1
- 10.1038/s41598-025-23593-9
- Nov 18, 2025
- Scientific reports
Cervical cancer, predominantly caused by Human Papillomavirus (HPV) infection, remains a significant global health burden for women, contributing to elevated morbidity and mortality rates. Early and accurate prediction is critical in improving patient outcomes and optimizing healthcare resource allocation. While machine learning (ML) and deep learning (DL) methods, such as support vector machines, random forests, and convolutional neural networks, have demonstrated promise in disease prediction, they face limitations in model interpretability and computational efficiency and rely on large, labeled datasets. Additionally, conventional diagnostic methods like piezoresistive, piezoelectric, and optical lever techniques are often cost-prohibitive and complex, limiting widespread use. This study proposes a hybrid ML framework that integrates H2O AutoML with autoencoder-based feature extraction and Fisher Score-based feature selection. To enhance model transparency and clinical trust, Local Interpretable Model-Agnostic Explanations (LIME) and SHAP (SHapley Additive exPlanations) are employed. The workflow begins with exploratory data analysis (EDA) and dimensionality reduction using a stacked autoencoder, followed by selection of the top predictive features via Fisher Score. The refined feature set is used to train multiple models via H2O AutoML, with the best-performing deep learning model selected. On the training dataset, the selected model achieved 95.24% accuracy, an AUC of 98.10, and a log loss of 0.1747. Cross-validation confirms the model's robustness with consistent AUC and log loss values. At the optimal F1 threshold of 0.517, the confusion matrix indicates an error rate of 5.75% for actual negatives and 2.59% for actual positives, leading to an overall error rate of 4.14%. LIME and SHAP are used to interpret predictions at the instance level, providing actionable insights for clinicians. 
These results demonstrate the effectiveness of combining AutoML with explainable AI and advanced feature engineering to enhance the predictive power and interpretability of cervical cancer risk models, offering a scalable solution for clinical decision support.
- Conference Article
1
- 10.1145/3675888.3676060
- Aug 8, 2024
Millions of people in Uganda suffer from food insecurity as a result of the country's scarce resources, extreme poverty, and high rate of childhood malnutrition. Women, who play a critical role in home management, are particularly vulnerable to food insecurity brought on by climate change. The vision is to foster a thriving agricultural sector in Uganda grounded in sustainability, with a mission to empower farmers, including women, through innovative tools and knowledge to adapt and prosper in a changing climate. This study employs cutting-edge technologies such as geospatial data analysis, machine learning, and environmental science to analyze weather patterns, soil conditions, and crop data. By providing personalized insights and recommendations to farmers, the research enables informed decision-making, optimizes resource management, and mitigates climate-related risks in agriculture. The research utilizes a multimodal architecture incorporating 13 traditional base machine learning models including Linear Regression, Lasso, Ridge, Elastic Net, Random Forest, K-Nearest Neighbors, Decision Tree, Support Vector Machine, Gaussian Process Regression, Gradient Boosting Regressor, XGBoost, LightGBM, and MLPRegressor. These base models are integrated into an ensemble approach using Stacking and Voting Regressors to harness collective intelligence and enhance predictive accuracy in optimizing agricultural productivity and resilience. The findings reveal that among the base models, LightGBM demonstrates superior performance in terms of the mean squared error (MSE) and R-squared (R2) metrics, with an MSE of 0.024 and an R2 of 0.274, underscoring its effectiveness in predicting food security levels, while the MLPRegressor demonstrates the worst performance, with an MSE of 670.742 and an R2 of -20636.457. 
The stacking ensemble model emerged as the best overall performer with an MSE of 0.022 and an R2 of 0.312, surpassing individual models and highlighting the significance of ensemble and multimodal learning approaches. These insights contribute to advancing precision agriculture and fostering resilience in farming communities, with implications for sustainable development and improved food security in Uganda. Furthermore, this study incorporates eXplainable Artificial Intelligence (XAI) techniques to interpret and visualize model predictions, providing farmers with transparent insights into the factors and key features influencing agricultural outcomes. XAI methods such as SHAP (SHapley Additive exPlanations) and LIME (Local Interpretable Model-agnostic Explanations) are employed to enhance the transparency and trustworthiness of machine learning models, empowering farmers with actionable insights for sustainable farming practices.
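A large negative R2 such as the one reported for the MLPRegressor follows directly from the definition of the metric, which compares a model's squared error with that of always predicting the mean. The sketch below shows the calculation on hypothetical toy values (not data from the study): a constant mean predictor scores exactly zero, and a model whose predictions are far off scale scores well below zero.

```python
def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """R2 = 1 - SS_res / SS_tot, where SS_tot is the error of the mean predictor."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Hypothetical targets and two sets of predictions.
y_true = [1.0, 2.0, 3.0, 4.0]
mean_baseline = [2.5, 2.5, 2.5, 2.5]   # always predicts the mean
wild_model = [10.0, -8.0, 15.0, -5.0]  # predictions far off scale

r2_baseline = r_squared(y_true, mean_baseline)
r2_wild = r_squared(y_true, wild_model)
print(r2_baseline, r2_wild, mse(y_true, wild_model))
```

R2 has no lower bound, so a strongly negative value signals a model doing much worse than the mean predictor, often because of unscaled inputs or failed training, rather than an error in the metric itself.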
- Research Article
- 10.3390/atmos15070788
- Jun 29, 2024
- Atmosphere
Climate change is causing permafrost in the Qinghai–Tibet Plateau (QTP) to degrade, triggering thermokarst hazards and impacting the environment. Despite their ecological importance, the distribution and risks of thermokarst lakes are not well understood due to complex influencing factors. In this study, we introduced a new interpretable ensemble learning method designed to improve the global and local interpretation of susceptibility assessments for thermokarst lakes. Our primary aim was to offer scientific support for precisely evaluating areas prone to thermokarst lake formation. In the thermokarst lake susceptibility assessment, we identified ten conditioning factors related to the formation and distribution of thermokarst lakes. In the stacking model, the primary learning units were the random forest (RF), extremely randomized trees (EXTs), extreme gradient boosting (XGBoost), and categorical boosting (CatBoost) algorithms, while gradient boosted decision trees (GBDTs) were employed as the secondary learning unit. Based on the stacking model, we assessed thermokarst lake susceptibility and validated accuracy through six evaluation indices. We examined the interpretability of the stacking model using three interpretation methods: accumulated local effects (ALE), local interpretable model-agnostic explanations (LIME), and Shapley additive explanations (SHAP). The results showed that the ensemble learning stacking model demonstrated superior performance and the highest prediction accuracy. Approximately 91.20% of the total thermokarst hazard points fell within the high and very high susceptibility areas, which encompass 20.08% of the permafrost expanse in the QTP. The findings revealed that slope, elevation, the topographic wetness index (TWI), and precipitation were the primary factors influencing the assessment of thermokarst lake susceptibility. 
This comprehensive analysis extends to the broader impacts of thermokarst hazards, with the identified high and very high susceptibility zones affecting significant stretches of railway and highway infrastructure, substantial soil organic carbon reserves, and vast alpine grasslands. This interpretable ensemble learning model, which exhibits high accuracy, offers substantial practical significance for project route selection, construction, and operation in the QTP.
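The two-level stacking layout described above (several tree ensembles as primary learners, a gradient boosted model as the secondary learner) might be sketched with scikit-learn as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the data are synthetic, only RF and extremely randomized trees are used as primary learners (XGBoost and CatBoost are omitted to keep the example dependent on scikit-learn alone), and all hyperparameters are placeholders.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    ExtraTreesClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.model_selection import train_test_split

# Synthetic stand-in for ten conditioning factors (assumption: not real data).
X, y = make_classification(
    n_samples=400, n_features=10, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Primary learning units: their out-of-fold predictions feed the meta-learner.
base_learners = [
    ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
    ("ext", ExtraTreesClassifier(n_estimators=50, random_state=0)),
]

# Secondary learning unit: a gradient boosted decision tree model.
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=GradientBoostingClassifier(random_state=0),
    cv=5,
)
stack.fit(X_train, y_train)

# Probability of the positive class, usable as a susceptibility score.
susceptibility = stack.predict_proba(X_test)[:, 1]
accuracy = stack.score(X_test, y_test)
```

With `cv=5`, the meta-learner is trained on cross-validated predictions of the base learners rather than on their in-sample fits, which limits leakage from the first level into the second.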
- Research Article
- 10.2139/ssrn.3901789
- Jan 1, 2021
- SSRN Electronic Journal
A Machine Learning Model Based on Electronic Health Records for Predicting Recurrence after Microwave Ablation of Hepatocellular Carcinoma
- Research Article
- 10.2196/82587
- Apr 16, 2026
- JMIR formative research
Prostate cancer progression exhibits significant variability influenced by biological and racial factors. DNA methylation profiling has shown potential in early cancer detection, but its integration with machine learning across racially diverse populations remains limited. This study aimed to develop a prostate cancer stage classifier for the majority White cohort using DNA methylation data and a multilayer perceptron (MLP) model in order to classify prostate cancer stages into early (stages I-II) and late (stages III-IV) stages and assess its performance when applied to other racial groups to highlight the need for race-specific models. Methylation and phenotype data from the TCGA-PRAD (The Cancer Genome Atlas Prostate Adenocarcinoma) dataset were processed using differentially methylated position (DMP) analysis to identify CpG sites correlated with cancer stages. These features were further refined through recursive feature elimination (RFE) and used to train MLP models. Shapley Additive Explanations (SHAP) and Local Interpretable Model-Agnostic Explanations (LIME) were used to interpret the model and identify key DNA methylation features contributing to model predictions. The best-performing model achieved 95% accuracy and up to 99% area under the curve on the majority race (White) training data using 90 selected features. However, performance declined sharply in racial minority groups, revealing the effects of sample imbalance and race-specific methylation patterns. Feature importance examination indicated strong patterns within certain CpG sites driving model predictions. We propose a race-aware MLP model for prostate cancer stage classification using DNA methylation data, which has been optimized through DMP and RFE-based feature selection. SHAP and LIME confirmed the predictive relevance of selected CpG sites, supporting model transparency. 
The results highlight high performance within the White cohort but reveal poor generalization to racial minority groups, emphasizing the importance of race-specific modeling strategies.
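The feature-selection-then-MLP workflow described above could look roughly like the sketch below. It is an assumption-laden illustration rather than the study's pipeline: the data are synthetic instead of methylation values, the DMP filtering step is not reproduced, logistic regression is swapped in as the ranking estimator for RFE (an MLP exposes no coefficients or feature importances for RFE to rank), and the feature counts and layer sizes are placeholders.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for candidate features (assumption: not real CpG data).
X, y = make_classification(
    n_samples=400, n_features=50, n_informative=8, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

pipeline = make_pipeline(
    StandardScaler(),
    # RFE drops the 5 lowest-ranked features per round until 10 remain.
    RFE(LogisticRegression(max_iter=1000), n_features_to_select=10, step=5),
    # Small MLP trained only on the features that survive RFE.
    MLPClassifier(hidden_layer_sizes=(16,), max_iter=500, random_state=0),
)
pipeline.fit(X_train, y_train)

selected_mask = pipeline.named_steps["rfe"].support_  # True for kept features
test_accuracy = pipeline.score(X_test, y_test)
```

Evaluating the fitted pipeline separately on each subgroup, instead of on one pooled test set, is one way to surface the kind of generalization gap the abstract reports.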