Interpretable Machine Learning for Identifying Key Variables Influencing Gold Recovery and Grade
Gold flotation performance is influenced by multiple interacting variables, yet most predictive studies in this area emphasize accuracy while neglecting interpretability, limiting their practical value for process engineers. This study applies explainable machine learning techniques to identify and interpret the key variables controlling cumulative gold recovery and grade, using a small, experimentally derived dataset (n = 11) from Ballarat gold ore flotation. A Gradient Boosting Regressor, combined with SHAP (SHapley Additive exPlanations), permutation importance, and feature importance analyses, was employed to uncover both linear and non-linear relationships. Power, head grade, and processing time consistently emerged as dominant predictors, while interaction effects (e.g., head grade × collector, size × head grade) provided additional explanatory insights. The findings reveal actionable process implications, including trade-offs between energy input and flotation efficiency, and highlight operational conditions for improved recovery and grade. This study demonstrates that interpretable machine learning can bridge the gap between statistical modeling and process optimization, delivering transparent, domain-specific insights even in data-constrained environments.
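To make the attribution idea behind SHAP concrete, the following is a minimal sketch of exact Shapley values computed by enumerating feature coalitions. It is not the study's model or data: the response function `toy_recovery`, its coefficients, the sample `x`, and the all-zero `baseline` are hypothetical, chosen only so that one interaction term (head grade × collector) is visible in the result. Practical SHAP implementations use faster, model-specific algorithms rather than brute-force enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample by enumerating all coalitions.

    Features outside a coalition are set to their baseline value.
    Cost grows as 2^n, so this is only practical for a handful of features.
    """
    n = len(x)
    features = range(n)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in features]
        return predict(z)

    phi = [0.0] * n
    for i in features:
        others = [j for j in features if j != i]
        for k in range(len(others) + 1):
            # Shapley weight for coalitions of size k that exclude feature i
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_recovery(z):
    """Hypothetical response with one interaction term (illustrative only)."""
    power, head_grade, collector = z
    return 50.0 + 2.0 * power + 3.0 * head_grade + 1.5 * head_grade * collector


x = [4.0, 2.0, 1.0]          # hypothetical operating point
baseline = [0.0, 0.0, 0.0]   # hypothetical reference point
phi = shapley_values(toy_recovery, x, baseline)

for name, contribution in zip(["power", "head_grade", "collector"], phi):
    print(f"{name}: {contribution:.2f}")
```

In this toy case the interaction term is split equally between head grade and collector, and the three contributions add up to the difference between the prediction at `x` and the prediction at `baseline`, which is the additivity property that SHAP-style explanations rely on.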
- Research Article
13
- 10.1108/ijhma-11-2022-0172
- Apr 13, 2023
- International Journal of Housing Markets and Analysis
Purpose: The purpose is twofold. First, this study aims to establish that black box tree-based machine learning (ML) models have better predictive performance than a standard linear regression (LR) hedonic model for rent prediction. Second, it shows the added value of analyzing tree-based ML models with interpretable machine learning (IML) techniques.
Design/methodology/approach: Data on Belgian residential rental properties were collected. Tree-based ML models, random forest regression and eXtreme gradient boosting regression were applied to derive rent prediction models to compare predictive performance with a LR model. Interpretations of the tree-based models regarding important factors in predicting rent were made using SHapley Additive exPlanations (SHAP) feature importance (FI) plots and SHAP summary plots.
Findings: Results indicate that tree-based models perform better than a LR model for Belgian residential rent prediction. The SHAP FI plots agree that asking price, cadastral income, surface livable, number of bedrooms, number of bathrooms and variables measuring the proximity to points of interest are dominant predictors. The direction of relationships between rent and its factors is determined with SHAP summary plots. In addition to linear relationships, it emerges that nonlinear relationships exist.
Originality/value: Rent prediction using ML is relatively less studied than house price prediction. In addition, studying prediction models using IML techniques is relatively new in real estate economics. Moreover, to the best of the authors’ knowledge, this study is the first to derive insights of driving determinants of predicted rents from SHAP FI and SHAP summary plots.
- Research Article
3
- 10.1016/j.ecoenv.2024.117570
- Jan 1, 2025
- Ecotoxicology and environmental safety
Predicting the risk of cardiovascular disease in adults exposed to heavy metals: Interpretable machine learning.
- Research Article
44
- 10.1016/j.aap.2022.106617
- Feb 21, 2022
- Accident Analysis & Prevention
On the interpretability of machine learning methods in crash frequency modeling and crash modification factor development
- Research Article
6
- 10.1016/j.jece.2023.110847
- Aug 23, 2023
- Journal of Environmental Chemical Engineering
Assessment of organic micropollutants rejection by forward osmosis system using interpretable machine learning-assisted approach: A new perspective on optimization of multifactorial forward osmosis process
- Research Article
8
- 10.1093/mnras/stad2814
- Sep 14, 2023
- Monthly Notices of the Royal Astronomical Society
Astrochemical modelling of the interstellar medium typically makes use of complex computational codes with parameters whose values can be varied. It is not always clear what the exact nature of the relationship is between these input parameters and the output molecular abundances. In this work, a feature importance analysis is conducted using SHapley Additive exPlanations (SHAP), an interpretable machine learning technique, to identify the most important physical parameters as well as their relationship with each output. The outputs are the abundances of species and ratios of abundances. In order to reduce the time taken for this process, a neural network emulator is trained to model each species’ output abundance and this emulator is used to perform the interpretable machine learning. SHAP is then used to further explore the relationship between the physical features and the abundances for the various species and ratios we considered. H2O and CO’s gas phase abundances are found to strongly depend on the metallicity. NH3 has a strong temperature dependence, with there being two temperature regimes (<100 K and >100 K). By analysing the chemical network, we relate this to the chemical reactions in our network and find the increased temperature results in increased efficiency of destruction pathways. We investigate the HCN/HNC ratio and show that it can be used as a cosmic thermometer, agreeing with the literature. This ratio is also found to be correlated with the metallicity. The HCN/CS ratio serves as a density tracer, but also has three separate temperature-dependence regimes, which are linked to the chemistry of the two molecules.
- Research Article
- 10.2196/75121
- Oct 1, 2025
- JMIR Formative Research
Background: Surgical site infections (SSIs) are one of the most common health care–associated infections, accounting for nearly 20% of all health care–associated infections in hospitalized patients. SSIs are associated with longer hospital stays, increased readmission rates, higher health care costs, and a mortality rate twice that of patients without infections.
Objective: This study aimed to develop and evaluate machine learning (ML) models for augmenting SSI surveillance after colon surgery with the goal of improving the efficiency of infection control practices by prioritizing patients at high risk.
Methods: We conducted a retrospective study using data from 1508 patients undergoing colon surgery treated between 2018 and 2023 at a single academic medical center. Of these 1508 patients, 66 (4.4%) developed SSIs as adjudicated by infection control practitioners following Centers for Disease Control and Prevention National Healthcare Safety Network criteria. Data included 78 structured variables (eg, demographics, comorbidities, vital signs, laboratory tests, medications, and operative details) and 2 features derived from unstructured clinical notes using natural language processing. ML models, namely logistic regression, random forest, and Extreme Gradient Boosting (XGBoost), were trained using stratified 80/20 train-test splits. Class imbalance was addressed using cost-sensitive learning and the synthetic minority oversampling technique. Model performance was evaluated using precision, recall, F1-score, area under the receiver operating characteristic curve, and Brier scores for calibration.
Results: Of the 1508 patients, those who developed SSIs had longer hospital stays (mean 8.1, SD 6.8 days vs mean 6.3, SD 10.5 days; P<.001), higher rates of an American Society of Anesthesiologists score of 3 (52/66, 79% vs 653/1442, 45.3%; P<.001), and elevated white blood cell counts (51/66, 77% vs 734/1442, 50.9%; P<.001). XGBoost achieved the best overall performance with an area under the receiver operating characteristic curve of 0.788, precision of 50%, recall of 38%, and Brier score of 0.035. Random forest yielded perfect precision (100%) but lower recall (23%), with a Brier score of 0.034. Logistic regression showed the highest recall (46%) but the lowest precision (10%), with a Brier score of 0.139. Feature importance analysis using Shapley additive explanations (SHAP) values revealed that the top predictors included recovery duration (SHAP=1.18), SSI keyword frequency (SHAP=1.12), patient age (SHAP=1.12), and American Society of Anesthesiologists score (SHAP=0.94), with natural language processing–derived features ranking among the top 10.
Conclusions: ML models can augment traditional SSI surveillance by improving early identification of patients at high risk. The XGBoost model offered the best trade-off between discrimination and calibration, suggesting its utility in clinical workflows. Incorporating structured and unstructured electronic health record data enhances model accuracy and clinical relevance, supporting scalable and efficient infection control practices.
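One way to read the Brier scores in this abstract is against a no-skill reference that predicts the cohort prevalence for every patient. The sketch below works that number out from the counts stated in the abstract (66 SSIs among 1508 patients). It assumes the held-out split has the same prevalence as the full cohort, which a stratified split approximates; the helper `brier_score` is written here for illustration and is not the authors' code.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Counts taken from the abstract.
n_patients, n_ssi = 1508, 66
prevalence = n_ssi / n_patients

# Outcome vector with the stated number of positives.
y_true = [1] * n_ssi + [0] * (n_patients - n_ssi)

# No-skill reference: predict the prevalence for every patient.
# Assumption: the test split shares the full-cohort prevalence.
no_skill = brier_score(y_true, [prevalence] * n_patients)

print(f"prevalence: {prevalence:.4f}")
print(f"no-skill Brier score: {no_skill:.4f}")
```

For a constant prediction equal to the prevalence p, the Brier score reduces to p(1 − p), roughly 0.042 here. With outcomes this rare, a low Brier score on its own says little, so the reported values are best read relative to this kind of reference and alongside precision and recall.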
- Research Article
76
- 10.1016/j.ejor.2023.06.036
- Jun 23, 2023
- European Journal of Operational Research
The class imbalance problem is common in the credit scoring domain, as the number of defaulters is usually much less than the number of non-defaulters. To date, research on investigating the class imbalance problem has mainly focused on indicating and reducing the adverse effect of the class imbalance on the predictive accuracy of machine learning techniques, while the impact of that on machine learning interpretability has never been studied in the literature. This paper fills this gap by analysing how the stability of Local Interpretable Model-agnostic Explanations (LIME) and SHapley Additive exPlanations (SHAP), two popular interpretation methods, are affected by class imbalance. Our experiments use 2016–2020 UK residential mortgage data collected from European Datawarehouse. We evaluate the stability of LIME and SHAP on datasets of progressively increased class imbalance. The results show that interpretations generated from LIME and SHAP are less stable as the class imbalance increases, which indicates that the class imbalance does have an adverse effect on machine learning interpretability. To check the robustness of our outcomes, we also analyse two open-source credit scoring datasets and we obtain similar results.
- Research Article
- 10.1016/j.acra.2025.04.068
- May 1, 2025
- Academic radiology
Right Ventricular Strain as a Key Feature in Interpretable Machine Learning for Identification of Takotsubo Syndrome: A Multicenter CMR-based Study.
- Research Article
9
- 10.3389/fmed.2024.1399848
- May 17, 2024
- Frontiers in medicine
Delirium is the most common neuropsychological complication among older adults admitted to the intensive care unit (ICU) and is often associated with a poor prognosis. This study aimed to construct and validate an interpretable machine learning (ML) model for early delirium prediction in older ICU patients. This was a retrospective observational cohort study and patient data were extracted from the Medical Information Mart for Intensive Care-IV database. Feature variables associated with delirium, including predisposing factors, disease-related factors, and iatrogenic and environmental factors, were selected using least absolute shrinkage and selection operator regression, and prediction models were built using logistic regression, decision trees, support vector machines, extreme gradient boosting (XGBoost), k-nearest neighbors and naive Bayes methods. Multiple metrics were used for evaluation of performance of the models, including the area under the receiver operating characteristic curve (AUC), accuracy, sensitivity, specificity, recall, F1 score, calibration plot, and decision curve analysis. SHapley Additive exPlanations (SHAP) were used to improve the interpretability of the final model. Nine thousand seven hundred forty-eight adults aged 65 years or older were included for analysis. Twenty-six features were selected to construct ML prediction models. Among the models compared, the XGBoost model demonstrated the best performance including the highest AUC (0.836), accuracy (0.765), sensitivity (0.713), recall (0.713), and F1 score (0.725) in the training set. It also exhibited excellent discrimination with AUC of 0.810, good calibration, and had the highest net benefit in the validation cohort. The SHAP summary analysis showed that Glasgow Coma Scale, mechanical ventilation, and sedation were the top three risk features for outcome prediction. The SHAP dependency plot and SHAP force analysis interpreted the model at both the factor level and individual level, respectively. ML is a reliable tool for predicting the risk of critical delirium in elderly patients. By combining XGBoost and SHAP, it can provide clear explanations for personalized risk prediction and more intuitive understanding of the effect of key features in the model. The establishment of such a model would facilitate the early risk assessment and prompt intervention for delirium.
- Research Article
- 10.1186/s13018-025-05901-1
- May 24, 2025
- Journal of Orthopaedic Surgery and Research
Objectives: To develop and validate an interpretable machine learning model based on clinicoradiological features and radiomic features based on magnetic resonance imaging (MRI) to predict the failure of conservative treatment in lateral epicondylitis (LE).
Methods: This retrospective study included 420 patients with LE from three hospitals, divided into a training cohort (n = 245), an internal validation cohort (n = 115), and an external validation cohort (n = 60). Patients were categorized into conservative treatment failure (n = 133) and conservative treatment success (n = 287) groups based on the outcome of conservative treatment. We developed two predictive models: one utilizing clinicoradiological features, and another integrating clinicoradiological and radiomic features. Seven machine learning algorithms were evaluated to determine the optimal model for predicting the failure of conservative treatment. Model performance was assessed using ROC, and model interpretability was examined using SHapley Additive exPlanations (SHAP).
Results: The LightGBM algorithm was selected as the optimal model because of its superior performance. The combined model demonstrated enhanced predictive accuracy with an area under the ROC curve (AUC) of 0.96 (95% CI: 0.91, 0.99) in the external validation cohort. SHAP analysis identified the radiological feature “CET coronal tear size” and the radiomic feature “AX_log-sigma-1-0-mm-3D_glszm_SmallAreaEmphasis” as key predictors of conservative treatment failure.
Conclusions: We developed and validated an interpretable LightGBM machine learning model that integrates clinicoradiological and radiomic features to predict the failure of conservative treatment in LE. The model demonstrates high predictive accuracy and offers valuable insights into key prognostic factors.
- Research Article
61
- 10.1167/tvst.9.2.8
- Feb 12, 2020
- Translational Vision Science & Technology
Purpose: Recently, laser refractive surgery options, including laser epithelial keratomileusis, laser in situ keratomileusis, and small incision lenticule extraction, successfully improved patients’ quality of life. Evidence-based recommendation for an optimal surgery technique is valuable in increasing patient satisfaction. We developed an interpretable multiclass machine learning model that selects the laser surgery option at the expert level.
Methods: A multiclass XGBoost model was constructed to classify patients into four categories including laser epithelial keratomileusis, laser in situ keratomileusis, small incision lenticule extraction, and contraindication groups. The analysis included 18,480 subjects who intended to undergo refractive surgery at the B&VIIT Eye center. Training (n = 10,561) and internal validation (n = 2640) were performed using subjects who visited between 2016 and 2017. The model was trained based on clinical decisions of highly experienced experts and ophthalmic measurements. External validation (n = 5279) was conducted using subjects who visited in 2018. The SHapley Additive exPlanations technique was adopted to explain the output of the XGBoost model.
Results: The multiclass XGBoost model exhibited an accuracy of 81.0% and 78.9% when tested on the internal and external validation datasets, respectively. The SHapley Additive exPlanations explanations for the results were consistent with prior knowledge from ophthalmologists. The explanations from the one-versus-one and one-versus-rest XGBoost classifiers helped users easily understand the multicategorical classification problem.
Conclusions: This study suggests an expert-level multiclass machine learning model for selecting the refractive surgery for patients. It also provided a clinical understanding in a multiclass problem based on an explainable artificial intelligence technique.
Translational Relevance: Explainable machine learning exhibits a promising future for increasing the practical use of artificial intelligence in ophthalmic clinics.
- Research Article
- 10.1186/s40001-024-02005-0
- Aug 31, 2024
- European Journal of Medical Research
Introduction: This study aims to construct a mortality prediction model for patients with non-variceal upper gastrointestinal bleeding (NVUGIB) in the intensive care unit (ICU), employing advanced machine learning algorithms. The goal is to identify high-risk populations early, contributing to a deeper understanding of patients with NVUGIB in the ICU.
Methods: We extracted NVUGIB data from the Medical Information Mart for Intensive Care IV (MIMIC-IV, v.2.2) database spanning from 2008 to 2019. Feature selection was conducted through LASSO regression, followed by training models using 11 machine learning methods. The best model was chosen based on the area under the curve (AUC). Subsequently, Shapley additive explanations (SHAP) was employed to elucidate how each factor influenced the model. Finally, a case was randomly selected, and the model was utilized to predict its mortality, demonstrating the practical application of the developed model.
Results: In total, 2716 patients with NVUGIB were deemed eligible for participation. Following selection, 30 out of a total of 64 clinical parameters collected on day 1 after ICU admission remained associated with prognosis and were utilized for developing machine learning models. Among the 11 constructed models, the Gradient Boosting Decision Tree (GBDT) model demonstrated the best performance, achieving an AUC of 0.853 and an accuracy of 0.839 in the validation cohort. Feature importance analysis highlighted that shock, Glasgow Coma Scale (GCS), renal disease, age, albumin, and alanine aminotransferase (ALP) were the top six features of the GBDT model with the most significant impact. Furthermore, SHAP force analysis illustrated how the constructed model visualized the individualized prediction of death.
Conclusions: Patient data from the MIMIC database were leveraged to develop a robust prognostic model for patients with NVUGIB in the ICU. The analysis using SHAP also assisted clinicians in gaining a deeper understanding of the disease.
- Research Article
1
- 10.3389/fcvm.2025.1444323
- Jan 24, 2025
- Frontiers in cardiovascular medicine
Early prediction of heart failure (HF) after acute myocardial infarction (AMI) is essential for personalized treatment. We aimed to use interpretable machine learning (ML) methods to develop a risk prediction model for HF in AMI patients. We retrospectively included patients initially with AMI who received percutaneous coronary intervention (PCI) in our hospital from November 2016 to February 2020. The primary endpoint was the occurrence of HF within 3 years after operation. To develop a predictive model for HF risk in AMI patients, least absolute shrinkage and selection operator (LASSO) regression was used for feature selection, and four ML algorithms including Random Forest (RF), Extreme Gradient Boost (XGBoost), Support Vector Machine (SVM), and Logistic Regression (LR) were employed to develop the model on the training set. The performance evaluation of the prediction model was carried out on the training set and the testing set, utilizing metrics including AUC (Area under the receiver operating characteristic curve), calibration plot, and decision curve analysis (DCA). In addition, we used the Shapley Additive Explanations (SHAP) value to determine the importance of the selected features and interpret the optimal model. A total of 1220 AMI patients were included and 244 (20%) patients developed HF during follow-up. Among the four evaluated ML models, the XGBoost model exhibited exceptional accuracy, with an AUC value of 0.922. The SHAP method showed that left ventricular ejection fraction (LVEF), left ventricular end-systolic diameter (LVDs) and lactate dehydrogenase (LDH) were identified as the three most important characteristics to predict HF risk in AMI patients. Individual risk assessment was performed using SHAP plots and waterfall plot analysis. Our research demonstrates the potential of ML methods in the early prediction of HF risk in AMI patients. Furthermore, it enhances the interpretability of the XGBoost model through SHAP analysis to guide clinical decision-making.
- Research Article
- 10.2196/71539
- Sep 16, 2025
- JMIR Medical Informatics
Background: Gestational diabetes mellitus (GDM) affects over 5% of pregnancies worldwide, elevating risks of type 2 diabetes post partum and complications such as fetal death, miscarriage, and congenital abnormalities. Effective GDM management is essential to balance glycemic control and pregnancy outcomes.
Objective: We aim to develop interpretable machine learning models using GDM datasets for predicting adverse pregnancy outcomes and identifying key factors through the Shapley additive explanations (SHAP) algorithm, thus supporting improved maternal and infant health.
Methods: Data preprocessing and feature selection were performed, with adaptive synthetic sampling used to address class imbalance. Classification models, including logistic regression, random forest, support vector machine, and extreme gradient boosting, were built and enhanced through the stacking method. Model interpretability was assessed with SHAP to quantify feature contributions.
Results: Among 1670 patients, 200 experienced adverse outcomes. The stacking model outperformed individual models, achieving an accuracy of 85.6%, a sensitivity of 57.8%, a specificity of 95.9%, and an area under the receiver operating characteristic curve of 0.82 on the test set. External validation on 159 patients showed a decline in performance (accuracy 83.6%, area under the receiver operating characteristic curve 0.67). SHAP analysis identified gestational age, glucose control, and diagnosis time among the most influential predictors, providing clinically meaningful insights into risk factors. Additionally, detailed SHAP-based visualization revealed the distribution of different feature values and their nonlinear impact on outcomes, as well as interaction effects between features. These interpretable analyses enabled a deeper understanding of individual and combined feature contributions, thereby enhancing clinical assessment capabilities.
Conclusions: This study underscores the potential of machine learning in predicting adverse outcomes in GDM, with interpretable features offering valuable clinical insights to enhance pregnancy management and maternal-infant health.
- Research Article
- 10.3390/app15084226
- Apr 11, 2025
- Applied Sciences
This study sought to establish machine learning models for forecasting in-hospital mortality in non-ST-segment elevation myocardial infarction (NSTEMI) patients, and focused on model interpretability using Shapley additive explanations (SHAP). Data were gathered from the Medical Information Mart for Intensive Care—IV database. The synthetic minority over-sampling technique and Edited Nearest Neighbors were used to address class imbalance. Four machine learning algorithms were employed, including Adaptive Boosting (AdaBoost), Random Forest (RF), Gradient Boosting Decision Trees (GBDT), and eXtreme Gradient Boosting (XGBoost). SHAP was utilized to improve transparency and credibility. The all-features RF model demonstrated optimal performance, with an accuracy of 0.8513, precision of 0.9016, and AUC of 0.8903. The SHAP summary plot for the RF model revealed that Acute Physiology Score III, lactate dehydrogenase, and lactate were the three most crucial characteristics, with higher values indicating a greater risk. The study demonstrates the applicability of machine learning, particularly RF, in predicting in-hospital mortality for NSTEMI patients, with the use of SHAP enhancing model interpretability and providing clinicians with clearer insights into feature contributions.