
Articles published on partial-dependence-plots

945 Search results
  • Research Article
  • 10.1016/j.jenvman.2025.126252
Machine learning approaches for predicting antibiotic resistance genes abundance changes during biological nitrogen removal process.
  • Aug 1, 2025
  • Journal of environmental management
  • Tianyi Lu + 4 more

  • Research Article
  • 10.1016/j.jenvman.2025.126063
The grey water footprint of the Guangdong-Hong Kong-Macao Greater Bay Area, China: Spatial patterns, driving mechanism and implications.
  • Aug 1, 2025
  • Journal of environmental management
  • Binfen Liu + 3 more

  • Research Article
  • 10.1123/ijspp.2024-0486
Race-Performance Parameters Differentiating World-Best From National-Level Swimmers: A Race Video Analysis and Machine-Learning Approach.
  • Aug 1, 2025
  • International journal of sports physiology and performance
  • Giovanni L Postiglione + 6 more

Elite swimming performance is determined by a complex interplay of anthropometric, physiological, biomechanical, and technical factors. Previous research highlights how the 100-m freestyle demands explosive power, technical proficiency, and tactical acumen, yet factors that distinguish world-class swimmers from their closely performing (inter)national-level counterparts remain elusive. This study aimed to identify race-performance factors differentiating world-class swimmers in the 100-m freestyle. World-best to national-level (N = 204) male swimmers competing at long-course events between 2019 and 2024 were analyzed using high-definition video and race-analysis software. Key performance metrics including stroke rate and length, turn efficiency, underwater phase duration, and velocity at 5-m intervals were extracted. Using a machine-learning random forest algorithm, the most salient factors distinguishing between world-class (0%-2.5% off world record), international-level (2.5%-5% off), and national-level (5%-10% off) performance categories were identified. Analyses revealed a model classification accuracy of 89.5% with swim velocities at 65- to 70- and 70- to 75-m race segments most strongly associated with performance-level differentiation. These 2 race segments scored twice as high as all the other top 10 features. Shapley additive explanations (SHAP) analysis confirmed the importance of midrace velocities, while partial dependence plots identified the necessary velocity range values likely associated with national- to world-class performance levels. The combination of race analysis and machine learning creates the opportunity for targeted intervention for coaches and sport scientists working with high-performing 100-m male swimmers.

  • Research Article
  • 10.3390/su17156983
What Determines Carbon Emissions of Multimodal Travel? Insights from Interpretable Machine Learning on Mobility Trajectory Data
  • Jul 31, 2025
  • Sustainability
  • Guo Wang + 3 more

Understanding the carbon emissions of multimodal travel (comprising walking, metro, bus, cycling, and ride-hailing) is essential for promoting sustainable urban mobility. However, most existing studies focus on single-mode travel, while underlying spatiotemporal and behavioral determinants remain insufficiently explored due to the lack of fine-grained data and interpretable analytical frameworks. This study proposes a novel integration of high-frequency, real-world mobility trajectory data with interpretable machine learning to systematically identify the key drivers of carbon emissions at the individual trip level. Firstly, multimodal travel chains are reconstructed using continuous GPS trajectory data collected in Beijing. Secondly, a model based on Calculate Emissions from Road Transport (COPERT) is developed to quantify trip-level CO2 emissions. Thirdly, four interpretable machine learning models based on gradient boosting (XGBoost, GBDT, LightGBM, and CatBoost) are trained using transportation and built environment features to model the relationship between CO2 emissions and a set of explanatory variables. Finally, Shapley Additive exPlanations (SHAP) and partial dependence plots (PDPs) are used to interpret the model outputs, revealing key determinants and their non-linear interaction effects. The results show that transportation-related features account for 75.1% of the explained variance in emissions, with bus usage being the most influential single factor (contributing 22.6%). Built environment features explain the remaining 24.9%. The PDP analysis reveals that substantial emission reductions occur only when the shares of bus, metro, and cycling surpass threshold levels of approximately 40%, 40%, and 30%, respectively. Additionally, travel carbon emissions are minimized when trip origins and destinations are located within a 10 to 11 km radius of the central business district (CBD). This study advances the field by establishing a scalable, interpretable, and behaviorally grounded framework to assess carbon emissions from multimodal travel, providing actionable insights for low-carbon transport planning and policy design.

  • Research Article
  • 10.1017/psy.2025.10032
Explaining Person-by-Item Responses using Person- and Item-Level Predictors via Random Forests and Interpretable Machine Learning in Explanatory Item Response Models.
  • Jul 31, 2025
  • Psychometrika
  • Sun-Joo Cho + 3 more

This study incorporates a random forest (RF) approach to probe complex interactions and nonlinearity among predictors into an item response model with the goal of using a hybrid approach to outperform either an RF or explanatory item response model (EIRM) only in explaining item responses. In the specified model, called EIRM-RF, predicted values using RF are added as a predictor in EIRM to model the nonlinear and interaction effects of person- and item-level predictors in person-by-item response data, while accounting for random effects over persons and items. The results of the EIRM-RF are probed with interpretable machine learning (ML) methods, including feature importance measures, partial dependence plots, accumulated local effect plots, and the H-statistic. The EIRM-RF and the interpretable methods are illustrated using an empirical data set to explain differences in reading comprehension in digital versus paper mediums, and the results of EIRM-RF are compared with those of EIRM and RF to show empirical differences in modeling the effects of predictors and random effects among EIRM, RF, and EIRM-RF. In addition, simulation studies are conducted to compare model accuracy among the three models and to evaluate the performance of interpretable ML methods.

  • Research Article
  • 10.1037/met0000772
An explainable artificial intelligence handbook for psychologists: Methods, opportunities, and challenges.
  • Jul 31, 2025
  • Psychological methods
  • Rosa Lavelle-Hill + 3 more

With more researchers in psychology using machine learning to model large data sets, many are also looking to eXplainable artificial intelligence (XAI) methods to understand how their model works and to gain insights into the most important predictors. However, the methodological approach for establishing predictor importance in a machine learning model is not as straightforward or as well-established as with traditional statistical models. Not only are there a large number of potential XAI methods to choose from, but there are also a number of unresolved challenges when using XAI to understand psychological data. This article aims to provide an introduction to the field of XAI for psychologists. We first introduce explainability from an applied machine learning perspective and contrast it to that in psychology. Then we provide an overview of commonly used XAI approaches, namely permutation importance, impurity-based feature importance, individual conditional expectation graphs, partial dependence plots, accumulated local effect graphs, Local Interpretable Model-agnostic Explanations (LIME), SHapley Additive exPlanations (SHAP), and Deep Learning Important FeaTures (DeepLIFT). Finally, we demonstrate the impact of multicollinearity on different XAI methods using a simulation analysis and discuss the implementation challenges and future directions in psychological research.

  • Research Article
  • 10.54254/2753-7048/2025.nd25528
Attitudes Toward Immigration among Secondary School Students in Various Countries: An International Perspective
  • Jul 30, 2025
  • Lecture Notes in Education Psychology and Public Media
  • Yiwen Cao

This study investigates secondary school students' attitudes toward immigration across multiple countries, utilizing data from the Programme for International Student Assessment (PISA 2018). A cross-national comparative approach combined with advanced dimensionality reduction identifies the strongest factors shaping openness toward immigrants. Results show that international attention encourages welcoming views up to a critical threshold, after which further exposure predicts rising wariness. Partial dependence plots and sensitivity tests confirm the nonlinear nature of this relationship and the stability of the broader pattern. Country differences are further conditioned by socioeconomic context, classroom ethnic diversity, and students' global competence. Respect for cultural difference, balanced curiosity about world affairs, and reflective self-judgement consistently support positive attitudes, whereas frequent exposure to conflict-framed news undermines them. The findings signal an urgent need for inclusive curricula, critical media literacy, stronger digital information skills, and authentic, sustained intercultural encounters to nurture empathetic, well-informed citizens in an era of intensifying global mobility and to guide evidence-based, equitable education policy.

  • Research Article
  • 10.3390/app15158449
Development and Clinical Interpretation of an Explainable AI Model for Predicting Patient Pathways in the Emergency Department: A Retrospective Study
  • Jul 30, 2025
  • Applied Sciences
  • Émilien Arnaud + 6 more

Background: Overcrowded emergency departments (EDs) create significant challenges for patient management and hospital efficiency. In response, Amiens Picardy University Hospital (APUH) developed the “Prediction of the Patient Pathway in the Emergency Department” (3P-U) model to enhance patient flow management. Objectives: To develop and clinically validate an explainable artificial intelligence (XAI) model for hospital admission predictions, using structured triage data, and demonstrate its real-world applicability in the ED setting. Methods: Our retrospective, single-center study involved 351,019 patients consulting in APUH’s EDs between 2015 and 2018. Various models (including a cross-validation artificial neural network (ANN), a k-nearest neighbors (KNN) model, a logistic regression (LR) model, and a random forest (RF) model) were trained and assessed for performance with regard to the area under the receiver operating characteristic curve (AUROC). The best model was validated internally with a test set, and the F1 score was used to determine the best threshold for recall, precision, and accuracy. XAI techniques, such as Shapley additive explanations (SHAP) and partial dependence plots (PDP) were employed, and the clinical explanations were evaluated by emergency physicians. Results: The ANN gave the best performance during the training stage, with an AUROC of 83.1% (SD: 0.2%) for the test set; it surpassed the RF (AUROC: 71.6%, SD: 0.1%), KNN (AUROC: 67.2%, SD: 0.2%), and LR (AUROC: 71.5%, SD: 0.2%) models. In an internal validation, the ANN’s AUROC was 83.2%. The best F1 score (0.67) determined that 0.35 was the optimal threshold; the corresponding recall, precision, and accuracy were 75.7%, 59.7%, and 75.3%, respectively. 
The SHAP and PDP XAI techniques (as assessed by emergency physicians) highlighted patient age, heart rate, and presentation with multiple injuries as the features that most specifically influenced the admission from the ED to a hospital ward. These insights are being used in bed allocation and patient prioritization, directly improving ED operations. Conclusions: The 3P-U model demonstrates practical utility by reducing ED crowding and enhancing decision-making processes at APUH. Its transparency and physician validation foster trust, facilitating its adoption in clinical practice and offering a replicable framework for other hospitals to optimize patient flow.

  • Research Article
  • 10.1038/s41598-025-10990-3
Hydraulic Performance Modeling of Inclined Double Cutoff Walls Beneath Hydraulic Structures Using Optimized Ensemble Machine Learning.
  • Jul 29, 2025
  • Scientific reports
  • Mohamed Kamel Elshaarawy + 2 more

This study investigates the effectiveness of inclined double cutoff walls installed beneath hydraulic structures by employing five machine learning models: Random Forest (RF), Adaptive Boosting (AdaBoost), eXtreme Gradient Boosting (XGBoost), Light Gradient Boosting Machine (LightGBM), and Categorical Boosting (CatBoost). A comprehensive dataset of 630 samples was gathered from previous studies, including key input variables such as the relative distance between the cutoff wall and the structure's apron width (L/B), the inclination angle ratio between downstream and upstream cutoffs (θ2/θ1), the depth ratio of downstream to upstream cutoff walls (d2/d1), and the relative downstream cutoff depth to the permeable layer depth (d2/D). Outputs considered were the relative uplift force (U/Uo), the relative exit hydraulic gradient (iR/iRo), and the relative seepage discharge per unit structure length (q/qo). The dataset was split with a 70:30 ratio for training and testing. Hyperparameter optimization was conducted using Bayesian Optimization (BO) coupled with five-fold cross-validation to enhance model performance. Results showed that the CatBoost model outperformed the other models, consistently yielding high R² values surpassing 0.95, 0.93, and 0.97 for U/Uo, iR/iRo, and q/qo, respectively, along with low RMSE scores below 0.022, 0.089, and 0.019 for the same variables. A feature importance analysis was conducted using SHapley Additive exPlanations (SHAP) and Partial Dependence Plots (PDPs). The analysis revealed that L/B was the most influential predictor for U/Uo and iR/iRo, while d2/D played a crucial role in determining q/qo. Moreover, PDPs illustrated a positive linear relationship between L/B and U/Uo, a V-shaped impact of d2/d1 on iR/iRo and q/qo, and complex nonlinear interactions for θ2/θ1 across all target variables. Furthermore, an interactive Graphical User Interface (GUI) was developed, enabling engineers to efficiently predict output variables and apply model insights in practical scenarios.

  • Research Article
  • 10.1186/s12942-025-00404-y
Ecological epidemiology insights into clonorchiosis endemicity in Guangxi, China and Vietnam: a comprehensive machine learning analysis
  • Jul 28, 2025
  • International Journal of Health Geographics
  • Jin-Xin Zheng + 6 more

Background: Clonorchis sinensis, the liver fluke responsible for clonorchiosis, presents a persistent public health burden in Guangxi (Southern China) and Vietnam. Its transmission is influenced by a complex interplay of ecological, climatic, and socio-cultural factors. Methods: We compiled infection occurrence data from systematic literature reviews and national surveys conducted between 2000 and 2018. Environmental and climatic predictors were obtained from long-term raster datasets. Machine learning models, including logistic regression and tree-based ensemble methods, were used to assess associations between predictor variables and C. sinensis presence. Partial dependence plots were employed to refine predictor selection and explore marginal effects. Results: Raw freshwater fish consumption was identified as the most influential predictor. In Guangxi, 54.9% of counties reported raw fish consumption, compared to 31.7% in Vietnam. Logistic regression achieved the highest predictive accuracy (AUC = 0.941). Climatic comparisons showed that Vietnam had a higher annual mean temperature (Bio1: 23.37 °C vs. 20.86 °C), greater temperature seasonality (Bio4: 609.33 vs. 464.92), and higher annual precipitation (Bio12: 1731.64 mm vs. 1607.56 mm) than Guangxi, contributing to spatial differences in endemicity. High-risk zones were concentrated along the China–Vietnam border, suggesting the need for geographically targeted interventions. Conclusion: The findings underscore the combined influence of ecological and behavioral factors on C. sinensis transmission. The predictive modeling framework offers valuable insights for surveillance planning and cross-border disease control, reinforcing the role of ecological epidemiology in guiding parasitic disease prevention strategies.

  • Research Article
  • 10.1186/s12872-025-04927-x
Association between glucose-to-albumin ratio and ischemic stroke risk in patients with coronary heart disease: a machine learning-based predictive model analysis.
  • Jul 25, 2025
  • BMC cardiovascular disorders
  • Ling Hou + 2 more

Coronary heart disease (CHD) and ischemic stroke (IS) share several pathophysiological mechanisms and risk factors, such as hypertension, hyperlipidemia, and diabetes. Investigating novel markers, such as the glucose-to-albumin ratio (GAR), for predicting the risk of IS in CHD patients holds significant clinical value. We retrospectively enrolled 1,885 patients diagnosed with CHD who were treated at our hospital from January 1, 2022, to July 31, 2024. Feature selection was conducted using the Boruta algorithm, and a multilayer perceptron (MLP) model was employed to predict the risk of IS in CHD patients. The performance of the model was evaluated using ROC curves and calibration plots. SHAP values and partial dependence plots (PDP) were used to interpret the model's predictions. The study showed that patients in the IS group were older and had significantly higher rates of hypertension and diabetes compared to those without IS. Additionally, the IS group had a higher prevalence of triple-vessel disease and right coronary artery lesions. GAR was significantly elevated in the IS group compared to the non-IS group. Key features identified by the Boruta algorithm included GAR, hyperlipidemia, and a history of hypertension. SHAP analysis indicated that GAR was significantly associated with IS risk, and PDP analysis further confirmed GAR as an independent predictor of IS. GAR is a significant independent predictor of IS risk in CHD patients, with elevated GAR levels being strongly associated with an increased risk of IS.

  • Research Article
  • 10.3390/machines13080640
Explainable Data Mining Framework of Identifying Root Causes of Rocket Engine Anomalies Based on Knowledge and Physics-Informed Feature Selection
  • Jul 23, 2025
  • Machines
  • Xiaopu Zhang + 2 more

Liquid rocket engines occasionally experience abnormal phenomena with unclear mechanisms, causing difficulty in design improvements. To address the above issue, a data mining method that combines ante hoc explainability, post hoc explainability, and prediction accuracy is proposed. For ante hoc explainability, a feature selection method driven by data, models, and domain knowledge is established. Global sensitivity analysis of a physical model combined with expert knowledge and data correlation is utilized to establish the correlations between different types of parameters. Then a two-stage optimization approach is proposed to obtain the best feature subset and train the prediction model. For the post hoc explainability, the partial dependence plot (PDP) and SHapley Additive exPlanations (SHAP) analysis are used to discover complex patterns between input features and the dependent variable. The effectiveness of the hybrid feature selection method and its applicability under different noise combinations are validated using synthesized data from a high-fidelity simulation model of a pressurization system. Then the analysis of the causes of a large vibration phenomenon in an active engine shows that the prediction model has good accuracy, and the feature selection results have a clear mechanism and align with domain knowledge, providing both accuracy and interpretability. The proposed method shows significant potential for data mining in complex aerospace products.

  • Research Article
  • 10.3126/joeis.v4i1.81570
Ensemble Machine Learning and Model Interpretability for Leakage Prediction in Hydraulic Tunnels
  • Jul 21, 2025
  • Journal of Engineering Issues and Solutions
  • Biplove Ghimire

Drill-and-blast tunnel construction in the Himalayan region often encounters complex geological and hydrogeological conditions, leading to significant water leakage that impacts project cost and stability. This study aims to enhance leakage prediction accuracy using ensemble machine learning techniques. Initial leakage estimates were made using Panthi’s semi-empirical approach for the Nilgiri-II Hydropower Project. A dataset comprising rock mass quality, topography, and permeability features was used to train four ensemble models: Bagging, Boosting (XGBoost), Voting, and Stacking. Among these, Bagging outperformed the others with an R² of 0.99, followed by Voting and Stacking (both R² = 0.97). Partial Dependence Plots (PDP) and Individual Conditional Expectation (ICE) plots were used to interpret model predictions and identify key influencing features such as hydrostatic head (Hstatic), distance to the valley side (D), and joint parameters. These results demonstrate that ensemble learning, particularly bagging, is highly effective in modeling water leakage in challenging Himalayan tunnel environments.

  • Research Article
  • 10.1038/s41598-025-11601-x
Predicting the mechanical performance of industrial waste incorporated sustainable concrete using hybrid machine learning modeling and parametric analyses
  • Jul 20, 2025
  • Scientific Reports
  • Md Alhaz Uddin + 8 more

The construction sector is proactively working to minimize the environmental impact of cement manufacturing by adopting alternative cementitious substances and cutting carbon emissions tied to concrete. This study investigates the viability of using waste industrial materials as a replacement for cement in concrete mixes. The primary goal is to predict the compressive strength of waste-incorporated concrete by evaluating the effects of materials such as cement, fly ash (FA), silica fume (SF), ground granulated blast furnace slag (GGBFS), metakaolin (MK), water usage, aggregate levels, and superplasticizer dosages. A total of 441 data entries were sourced from various publications. Multiple machine learning techniques, such as light gradient boosting (LGB), extreme gradient boosting (XGB), and decision trees (DT), along with hybrid approaches like XGB-LGB and XGB-DT, were utilized to study how these variables influence compressive strength. The dataset was partitioned into training and testing sets, and statistical tools were employed to assess the correlation between input variables and strength. Model accuracy was gauged using metrics such as mean absolute percentage error (MAPE), root mean square error (RMSE), and the coefficient of determination (R²). Among the models, the XGB and DT approaches delivered the highest precision, with an R² of 0.928 in the training stage. Among hybrid models, XGB-DT exhibited a balanced performance, with R² values of 0.907 and 0.785 for the training and testing phases, respectively. Additionally, SHAP (SHapley Additive exPlanations) and partial dependence plots (PDP) were employed to pinpoint the optimal ranges for each variable’s contribution to the improvement of compressive strength. SHAP and PDP analyses identified coarse aggregate, superplasticizer, water, and cement content as having a high influence on the model’s output, and 150–200 kg/m³ of GGBFS as a key factor for optimizing compressive strength. The study concludes that the hybrid models, along with the single models, can effectively forecast the compressive strength of concrete incorporating industrial byproducts, assisting the construction industry in efficiently evaluating material properties and understanding the influence of various input factors.

  • Research Article
  • 10.1007/s10462-025-11165-2
Artificial intelligence in environmental and Earth system sciences: explainability and trustworthiness
  • Jul 19, 2025
  • Artificial Intelligence Review
  • Josepha Schiller + 2 more

Explainable artificial intelligence (XAI) methods have recently emerged to gain insights into complex machine learning models. XAI can be promising for environmental and Earth system science because high-stakes decision-making for management and planning requires justification based on evidence and systems understanding. However, an overview of XAI applications and trust in AI in environmental and Earth system science is still missing. To close this gap, we reviewed 575 articles. XAI applications are popular in various domains, including ecology, engineering, geology, remote sensing, water resources, meteorology, atmospheric sciences, geochemistry, and geophysics. XAI applications focused primarily on understanding and predicting anthropogenic changes in geospatial patterns and impacts on human society and natural resources, especially biological species distributions, vegetation, air quality, transportation, and climate-water related topics, including risk and management. Among XAI methods, the SHAP and Shapley methods were the most popular (135 articles), followed by feature importance (27), partial dependence plots (22), LIME (21), and saliency maps (15). Although XAI methods are often argued to increase trust in model predictions, only seven studies (1.2%) addressed trustworthiness as a core research objective. This gap is critical because understanding the relationship between explainability and trust is lacking. While XAI applications continue to grow, they do not necessarily enhance trust. Hence, more studies on how to strengthen trust in AI applications are critically needed. Finally, this review underlines the recommendation of developing a “human-centered” XAI framework that incorporates the distinct views and needs of multiple stakeholder groups to enable trustworthy decision-making.

  • Research Article
  • 10.18240/ijo.2025.07.04
Associations between organophosphorus pesticides exposure and age-related macular degeneration risk in U.S. adults: analysis from interpretable machine learning approaches.
  • Jul 18, 2025
  • International journal of ophthalmology
  • Yu-Xin Jiang + 2 more

This study investigated the associations between urinary dialkyl phosphate (DAP) metabolites of organophosphorus pesticides (OPPs) exposure and age-related macular degeneration (AMD) risk. Participants were drawn from the National Health and Nutrition Examination Survey (NHANES) between 2005 and 2008. Urinary DAP metabolites were used to construct a machine learning (ML) model for AMD prediction. Several interpretability pipelines, including permutation feature importance (PFI), partial dependence plot (PDP), and SHapley Additive exPlanations (SHAP) analyses, were employed to analyze the influence of exposure features on prediction outcomes. A total of 1845 participants were included, and 137 were diagnosed with AMD. Receiver operating characteristic (ROC) curve analysis identified Random Forests (RF) as the best-performing of eleven ML models. PFI and SHAP analyses illustrated that DAP metabolites carried substantial contribution weights in AMD risk prediction, higher than most of the socio-demographic covariates. Shapley values and waterfall plots of randomly selected AMD individuals emphasized the predictive capacity of ML, with high accuracy and sensitivity in each case. The relationships and interactions visualized by graphical plots and supported by statistical measures demonstrated the indispensable impact of six DAP metabolites on the prediction of AMD risk. Urinary DAP metabolites of OPPs exposure are associated with AMD risk, and ML algorithms show excellent generalizability and differentiability in AMD risk prediction.

  • Research Article
  • 10.1038/s41598-025-11239-9
Predicting the compressive strength of concrete incorporating waste powders exposed to elevated temperatures utilizing machine learning
  • Jul 12, 2025
  • Scientific Reports
  • Islam N Fathy + 4 more

The addition of powders from waste construction materials as partial cement substitute in concrete represents a significant step toward green concrete construction. High temperatures have a substantial influence on concrete strength, resulting in a reduction in mechanical properties. The prediction of the impacts of waste powders on concrete strength is an important topic in sustainable construction. Such models are needed to understand the complex interactions between waste materials’ powders and concrete strength. In this study, three machine learning approaches, extreme gradient boosting (XGBoost), random forest (RF), and M5P, were used for constructing the prediction model for the impact of elevated temperatures on the compressive strength of concrete modified by marble and granite construction waste powders as partial cement replacements in concrete. Dataset of 324 tested cubic specimens with four input variables, waste granite powder dose (GWP), waste marble powder (MWP), temperature (T), and duration (D) were chosen for developing the prediction models. The output was the concrete compressive strength (CS). MWP and GWP ranged between 0 and 9%, temperatures were ranged between 25 °C and 800 °C, and duration up to 2 h. Hyperparameters in the RF and XGB models were optimized using grid search. K-fold cross-validation and several statistical measures, including R2MAPE, RMSE, and MAE, were utilized to validate and check the accuracy of the proposed models. The developed models were evaluated against experimental data and previously established models. The XGB model demonstrated the highest R2 of 0.9989, alongside the lowest prediction errors: MAE of 0.1351 MPa, RMSE of 0.1842 MPa, and MAPE of 0.48%. The results showed that the XGB prediction model for the concrete compressive strength outperformed the other proposed models. 
The SHAP analysis, Individual Conditional Expectation (ICE), and Partial Dependence Plots (PDP) revealed that GWP and MWP positively influence the compressive strength, while the temperature exerts the most negative influence on predicting the compressive strength. Finally, a graphical user interface (GUI) for the compressive strength of concrete containing GWP and MWP subjected to elevated temperatures has been created, which may be of considerable assistance, guidance, and efficiency in research and construction industry contexts.
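
As a rough illustration of the ICE and PDP techniques named in this abstract, the sketch below computes both by hand in plain Python. It is not the authors' code: the `toy_predict` model, its coefficients, and the three sample rows are hypothetical stand-ins chosen only so the example runs. An ICE curve re-predicts one row while a single feature sweeps over a grid; the partial dependence curve is the pointwise average of those ICE curves.

```python
from statistics import mean


def ice_curves(predict, rows, feature, grid):
    """Return one ICE curve per row: predictions as `feature` sweeps over `grid`."""
    curves = []
    for row in rows:
        curve = []
        for value in grid:
            modified = dict(row)       # copy so the original row is untouched
            modified[feature] = value  # override only the feature of interest
            curve.append(predict(modified))
        curves.append(curve)
    return curves


def partial_dependence(predict, rows, feature, grid):
    """Average the ICE curves pointwise to obtain the partial dependence curve."""
    curves = ice_curves(predict, rows, feature, grid)
    return [mean(column) for column in zip(*curves)]


# Hypothetical stand-in model (NOT from the paper): predicted strength falls
# with temperature T and rises slightly with granite powder dose GWP.
def toy_predict(row):
    return 40.0 - 0.03 * row["T"] + 0.5 * row["GWP"]


# Hypothetical sample rows, chosen only to exercise the functions.
toy_rows = [{"T": 25, "GWP": 0}, {"T": 400, "GWP": 3}, {"T": 800, "GWP": 9}]
temperature_grid = [25, 200, 400, 600, 800]

pdp_temperature = partial_dependence(toy_predict, toy_rows, "T", temperature_grid)
```

With a real fitted model, `predict` would wrap the model's prediction call, and plotting `pdp_temperature` against `temperature_grid` gives the PDP; a downward-sloping curve is the kind of negative temperature effect the abstract describes.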

  • Research Article
  • 10.1080/14703297.2025.2532050
Interpretable machine learning for academic performance prediction: A SHAP-based analysis of key influencing factors
  • Jul 12, 2025
  • Innovations in Education and Teaching International
  • Yiming Guan + 2 more

This study employs machine learning approaches to predict the final exam scores of vocational undergraduate students and analyse critical factors influencing their academic performance. Using a multidimensional feature dataset, Ridge Regression was set as a baseline model, while four mainstream machine learning models – Random Forest, XGBoost, Support Vector Machine and Neural Network – were utilised for predictive modelling, with Random Forest achieving the best performance. SHapley Additive exPlanations (SHAP) was applied to interpret global and local feature contributions, indicating monthly exam scores, admission scores and self-study time as the most influential predictors, whereas demographic features were comparatively less significant. Furthermore, Partial Dependence Plots (PDP) and Kernel Density Estimation (KDE) analyses were conducted to explore feature interactions and differences between high- and low-achieving students, offering practical insights for vocational institutions to implement precise interventions focusing on key predictive factors.

  • Research Article
  • 10.3390/biomedicines13071706
An Interpretable Machine Learning Model Based on Inflammatory-Nutritional Biomarkers for Predicting Metachronous Liver Metastases After Colorectal Cancer Surgery.
  • Jul 12, 2025
  • Biomedicines
  • Hao Zhu + 3 more

Objective: Tumor progression is regulated by systemic immune status, nutritional metabolism, and the inflammatory microenvironment. This study aims to investigate inflammatory-nutritional biomarkers associated with metachronous liver metastasis (MLM) in colorectal cancer (CRC) and develop a machine learning model for accurate prediction. Methods: This study enrolled 680 patients with CRC who underwent curative resection, randomly allocated into a training set (n = 477) and a validation set (n = 203) in a 7:3 ratio. Feature selection was performed using Boruta and Lasso algorithms, identifying nine core prognostic factors through variable intersection. Seven machine learning (ML) models were constructed using the training set, with the optimal predictive model selected based on comprehensive evaluation metrics. An interactive visualization tool was developed to interpret the dynamic impact of key features on individual predictions. The partial dependence plots (PDPs) revealed a potential dose-response relationship between inflammatory-nutritional markers and MLM risk. Results: Among 680 patients with CRC, the cumulative incidence of MLM at 6 months postoperatively was 39.1%. Multimodal feature selection identified nine key predictors, including the N stage, vascular invasion, carcinoembryonic antigen (CEA), systemic immune-inflammation index (SII), albumin-bilirubin index (ALBI), differentiation grade, prognostic nutritional index (PNI), fatty liver, and T stage. The gradient boosting machine (GBM) demonstrated the best overall performance (AUROC: 0.916, sensitivity: 0.772, specificity: 0.871). The generalized additive model (GAM)-fitted SHAP analysis established, for the first time, risk thresholds for four continuous variables (CEA > 8.14 μg/L, PNI < 44.46, SII > 856.36, ALBI > -2.67), confirming their significant association with MLM development. 
Conclusions: This study developed a GBM model incorporating inflammatory-nutritional biomarkers and clinical features to accurately predict MLM in colorectal cancer. Integrated with dynamic visualization tools, the model enables real-time risk stratification via a freely accessible web calculator, guiding individualized surveillance planning and optimizing clinical decision-making for precision postoperative care.
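
The sensitivity and specificity figures quoted in this abstract follow from confusion-matrix counts. The sketch below shows that arithmetic in plain Python; the label vectors are hypothetical and are not the study's data, and labels are assumed to be coded 1 for the positive class (here, metastasis) and 0 otherwise.

```python
def sensitivity_specificity(y_true, y_pred):
    """Compute (sensitivity, specificity) for binary labels coded 0/1.

    Sensitivity = TP / (TP + FN): share of true positives that were detected.
    Specificity = TN / (TN + FP): share of true negatives that were cleared.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative label")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical labels (NOT study data): 4 positives, 6 negatives.
example_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
example_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]

example_sensitivity, example_specificity = sensitivity_specificity(
    example_true, example_pred
)
```

Both values depend on the probability cutoff used to turn model scores into 0/1 predictions, which is why they are usually reported alongside a threshold-free measure such as AUROC.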

  • Research Article
  • 10.3389/fpubh.2025.1602566
Developing an interpretable machine learning predictive model of chronic obstructive pulmonary disease by serum PFAS concentration
  • Jul 10, 2025
  • Frontiers in Public Health
  • Xiaomei Shao + 4 more

Background: Chronic obstructive pulmonary disease (COPD) is a leading cause of morbidity and mortality worldwide, with limited early detection strategies. While previous studies have examined the relationship between per- and polyfluoroalkyl substances (PFAS) and COPD, limited research has applied interpretable machine learning (ML) techniques to this association. Methods: We investigated the association between PFAS exposure and COPD risk in 4,450 National Health and Nutrition Examination Survey (NHANES) participants from 2013 to 2018. After excluding missing covariates and extreme PFAS values and applying K-nearest neighbors (KNN) imputation, nine ML models, including CatBoost, were built and evaluated using metrics like accuracy, area under the curve (AUC), sensitivity, and specificity. The best-performing model was further analyzed using partial dependence plots (PDP) and SHapley additive exPlanations (SHAP) analysis. To enhance clinical applicability, the final model was deployed as a publicly accessible web-based risk calculator. Results: CatBoost emerged as the best model, achieving an accuracy of 84%, AUC of 0.89, sensitivity of 81%, and specificity of 84%. PDP revealed that higher perfluorooctane sulfonic acid (PFOS) and perfluoroundecanoic acid (PFUA) levels were associated with reduced COPD risk, whereas perfluorooctanoic acid (PFOA) and 2-(N-Methyl-perfluorooctane sulfonamido) acetic acid (MPAH) showed positive associations with COPD. Perfluorononanoic acid (PFNA), perfluorodecanoic acid (PFDE), and perfluorohexane sulfonic acid (PFHxS) demonstrated mixed or non-linear effects. SHAP analysis provided insights into individual predictions and overall variable contributions, clarifying the complex PFAS-COPD relationship. 
The deployed web-based calculator enables interactive prediction and risk interpretation, supporting potential public health applications. Conclusion: CatBoost identified PFOS and PFUA as protective factors against COPD, while PFOA and MPAH increased risk of COPD. These findings emphasize the need for stricter PFAS regulation and highlight the potential of machine learning in guiding prevention strategies.

