Research on Control Factors and Parameter Optimization of Surfactant Flooding in Low-Permeability Reservoirs Using Random Forest Algorithm

TL;DR

This study evaluates surfactant flooding in low-permeability reservoirs, comparing models including Random Forest, which achieved high predictive accuracy (R2=0.9428). Using SHAP and sensitivity analyses, it identified key factors influencing recovery, with optimized parameters predicting a 45.61% recovery factor, 6.57% higher than experimental results, demonstrating machine learning's effectiveness in parameter optimization for enhanced oil recovery.

Abstract

As oil and gas development increasingly targets low and ultra-low permeability reservoirs, conventional recovery techniques often prove insufficient for mobilizing residual oil. Surfactant flooding, a key chemical enhanced oil recovery (EOR) technology, thus requires careful system optimization and mechanistic investigation. This study focuses on low-permeability reservoirs in the Changqing Oilfield and evaluates three surfactant systems: YHS-Z1 (a 7:3 mass ratio blend of hydroxypropyl sulfobetaine and cocamide), YHS-Z2 (a polyether carboxylate, a nonionic-anionic composite), and a middle-phase microemulsion system (heavy alkylbenzene sulfonate and hydroxysulfobetaine combined at a mass ratio of 7:3). Their oil displacement properties were characterized through a series of experiments including interfacial tension measurement, contact angle analysis, static and dynamic oil displacement tests, and emulsion transport/retention index assessments. Based on the experimental data, this study constructed four classical regression models, namely Ridge Regression, Random Forest (RF), Gradient Boosting Regression (GBR), and Support Vector Regression (SVR), and compared their predictive performance. The results demonstrate that the RF model achieved the best prediction performance, with a Mean Absolute Error (MAE) of 1.8245, a Mean Absolute Percentage Error (MAPE) of 4.78%, and a coefficient of determination (R²) of 0.9428 on the training set. Further analysis using the SHapley Additive exPlanations (SHAP) algorithm revealed that the retention index is the primary global factor (accounting for 49.79% of the variance), while the primary factors differ significantly across the surfactant systems. Single-factor and multi-factor sensitivity analyses were also conducted to elucidate synergistic effects and threshold behaviors among parameters. The optimal parameter combination, identified via a random search method, achieved a predicted recovery factor of 45.61%, representing a 6.57% improvement over the highest experimental value. This study demonstrates that machine learning methods can effectively identify the dominant factors in oil displacement and enable synergistic parameter optimization, thereby providing a theoretical foundation for the efficient development of surfactant flooding in low-permeability reservoirs.
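
The three evaluation metrics named in the abstract (MAE, MAPE, R²) and the random search step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function `toy_predict`, the parameter names `concentration` and `retention_index`, their bounds, and all numeric values are hypothetical stand-ins for a fitted model and real experimental ranges.

```python
import random


def mae(y_true, y_pred):
    """Mean Absolute Error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean Absolute Percentage Error, in percent (assumes no zero targets)."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def random_search(predict, bounds, n_iter=2000, seed=0):
    """Sample parameter sets uniformly within bounds; keep the best prediction."""
    rng = random.Random(seed)
    best_x, best_y = None, float("-inf")
    for _ in range(n_iter):
        x = {name: rng.uniform(lo, hi) for name, (lo, hi) in bounds.items()}
        y = predict(x)
        if y > best_y:
            best_x, best_y = x, y
    return best_x, best_y


def toy_predict(x):
    """Hypothetical surrogate for a trained regressor (NOT the paper's model)."""
    return (40.0
            - 50.0 * (x["concentration"] - 0.3) ** 2
            - 0.02 * (x["retention_index"] - 60.0) ** 2)


# Hypothetical search ranges, chosen only for illustration.
BOUNDS = {"concentration": (0.05, 0.6), "retention_index": (20.0, 100.0)}

if __name__ == "__main__":
    measured = [10.0, 20.0, 30.0]   # made-up recovery factors
    predicted = [12.0, 18.0, 33.0]  # made-up model outputs
    print("MAE :", mae(measured, predicted))
    print("MAPE:", mape(measured, predicted))
    print("R2  :", r2(measured, predicted))
    best_params, best_value = random_search(toy_predict, BOUNDS)
    print("best parameters:", best_params)
    print("best predicted value:", best_value)
```

In practice `toy_predict` would be replaced by the prediction call of whichever regressor was trained on the experimental data. Note that random search only explores the sampled region, so an optimum predicted this way is a model output that still needs experimental confirmation.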

Similar Papers
  • Research Article
  • 10.1038/s41598-025-14372-7
Enhancing software effort estimation with random forest tuning and adaptive decision strategies.
  • Sep 30, 2025
  • Scientific reports
  • Priya Varshini A G + 2 more

Software effort estimation (SEE) is a vital task for project management as it is essential for resource allocation and project planning. Numerous algorithms have been investigated for forecasting software effort, yet achieving precise predictions remains a significant hurdle in the software industry. To achieve optimal accuracy, machine learning algorithms are employed. Notably, the Random Forest (RF) algorithm produced better accuracy than various other algorithms. In this paper, the prediction is extended by increasing the number of trees, and an Improved Random Forest (IRF) is implemented by including three decision techniques (residual analysis, partial dependence plots, and feature engineering) to improve prediction accuracy. To make the improved random forest adaptive, it is further extended in this paper by integrating three techniques: Bayesian Optimization with Deep Kernel Learning (BO-DKL) to adaptively set hyperparameters, Time-Series Residual Analysis to detect autocorrelation patterns among model errors, and the Explainable AI techniques Shapley Additive Explanations (SHAP) and Local Interpretable Model-Agnostic Explanations (LIME) to improve feature interpretability. Together, these components of the Improved Adaptive Random Forest (IARF) contribute to a comprehensive evaluation and improvement of prediction accuracy. Metrics used for evaluation are Mean Absolute Error (MAE), Root Mean Square Error (RMSE), R-Squared, Mean Absolute Percentage Error (MAPE), Mean Absolute Scaled Error (MASE) and Prediction Interval Coverage Probability (PICP). Overall, the improved adaptive RF model had an average improvement ratio of 18.5% on MAE, 20.3% on RMSE, 3.8% on R2, 5.4% on MAPE, 7% reduction in MASE and a 3-5% improvement in PICP across all data sets compared to the Random Forest model, with much improved prediction accuracy. These findings validate that the combination of adaptive learning methods and explainability-based adjustments considerably improves the accuracy of software effort estimation models and facilitates more trustworthy decision-making in software development projects.

  • Research Article
  • Cited by 3
  • 10.1016/j.jenvman.2025.126521
Prediction of hydrogen and methane yields from gasification of leather waste using machine learning and explainable AI: An original dataset.
  • Sep 1, 2025
  • Journal of environmental management
  • Pınar Cihan + 4 more

  • Conference Article
  • Cited by 79
  • 10.2118/114168-ms
The Characteristic Flow Behavior of Low-Permeability Reservoir Systems
  • Feb 10, 2008
  • T A Blasingame

This paper considers the mechanisms and characteristic flow patterns of low permeability reservoir systems. In this paper we focus on the issue of low permeability in conjunction with reservoir heterogeneity (as these often go hand in hand). Generally speaking, we focus on the single-phase gas flow case as this is most relevant, and we avoid concerns related to multiphase flow. Low permeability reservoir systems exhibit unique flow behavior for the following reasons: (1) low permeability (which yields poor utilization of reservoir pressure), caused in part by depositional issues (very small grains, mixed with detrital muds (clays)) and diagenetic issues (clay precipitation, massive cementation, pressure compaction, etc.); and (2) reservoir heterogeneity, dictated by deposition and post-deposition (diagenetic) events, including vertical heterogeneity (layering, laminae, etc.), lateral heterogeneity (medium to large scale geologic features, e.g., turbidite deposition, faults, etc.), and differential diagenesis, including hydrocarbon generation and migration. These characteristics lead us to the relatively simple observation that low permeability reservoirs are simply poor conductors of fluids. As a matter of background, this work discusses the issues relevant to the origin of low (and ultra-low) permeability reservoirs, but our primary focus is flow at macro- and mega-scales (as would be observed at a well). An obvious comment at this point is that the reservoir permeability and the reservoir heterogeneity are fixed constants that we cannot change. While true, we can change our mechanism for accessing the reservoir (i.e., the well) and we can change our development strategy to ensure optimal performance and recovery of a particular reservoir. As for changing our access to the reservoir, we can utilize hydraulic fracture stimulation techniques to create a conductive pathway into the reservoir from the well. This is and will be implicit in the continued development of low and ultra-low permeability reservoirs, regardless of the well type (vertical or horizontal). In this work, our emphasis is to consider the relatively simple case of a single vertical well with a hydraulic fracture and the resulting flow behavior that this type of well will experience. It is our contention that the elliptical flow regime dominates reservoir performance in low/ultra-low permeability reservoirs, and we apply both analytical and numerical solutions to a typical field case to illustrate the validity of the elliptical flow regime.

  • Research Article
  • 10.2196/75020
Machine Learning and Shapley Additive Explanations Value Integration for Predicting the Prognostic of Anti-N-Methyl-D-Aspartate Receptor Encephalitis: Model Development and Evaluation Study
  • Sep 22, 2025
  • JMIR Medical Informatics
  • Jia Wang + 4 more

Background: Anti-N-methyl-D-aspartate receptor (NMDAR) encephalitis is a rare disease with no accurate prognostic tools to predict the prognosis of patients. Objective: This study aims to develop an interpretable machine learning model using real-world clinical data to guide personalized therapeutic strategies. Methods: This retrospective cohort study analyzed 140 patients with NMDAR encephalitis treated at the Third Affiliated Hospital of Sun Yat-sen University (2015‐2024). Feature selection was done using recursive feature elimination. The model was constructed by 3 machine learning algorithms: decision tree, random forest (RF), and extreme gradient boosting. Mean squared error, root-mean-squared error, R² (coefficient of determination), mean absolute error, and mean absolute percentage error were used to evaluate the model performance. Finally, the optimal model was interpreted via Shapley Additive Explanations (SHAP) and deployed as a web application using the Flask framework. Results: The median age of patients with anti-NMDAR encephalitis was 23 (IQR 18-31.8) years. The median Clinical Evaluation Scale for Autoimmune Encephalitis score at acute onset was 11 (IQR 6-16). After preprocessing, 20 features, including 4 demographic characteristics, 3 clinical characteristics, 11 laboratory parameters, and 2 neuroimaging characteristics, were selected. The RF demonstrated superior accuracy in predicting the prognosis (mean squared error=11.01; root-mean-squared error=3.32; R²=0.71; mean absolute error=2.49; mean absolute percentage error=0.48). SHAP analysis identified admission to the intensive care unit (mean |SHAP value|=1.65), initial symptoms-memory deficits (0.69), and uric acid (0.53) as the most important prognostic predictors. Conclusions: We developed and validated an interpretable RF-based prognostic model for NMDAR encephalitis. The web-deployed tool enables real-time risk stratification, facilitating clinical decision-making and personalized therapeutic interventions for clinicians.

  • Research Article
  • Cited by 18
  • 10.1016/j.psj.2024.104458
Predicting egg production rate and egg weight of broiler breeders based on machine learning and Shapley additive explanations
  • Oct 29, 2024
  • Poultry Science
  • Hengyi Ji + 2 more

  • Research Article
  • Cited by 116
  • 10.1038/s41598-024-55243-x
A comparative study of 11 non-linear regression models highlighting autoencoder, DBN, and SVR, enhanced by SHAP importance analysis in soybean branching prediction.
  • Mar 11, 2024
  • Scientific Reports
  • Wei Zhou + 2 more

To explore a robust tool for advancing digital breeding practices through an artificial intelligence-driven phenotype prediction expert system, we undertook a thorough analysis of 11 non-linear regression models. Our investigation specifically emphasized the significance of Support Vector Regression (SVR) and SHapley Additive exPlanations (SHAP) in predicting soybean branching. By using branching data (phenotype) of 1918 soybean accessions and 42k SNP (Single Nucleotide Polymorphism) polymorphic data (genotype), this study systematically compared 11 non-linear regression AI models, including four deep learning models (DBN (deep belief network) regression, ANN (artificial neural network) regression, Autoencoders regression, and MLP (multilayer perceptron) regression) and seven machine learning models (SVR (support vector regression), XGBoost (eXtreme Gradient Boosting) regression, Random Forest regression, LightGBM regression, GPs (Gaussian processes) regression, Decision Tree regression, and Polynomial regression). After evaluation with four metrics, namely R2 (R-squared), MAE (Mean Absolute Error), MSE (Mean Squared Error), and MAPE (Mean Absolute Percentage Error), it was found that SVR, Polynomial Regression, DBN, and Autoencoder outperformed the other models and achieved better accuracy when used for phenotype prediction. In the assessment of deep learning approaches, we exemplified the SVR model, conducting analyses on feature importance and gene ontology (GO) enrichment to provide comprehensive support. After comprehensively comparing four feature importance algorithms, namely Variable Ranking, Permutation, SHAP, and Correlation Matrix, no notable distinction was observed in their feature importance ranking scores, but the SHAP value could provide rich information on genes with negative contributions, and SHAP importance was chosen for feature selection. The results of this study offer valuable insights into AI-mediated plant breeding, addressing challenges faced by traditional breeding programs. The method developed has broad applicability in phenotype prediction, minor QTL (quantitative trait loci) mining, and plant smart-breeding systems, contributing significantly to the advancement of AI-based breeding practices and transitioning from experience-based to data-based breeding.

  • Research Article
  • Cited by 2
  • 10.46481/jnsps.2024.2079
Wind speed prediction in some major cities in Africa using Linear Regression and Random Forest algorithms
  • Sep 8, 2024
  • Journal of the Nigerian Society of Physical Sciences
  • Timothy Kayode Samson + 1 more

Wind energy, if properly harnessed, could serve as a source of energy generation in Africa. This study compared the performance of two Machine Learning (ML) algorithms (Linear Regression and Random Forest) in predicting wind speed in five major cities in Africa (Yaoundé, Pretoria, Nairobi, Cairo and Abuja). Wind data were collected between January 1, 2000, and December 31, 2022, using the Solar Radiation Data Archive. After preprocessing, 80% of the data were used for training and 20% for validation. The performance of these ML algorithms was evaluated using Mean Square Error (MSE), Root Mean Square Error (RMSE), Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE) and coefficient of determination (R2). The results show that Nairobi (3.814795 m/s), closely followed by Cairo (3.606453 m/s), has the highest mean wind speed, while Yaoundé (1.090512 m/s) has the lowest. Based on the performance metrics used, the two Machine Learning algorithms were competitive. Still, the Linear Regression (LR) algorithm outperformed the Random Forest algorithm in predicting wind speed in all the selected major African cities. In Yaoundé (RMSE = 0.3892, MAE = 0.3001, MAPE = 0.5030), Pretoria (RMSE = 1.2339, MAE = 0.9480, MAPE = 0.7450), Nairobi (RMSE = 0.4223, MAE = 0.6499, MAPE = 0.1872), Nairobi (RMSE = 0.6499, MAE = 0.5171, MAPE = 0.1872), Cairo (RMSE = 1.0909, MAE = 0.8544, MAPE = 0.3541) and Abuja (RMSE = 0.70245, MAE = 0.5441, MAPE = 0.4515), the Linear Regression algorithm was found to outperform Random Forest regression. Therefore, the Linear Regression algorithm is more reliable in predicting wind speed compared with Random Forest regression.

  • Research Article
  • Cited by 20
  • 10.1155/2022/8089428
Predicting and Investigating the Permeability Coefficient of Soil with Aided Single Machine Learning Algorithm
  • Jan 1, 2022
  • Complexity
  • Van Quan Tran

The permeability coefficient of soils is an essential measure for designing geotechnical construction. The aim of this paper was to select the highest-performing and most reliable machine learning (ML) model to predict the permeability coefficient of soil and to quantify the feature importance on the predicted value of the soil permeability coefficient with the aid of machine learning‐based SHapley Additive exPlanations (SHAP) and Partial Dependence Plot 1D (PDP 1D). To achieve this purpose, five single ML algorithms including K‐nearest neighbors (KNN), support vector machine (SVM), light gradient boosting machine (LightGBM), random forest (RF), and gradient boosting (GB) are used to build ML models for predicting the permeability coefficient of soils. Performance criteria for ML models include the coefficient of correlation R², root mean square error (RMSE), mean absolute percentage error (MAPE), and mean absolute error (MAE). The best-performing and most reliable single ML model for predicting the permeability coefficient of soil for the testing dataset is the gradient boosting (GB) model, which has R² = 0.971, RMSE = 0.199 × 10⁻¹¹ m/s, MAE = 0.161 × 10⁻¹¹ m/s, and MAPE = 0.185%. To identify and quantify the feature importance on the permeability coefficient of soil, sensitivity studies using permutation importance, SHapley Additive exPlanations (SHAP), and Partial Dependence Plot 1D (PDP 1D) are performed with the aid of the best-performing GB model. Plasticity index, density > water content, liquid limit, and plastic limit > clay content > void ratio is the order of effects on the predicted value of the permeability coefficient. The plasticity index and density of soil are the first priority soil properties to measure when assessing the permeability coefficient of soil.

  • Research Article
  • 10.1016/j.chemosphere.2026.144926
Predicting blood levels of mercury and selenium in Amazonian riverines: A machine learning approach based on questionnaire data.
  • Jun 1, 2026
  • Chemosphere
  • Jonas Carneiro Cruz + 6 more

  • Conference Article
  • Cited by 8
  • 10.2118/136904-ms
Performance Analysis and Field Application Result of Polymer Flooding in Low-Permeability Reservoirs in Daqing Oilfield
  • Oct 19, 2010
  • W Fenglan + 5 more

Polymer flooding technology in high and medium permeability reservoirs has been applied commercially in Daqing Oilfield since 1996. It has become an important supporting technology for both the stable output of Daqing Oilfield and the development improvement of the mature oilfields. In order to study EOR methods in low permeability (less than 100 mD) reservoirs, pilot tests of polymer flooding were performed in Daqing Oilfield. According to the research results, the low-molecular weight polymer (400800 Dalton) can be continuously injected into low permeability reservoirs under the specified well spacing. Pilot tests of polymer flooding show that oil production was increased from 1.06 tons/d to 3.04 tons/d and water cut was decreased from 96.0% to 89.8% in low permeability reservoirs. Formations with extra-low permeability (less than 10 mD) are not flooded effectively in the process of polymer flooding. Production performance of polymer flooding in low permeability reservoirs can be improved by fracturing or by separate zone injection with different molecular weight polymers. Comparison and analysis of injection profiles and productivity profiles at different polymer injection parameters showed that a polymer solution with low molecular weight and relatively high concentration was suitable for polymer flooding in low permeability reservoirs. Numerical simulation and pilot results both showed that polymer flooding recovered more than 5% OOIP over water flooding in low permeability reservoirs in Daqing Oilfield.

  • Research Article
  • 10.3390/app16010311
Prediction of Mean Fragmentation Size in Open-Pit Mine Blasting Operations Using Histogram-Based Gradient Boosting and Grey Wolf Optimization Approach
  • Dec 28, 2025
  • Applied Sciences
  • Madalitso Mame + 4 more

Blast-induced rock fragmentation plays a critical role in mining and civil engineering. One of the primary objectives of blasting operations is to achieve the desired rock fragmentation size, which is a key indicator of the quality of the blasting process. Predicting the mean fragmentation size (MFS) is crucial to avoid increased production costs, material loss, and ore dilution. This study integrates three tree-based regression techniques—gradient boosting regression (GBR), histogram-based gradient boosting machine (HGB), and extra trees (ET)—with two optimization algorithms, namely, grey wolf optimization (GWO) and particle swarm optimization (PSO), to predict the MFS. The performance of the resulting models was evaluated using four statistical measures: coefficient of determination (R2), root mean squared error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE). The results indicate that the GWO-HGB model outperformed all other models, achieving R2, RMSE, MAE, and MAPE values of 0.9402, 0.0251, 0.0185, and 0.0560, respectively, in the testing phase. Additionally, the Shapley additive explanations (SHAP), local interpretable model-agnostic explanations (LIME), and neural network-based sensitivity analyses were applied to examine how input parameters influence model predictions. The analysis revealed that unconfined compressive strength (UCS) emerged as the most influential parameter affecting MFS prediction in the developed model. This study provides a novel hybrid intelligent model to predict MFS for optimized blasting operations in open-pit mines.

  • Research Article
  • 10.1038/s41598-025-20345-7
COVID-19 mortality and nutrition through predictive modeling and optimization based on grid search
  • Oct 6, 2025
  • Scientific Reports
  • Ahmed M Elshewey + 3 more

Since 2019, humanity has been suffering from the negative impact of COVID-19, and the virus did not remain in its original state but kept mutating to become more harmful until it reached its current form, the omicron variant. Therefore, in an attempt to reduce the risk of the virus, which has caused nearly 6 million deaths to this day, it is important to focus on one of the most important factors in disease resistance, which is nutrition. It has been reported recently that death rates depend strongly on dietary intake of fat, protein, or even healthy vegetables. This study aims to investigate the relationship between what people eat and the COVID-19 death rate. The study applies five machine learning (ML) models: gradient boosting regressor (GBR), random forest (RF), lasso regression, decision tree (DT), and Bayesian ridge (BR). The study utilizes an available COVID-19 nutrition dataset which consists of 4 attributes: fat percentage, caloric consumption (kcal), food supply amount (kg), and protein levels of various dietary categories. The experiment shows that the GBR model without optimization obtained the best results in comparison with the other models. The GBR model achieved a mean squared error (MSE) of 0.1512, a mean absolute error (MAE) of 0.2262, a mean absolute percentage error (MAPE) of 0.1351, and an R2 value of 0.963. The settings of the GBR model were refined using grid search (GS) hyperparameter optimization to find an optimal solution. This work employs evaluation measures such as R2, MAE, MAPE and MSE to find the best-fitted model. The results show that GS-GBR can enhance the performance of the original model from 96.3 to 99.4%. GS-optimized GBR predicts COVID-19 mortality rates better than other models, suggesting improvement in nutrition-related disease resistance predictions.

  • Preprint Article
  • 10.21203/rs.3.rs-5946945/v1
Hydro-environmental predictive management of sub-surface salinization in arid nearshore-coastal saline aquifer using deep learning and SHAP analysis
  • Mar 14, 2025
  • Research Square
  • Fahad Jibrin Abdu + 6 more

Groundwater (GW) management is vital in arid regions like Saudi Arabia, where agriculture heavily depends on this resource. Traditional GW monitoring and prediction methods often fall short of capturing the complex interactions and temporal dynamics of GW systems. This study introduces an innovative approach that integrates deep learning (DL) techniques with Shapley Additive Explanations (SHAP) to enhance GW predictive management in Saudi Arabia’s agricultural regions. SHAP analysis is used to interpret each feature’s influence on the model’s predictions, thereby improving the transparency and understanding of the models’ decision-making processes. Six different data-driven models, including Hammerstein-Wiener (HW), Random Forest (RF), Artificial Neural Networks (ANNs), eXtreme Gradient Boosting (XGBoost), Convolutional Neural Networks (CNNs), and Long Short-Term Memory (LSTM), were utilized to predict GW salinity based on electrical conductivity (EC). The calibration results suggest that the RF model exhibits the highest Determination Coefficient (DC) of 0.9903 and Nash-Sutcliffe Efficiency (NSE) of 0.9899, indicating its superior predictive accuracy, followed closely by the LSTM model with a DC of 0.9835 and NSE of 0.9827. During the validation phase, the LSTM model demonstrated superior performance with the lowest Mean Absolute Error (MAE) of 13.9547 and Mean Absolute Percentage Error (MAPE) of 0.2813, indicating minimal deviation between predicted and observed EC values. The SHAP analysis revealed that chloride (Cl), with a mean SHAP value of ~1250, has the highest impact on EC, suggesting that variations in chloride concentration significantly influence GW salinity. Magnesium (Mg) follows closely with a mean SHAP value of ~1200, highlighting its role in water hardness and EC. Sodium (Na), with a mean SHAP value of ~600, has a moderate impact, contributing to overall salinity from natural processes and human activities. The proposed method has proven effective, with the LSTM algorithm offering an excellent and reliable tool for predicting EC. This advancement will result in more efficient planning and decision-making related to water resources.

  • Research Article
  • 10.3389/fmed.2025.1728645
Length of postoperative stay prediction in elderly patients with hip fractures based on machine learning
  • Jan 14, 2026
  • Frontiers in Medicine
  • Yanli Hu + 5 more

Background: Length of postoperative stay (LOPS) is an important indicator for resource allocation and clinical management in elderly patients with hip fractures. However, previous studies have mostly dichotomized this continuous variable to determine whether it is prolonged, a practice that inherently reduces information and introduces limitations. This study aimed to develop and validate a machine learning (ML) model to accurately predict the specific LOPS in elderly patients with hip fractures. Methods: This retrospective cohort study included electronic health records (EHRs) of elderly patients with hip fractures admitted to Yichang Central People’s Hospital from January 2016 to December 2022, with a total of 734 patients. Variables commonly measured preoperatively were extracted based on a review of previous studies, and features were selected using Pearson correlation coefficients combined with LASSO regression to construct a backpropagation neural network (BP-NN) model. For comparative evaluation, support vector machine (SVM) and random forest (RF) regression models were developed under the same dataset split (8:2), feature set, and hyperparameter optimization strategy. Model performance was assessed by comparing predicted values versus actual LOPS and calculating root mean square error (RMSE), mean absolute error (MAE), mean absolute percentage error (MAPE), and error thresholds (20%, 30%). The feature importance of the BP-NN model was analyzed via SHapley Additive exPlanations (SHAP) values. Results: Among 734 elderly patients with hip fractures, 503 (68.53%) were female, with an average LOPS of 17.42 ± 3.77 days. Femoral neck fracture (59.26%) and hemiarthroplasty (41.96%) were the most common fracture type and surgical type, respectively. Pearson correlation analysis and LASSO regression showed that age, age-adjusted Charlson comorbidity index (ACCI), and surgical type were the predictors of LOPS. Further sensitivity analysis adjusting for confounding factors revealed that the very old elderly group (aged 90 years or above) had the longest LOPS (15.84 ± 0.15 days vs. 17.85 ± 0.14 days vs. 21.99 ± 0.66 days), with no statistically significant difference in LOPS between surgical type subgroups (P > 0.05). The predicted values of the BP-NN were consistent with the trend of actual LOPS (R2 = 0.83), with the vast majority of prediction results falling within the 30% clinically acceptable error threshold. Its RMSE, MAE, and MAPE were 1.23 days, 1.57 days, and 7.69%, respectively. SHAP analysis revealed that ACCI and age were the main factors influencing LOPS. Conclusion: The BP-NN model, enhanced by multimethod feature selection, rigorous parameter tuning, and SHAP-based interpretability, provides early and accurate LOPS prediction for elderly hip fracture patients. It can be used as a tool to assist in clinical decision-making, resource planning, and discharge preparation, without increasing the clinical burden. Future external validation across multiple centers is needed to confirm generalizability.

  • Research Article
  • 10.1016/j.mtcomm.2025.114562
Application of machine learning in predicting corrosion inhibition capacity of Spinacia oleracea leaf extract on copper
  • Jan 1, 2026
  • Materials Today Communications
  • Omotayo Sanni + 4 more
