Ensemble Learning Techniques Research Articles

This paper proposes interpretable ensemble learning models for predicting the mechanical properties (bulk, shear, and Young's moduli) of ABX3 perovskite compounds, where A, B, and X denote the three elements that form the cubic three-dimensional framework of the perovskite. Three ensemble learning techniques are used: CatBoost, Random Forest, and XGBoost. To expand the feature space, first-principles density functional theory calculations were used to generate some of the input features, namely elastic constants, density, volume per atom, and ground-state energy per atom. To determine the ranking of the input features that influence the machine learning (ML) model decisions, we performed correlation analysis on the multi-dimensional input feature space, suppressed features with high collinearity, and selected features with limited correlation. We then trained the three ensemble learning techniques on the resulting feature vectors to predict the mechanical properties. Furthermore, we employed the Shapley Additive Explanations (SHAP) algorithm to analyse the decision-making rationale of the ensemble learning models. Performance was measured using error metrics and the coefficient of determination, R². The results show that XGBoost outperforms the other approaches when predicting the shear modulus or Young's modulus of the perovskite compounds, yielding the lowest errors and the highest R² value (0.97) in the testing phase. However, both CatBoost and Random Forest outperformed XGBoost in predicting the bulk modulus in the testing phase. The deficiency of XGBoost in predicting the bulk modulus can be ascribed to overfitting, which occurs when an ML model gives accurate predictions for training data but not for test data. The SHAP algorithm also provides insight into the order of feature importance (from highest to lowest). Additionally, we conducted a post-analysis using a holistic ranking to compare the relative SHAP feature importance across the examined ensemble learning techniques. Our findings indicate that the elastic constants are the most important input features influencing the predictions of the ensemble learning models.
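
The collinearity screening mentioned in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the greedy keep-or-drop rule, the 0.9 threshold, the function names, and the toy feature values are all made up for the example.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    denom = math.sqrt(var_x * var_y)
    # A constant column has no defined correlation; treat it as uncorrelated.
    return cov / denom if denom else 0.0


def drop_collinear(features, threshold=0.9):
    """Greedily keep features whose |r| with every kept feature is <= threshold.

    `features` maps a feature name to its list of values. The threshold is
    an assumed value, not one reported in the abstract.
    """
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Toy, made-up values for three hypothetical input features.
toy_features = {
    "c11": [1.0, 2.0, 3.0, 4.0, 5.0],
    "c12": [2.0, 4.0, 6.0, 8.0, 10.5],  # nearly proportional to c11
    "density": [5.0, 1.0, 4.0, 2.0, 3.0],
}

selected = drop_collinear(toy_features)
print(selected)
```

In this toy input, `c12` is dropped because it is almost perfectly correlated with `c11`, while `density` is kept. A greedy pass depends on feature order, so a real study might instead drop the member of each correlated pair that is less related to the target.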

Groundwater is a primary source of drinking water for billions of people worldwide. It plays a crucial role in irrigation, domestic, and industrial uses, and contributes significantly to drought resilience in various regions. However, excessive groundwater discharge has left many areas vulnerable to potable water shortages. Assessing groundwater potential zones (GWPZ) is therefore essential for implementing sustainable management practices that ensure the availability of groundwater for present and future generations. This study aims to delineate areas with high groundwater potential in the Bankura district of West Bengal using four machine learning methods: Random Forest (RF), Adaptive Boosting (AdaBoost), Extreme Gradient Boosting (XGBoost), and Voting Ensemble (VE). The models were trained on 161 data points, comprising 70% of the dataset, to identify significant correlations with the presence and absence of groundwater in the region. Among the methods, RF and XGBoost proved to be the most effective in mapping groundwater potential, suggesting their applicability in other regions with similar hydrogeological conditions. RF achieved a precision of 0.919, recall of 0.971, F1-score of 0.944, and accuracy of 0.943, indicating a strong capability to predict groundwater zones with few false positives and false negatives. AdaBoost matched these values on all four metrics (precision: 0.919, recall: 0.971, F1-score: 0.944, accuracy: 0.943). XGBoost slightly outperformed both, with higher precision (0.944), F1-score (0.958), and accuracy (0.957) at the same recall (0.971), suggesting a more refined model. The VE approach mirrored XGBoost's metrics (precision: 0.944, recall: 0.971, F1-score: 0.958, accuracy: 0.957), which indicates that combining the strengths of individual models leads to better predictions. Groundwater potentiality across the Bankura district varied significantly, with very low potentiality zones accounting for 41.81% and very high potentiality zones for 24.35%. The uncertainty in predictions ranged from 0.0 to 0.75 across the study area, reflecting the variability in groundwater availability and the need for targeted management strategies. In summary, this study highlights the critical need for assessing and managing groundwater resources effectively using advanced machine learning techniques. The findings provide a foundation for better groundwater management practices, ensuring sustainable use and conservation in Bankura district and beyond.
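
The four reported metrics all follow from a binary confusion matrix, and a voting ensemble can be as simple as a per-sample majority vote. The sketch below illustrates both. It rests on assumptions: the abstract does not report confusion-matrix counts, so the counts here are hypothetical values on an assumed 70-sample test set, chosen only because they reproduce the rounded figures quoted above. The abstract also does not say whether voting was hard or soft; hard voting is shown, and the model predictions are made up.

```python
from collections import Counter


def classification_metrics(tp, fp, fn, tn):
    """Precision, recall, F1, and accuracy from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {
        "precision": round(precision, 3),
        "recall": round(recall, 3),
        "f1": round(f1, 3),
        "accuracy": round(accuracy, 3),
    }


def majority_vote(predictions_per_model):
    """Hard voting: for each sample, return the label most models predicted.

    With an odd number of models and binary labels there are no ties.
    """
    return [
        Counter(votes).most_common(1)[0][0]
        for votes in zip(*predictions_per_model)
    ]


# Hypothetical counts (NOT from the study) consistent with the reported values.
ASSUMED_RF_COUNTS = {"tp": 34, "fp": 3, "fn": 1, "tn": 32}
ASSUMED_XGB_COUNTS = {"tp": 34, "fp": 2, "fn": 1, "tn": 33}

rf_metrics = classification_metrics(**ASSUMED_RF_COUNTS)
xgb_metrics = classification_metrics(**ASSUMED_XGB_COUNTS)
print(rf_metrics)
print(xgb_metrics)

# Made-up predictions from three models on four samples (1 = groundwater present).
ensemble_labels = majority_vote([
    [1, 0, 1, 1],
    [1, 1, 0, 1],
    [0, 0, 1, 1],
])
print(ensemble_labels)
```

Under these assumed counts, a single extra false positive is enough to account for the gap between the RF and XGBoost figures, which is worth keeping in mind when comparing models on a small test set.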

Articles published on Ensemble Learning Techniques

Ensemble learning approach for distinguishing human and computer-generated Arabic reviews

Screening of Key Transcripts from Expression Data Using Applied Artificial Intelligence for Cancer Prediction

Discovering the underground coal mining accident patterns in Spain from 2003 to 2021: Insights through machine learning techniques

A Comprehensive Review of SCADA-Based Wind Turbine Performance and Reliability Modeling with Machine Learning Approaches

Ensemble learning using Gompertz function for leukemia classification

Conventional Machine Learning and Ensemble Learning Techniques in Cardiovascular Disease Prediction and Analysis

Explainable Ensemble Learning Approaches for Predicting the Compression Index of Clays

Addressing data sparsity and cold-start challenges in recommender systems using advanced deep learning and self-supervised learning techniques

Cancer detection with various classification models: A comprehensive feature analysis using HMM to extract a nucleotide pattern

Interpretable machine learning methods to predict the mechanical properties of ABX3 perovskites

LSTM-Autoencoder Based Detection of Time-Series Noise Signals for Water Supply and Sewer Pipe Leakages

Global decline in microbial-derived carbon stocks with climate warming and its future projections

Improving Hate Speech Classification Through Ensemble Learning and Explainable AI Techniques

Attention-Driven Transfer Learning Model for Improved IoT Intrusion Detection

Application of bagging and boosting ensemble machine learning techniques for groundwater potential mapping in a drought-prone agriculture region of eastern India

Enhancing IoT network defense: advanced intrusion detection via ensemble learning techniques

An ensemble machine learning-based approach to predict thyroid disease using hybrid feature selection

Data analytics in ensemble learning for effective crop yield prediction

Comparative Analysis of Ensemble Learning Techniques for Purchase Prediction in Digital Promotion through Social Network Advertising

Environment Aspects and Daily Life-Threatening Risk Prediction for Improving Public Health Using Ensemble Learning Techniques
