Machine learning (ML) models have gained importance in predicting the properties of recycled aggregate concrete (RAC), with purported advantages over conventional empirical and statistical techniques. However, whether ML models are externally generalizable for predicting RAC properties remains unanswered. This study addresses this gap by developing a systematic experimental framework for evaluating the external generalizability of ML models that predict the compressive strength of RAC. Through a literature review, the authors compiled a primary dataset of 414 data points and sourced a secondary dataset of 330 data points from a previous paper. In Phase 1, prominent ML models such as Random Forest (RF) and Extreme Gradient Boosting (XGB) were evaluated for prediction accuracy on both the primary and secondary datasets. On the testing sets, XGB achieved a coefficient of determination (R²) as high as 0.76 for the primary dataset and 0.82 for the secondary dataset. However, when the best-performing models of Phase 1 were trained and tested with data sourced from different datasets in varying combinations, their performance deteriorated significantly (R² < 0.25), demonstrating that the ML models are not externally generalizable. The results highlight a trade-off between an ML model's prediction accuracy within a given dataset and its external generalizability. The study also reveals that complex ML models, such as XGB, may overfit specific data, which reduces their generalizability. These findings call for the intelligent use of ML tools to identify nuanced hypotheses and promote rigorous science, rather than a sole focus on improving the accuracy of ML models. The study emphasizes that future work should take the generalizability of ML models into account.
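The contrast between within-dataset accuracy and external generalizability can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic, the single input feature (a water-cement ratio) and the helper names `make_dataset`, `fit_line`, and `split` are hypothetical, and a one-variable least-squares line stands in for the RF and XGB models used in the study. Only the dataset sizes (414 and 330) are borrowed from the abstract. The sketch trains on part of one dataset, then compares R² on that dataset's held-out portion with R² on a second dataset that follows a different underlying relationship.

```python
import random
from statistics import mean


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x (a simple stand-in, not RF or XGB)."""
    x_bar, y_bar = mean(xs), mean(ys)
    b = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    a = y_bar - b * x_bar
    return lambda x: a + b * x


def make_dataset(n, slope, intercept, seed):
    """Synthetic (water-cement ratio, strength) pairs; purely illustrative."""
    rng = random.Random(seed)
    xs = [rng.uniform(0.3, 0.7) for _ in range(n)]
    ys = [intercept + slope * x + rng.gauss(0, 2.0) for x in xs]
    return xs, ys


def split(xs, ys, test_fraction=0.2):
    """Hold out the last test_fraction of the (already random) samples."""
    cut = int(len(xs) * (1 - test_fraction))
    return (xs[:cut], ys[:cut]), (xs[cut:], ys[cut:])


# Two synthetic "sources" that follow different underlying relationships.
primary = make_dataset(414, slope=-60.0, intercept=70.0, seed=0)
secondary = make_dataset(330, slope=-20.0, intercept=45.0, seed=1)

# Within-dataset evaluation: train and test on the primary dataset.
(train_x, train_y), (test_x, test_y) = split(*primary)
model = fit_line(train_x, train_y)
r2_within = r2_score(test_y, [model(x) for x in test_x])

# External evaluation: same model, tested on the secondary dataset.
r2_external = r2_score(secondary[1], [model(x) for x in secondary[0]])

print(f"within-dataset R2: {r2_within:.2f}")
print(f"cross-dataset R2:  {r2_external:.2f}")
```

In this toy setup the cross-dataset score falls well below the within-dataset score because the two sources differ systematically. The study reports the same qualitative pattern, but the numbers printed here are artifacts of the synthetic data and say nothing about the study's own results.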