Fly ash-based alkali-activated concrete (AAC) is known for its strong mechanical performance and sustainability, making it an attractive alternative to traditional Portland cement concrete. Despite these advantages, the broad compositional range of AACs makes it difficult to tailor material properties precisely. Machine learning (ML) can accelerate materials design by predicting mechanical properties from compositional variations. Effective ML model development, however, depends on a comprehensive, high-quality dataset. Previous studies have often relied on literature-derived datasets, which typically contain outliers, noise, and missing values that can bias predictions. Limited dataset sizes can further undermine model robustness, and traditional ML methods applied to AACs tend to lack interpretability. To address these issues, this paper applies several data imputation methods and uses Generative Adversarial Networks (GANs) for data augmentation, effectively doubling the dataset size. ML algorithms, namely Random Forest (RF), Extreme Gradient Boosting (XGBoost), and Neural Networks (NNs), are then used to predict compressive strength. The NN model, particularly when combined with k-nearest neighbors (kNN) imputation (k = 5), achieves higher predictive accuracy than the RF and XGBoost models. SHAP (SHapley Additive exPlanations) analysis identifies key determinants of compressive strength, including water content, SiO2, and curing conditions. Visualizations such as SHAP violin plots and river flow plots further illustrate feature contributions and property distributions. Overall, this study provides a robust framework for exploring composition-strength relationships in AACs, advancing the design of these environmentally friendly materials.
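To make the kNN imputation step concrete, the following is a minimal sketch in plain Python and should not be read as the authors' implementation. The function names `nan_euclidean` and `knn_impute`, the three-column feature layout (water content, SiO2, curing temperature), and all numeric values are hypothetical and chosen only for illustration. A missing entry is filled with the mean of that feature over the k nearest rows in which it is observed, with distances computed only over features present in both rows.

```python
import math

def nan_euclidean(a, b):
    """Euclidean distance over features observed in both rows, rescaled
    to compensate for skipped features; inf if no feature is shared."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return math.inf
    sq = sum((x - y) ** 2 for x, y in shared)
    return math.sqrt(sq * len(a) / len(shared))

def knn_impute(rows, k=5):
    """Return a copy of rows with each None replaced by the mean of that
    feature over the k nearest rows in which the feature is observed."""
    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate donors: other rows where feature j is observed.
            donors = [
                (nan_euclidean(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            ]
            # Drop rows that share no observed feature with this row.
            donors = [d for d in donors if math.isfinite(d[0])]
            donors.sort(key=lambda d: d[0])
            nearest = donors[:k]
            if nearest:
                filled[i][j] = sum(v for _, v in nearest) / len(nearest)
    return filled

# Hypothetical mixes: [water content, SiO2 (%), curing temperature (deg C)]
mixes = [
    [0.30, 50.0, 60.0],
    [0.32, 52.0, 60.0],
    [0.31, None, 60.0],   # SiO2 missing
    [0.50, 40.0, 25.0],
]
imputed = knn_impute(mixes, k=2)
print(imputed[2])
```

This sketch leaves features unscaled, so columns with large numeric ranges dominate the distance; in practice the features would usually be standardized first. A library routine such as scikit-learn's `KNNImputer`, whose `n_neighbors` parameter defaults to 5, is a common alternative to hand-written code.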