This study applies several machine learning (ML) models, namely artificial neural network (ANN), linear regression, XGBoost, random forest (RF), and support vector machine (SVM), to predict and optimize biogas yield from the organic fraction of municipal solid waste (MSW). A data set of six key input variables, including moisture content, C/N ratio, lignocellulose content, and MSW age, was processed and analyzed with these models. Further, exploratory data analysis (EDA) was performed, with steps including data preprocessing, feature selection, train-test-validation splitting, and model testing and evaluation. The XGBoost and RF regression models performed best, achieving R² values of 0.88 and 0.68 and root mean square error (RMSE) values of 305 and 496, respectively. Feature importance analysis was then performed to identify the relative significance of the input variables in biogas prediction. The SVM model indicated significant contributions from moisture content (20.86 %), cellulose (60.21 %), hemicellulose (34.91 %), lignin content (62.84 %), and MSW age (17.23 %) in predicting biogas production. These findings can serve as a resource for technocrats and researchers developing efficient ML-based decision-making tools for accurate prediction and process optimization.
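For readers unfamiliar with the two evaluation metrics reported above, the following is a minimal sketch of how R² and RMSE are conventionally computed from observed and predicted values. The function names and the sample numbers are hypothetical and chosen only for illustration; they are not taken from the study's data set or code.

```python
import math


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean square error, in the same units as the target variable."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


# Hypothetical observed and predicted biogas yields (illustrative only,
# not values from the study).
observed = [1000.0, 1500.0, 2000.0, 2500.0]
predicted = [1100.0, 1400.0, 2100.0, 2400.0]

print(f"R2   = {r2_score(observed, predicted):.3f}")
print(f"RMSE = {rmse(observed, predicted):.1f}")
```

An R² closer to 1 and a lower RMSE both indicate a closer fit, which is the sense in which the XGBoost and RF results are compared in the abstract.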