Biogas yield in anaerobic digestion (AD) depends on continuous, complex biological reactions that are influenced by multiple interacting factors. Traditional linear models fail to quantify the interactive effects of these factors on AD performance. To further explore the relationship between target variables and AD performance, this study developed four machine learning models to predict biogas yield while accounting for interactions among the factors. The highest prediction accuracy was achieved when the bacterial genera dataset was combined with environmental factors as model inputs. The random forest model exhibited the highest accuracy, with a testing coefficient of determination of 0.9879. Of the two types of input features, bacterial genera accounted for 89.9 % of the impact on biogas yield, followed by environmental factors. The results identified Keratinibaculum and Acetomicrobium as critical bacteria. Keeping volatile fatty acids below 2000 mg/L and improving the stirring system in the AD process were recommended to maximize biogas yield.
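
The two quantities reported above, a testing coefficient of determination and a per-group share of feature impact, can be illustrated with a minimal sketch. It assumes the standard definition of the coefficient of determination (one minus the ratio of residual to total sum of squares) and assumes that group shares are obtained by summing per-feature importance scores within each group and normalising; the abstract does not state how the 89.9 % figure was derived, so this is only one plausible reading. The function names, feature names, and numbers are hypothetical and are not taken from the study.

```python
from collections import defaultdict


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def group_importance_share(importances, groups):
    """Sum per-feature importances by group and normalise to fractions."""
    totals = defaultdict(float)
    for feature, value in importances.items():
        totals[groups[feature]] += value
    grand_total = sum(totals.values())
    return {group: value / grand_total for group, value in totals.items()}


# Toy numbers for illustration only; they are not data from the study.
y_true = [10.0, 12.0, 14.0, 16.0]   # hypothetical measured biogas yields
y_pred = [10.5, 11.5, 14.5, 15.5]   # hypothetical model predictions
print(r_squared(y_true, y_pred))

# Hypothetical importance scores, e.g. as a tree ensemble might report them.
importances = {"genus_a": 0.5, "genus_b": 0.25, "vfa": 0.125, "stirring": 0.125}
groups = {
    "genus_a": "bacterial",
    "genus_b": "bacterial",
    "vfa": "environmental",
    "stirring": "environmental",
}
print(group_importance_share(importances, groups))
```

In practice the importance scores would come from a fitted model rather than being typed in by hand, and the coefficient of determination would be computed on held-out test samples only.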