Abstract

Modeling of anaerobic digestion (AD) is crucial for understanding process dynamics and improving digester performance. It is an essential yet difficult task because of the complex and largely unknown interactions within the system. Well-developed data mining technologies, such as machine learning (ML), together with microbial gene sequencing techniques, are promising tools for overcoming these challenges. In this study, we investigated the feasibility of 6 ML algorithms for predicting methane yield from genomic data and the corresponding operational parameters reported by 8 research groups. For classification models, random forest (RF) achieved an accuracy of 0.77 using operational parameters alone and 0.78 using genomic data at the bacterial phylum level alone. Combining operational parameters with genomic data improved the prediction accuracy to 0.82 (p < 0.05). For regression models, a neural network achieved a low root mean square error of 0.04 (relative root mean square error = 8.6%) using genomic data at the bacterial phylum level alone. Feature importance analysis by RF suggested that Chloroflexi, Actinobacteria, Proteobacteria, Fibrobacteres, and Spirochaeta were the 5 most important phyla, although their relative abundances ranged only from 0.1% to 3.1%. The important features identified could guide early warning and proactive management of microbial communities. This study demonstrated the promising application of ML techniques for predicting and controlling AD performance.
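The two regression metrics quoted above can be illustrated with a minimal sketch. The abstract does not state how the relative root mean square error was normalized; dividing by the mean observed value is an assumption made here, and the function names and sample values are hypothetical, not data from the study.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between paired observations and predictions."""
    if not observed or len(observed) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def relative_rmse(observed, predicted):
    """RMSE divided by the mean observed value (assumed normalization)."""
    mean_observed = sum(observed) / len(observed)
    return rmse(observed, predicted) / mean_observed


# Hypothetical methane yields, for illustration only (not from the study).
observed = [0.40, 0.50, 0.45, 0.55]
predicted = [0.42, 0.47, 0.45, 0.58]

print(f"RMSE = {rmse(observed, predicted):.4f}")
print(f"relative RMSE = {100 * relative_rmse(observed, predicted):.1f}%")
```

Other normalizations, such as dividing by the range of the observed values, are also in common use, so a relative error is only comparable across studies when the same convention is applied.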
