Abstract
Accurately predicting the characteristics of effluent discharged from wastewater treatment plants (WWTPs) is crucial for reducing sampling requirements, labor, costs, and environmental pollution. Machine learning (ML) techniques can be effective in achieving this goal, and various feature selection (FS) methods are employed to optimize ML-based models. This study investigates the impact of six FS methods (categorized as Wrapper, Filter, and Embedded methods) on the accuracy of three supervised ML algorithms in predicting total suspended solids (TSS) concentration in the effluent of a municipal wastewater treatment plant. Based on the features proposed by each FS method, five distinct scenarios were defined. Within each scenario, three ML algorithms were applied: artificial neural network-multilayer perceptron (ANN-MLP), K-nearest neighbors (KNN), and adaptive boosting (AdaBoost). The candidate features for predicting TSS concentration in the WWTP effluent were BOD5, COD, TSS, TN, and NH3 in the influent, and BOD5, COD, residual Cl2, NO3, TN, and NH4 in the effluent. To construct the models, the dataset was randomly divided into training and testing subsets, and K-fold cross-validation was employed to control overfitting and underfitting. The evaluation metrics were root mean squared error (RMSE), mean absolute error (MAE), and the coefficient of determination (R2). The most efficient scenario was Scenario IV, which used the Sequential Backward Selection FS method. The features selected by this method were CODe, BOD5e, BOD5i, and TNi (effluent COD and BOD5, influent BOD5 and TN). Furthermore, the ANN-MLP algorithm demonstrated the best performance, achieving the highest R2 value. This algorithm exhibited acceptable performance in both the training and testing subsets (R2 = 0.78 and R2 = 0.8, respectively).
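For readers unfamiliar with the three evaluation metrics named in the abstract, the following is a minimal sketch of how RMSE, MAE, and R2 are commonly computed. It is not the authors' code: the function names are illustrative, R2 is taken here to be the coefficient of determination (an assumption about the study's definition), and the observed and predicted TSS values are made up solely to exercise the formulas.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error: square root of the mean squared residual."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def mae(y_true, y_pred):
    """Mean absolute error: mean of the absolute residuals."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Assumes y_true is not constant, otherwise SS_tot is zero.
    """
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical effluent TSS values in mg/L, invented for illustration only;
# they are not taken from the study's dataset.
observed = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 12.0, 13.0, 16.0]

print(f"RMSE = {rmse(observed, predicted):.3f}")
print(f"MAE  = {mae(observed, predicted):.3f}")
print(f"R2   = {r2(observed, predicted):.3f}")
```

RMSE and MAE are expressed in the units of the target (here mg/L), with RMSE penalizing large errors more heavily, while R2 is dimensionless and compares the model against a baseline that always predicts the mean.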