Abstract
In predicting palm oil mill effluent (POME) degradation efficiency, previous developed quadratic model quantitatively evaluated the effects of O2 flowrate, TiO2 loadings and initial concentration of POME in labscale photocatalytic system, which however suffered from low generalization due to the overfitting behaviour. Evidently, high RMSE (131.61) and low R2 (−630.49) obtained indicates its insufficiency in describing POME degradation at unseen factor ranges, hence verified the fact of poor generalization. To overcome this issue, several models were developed via machine learning-assisted techniques, namely Gaussian Process Regression (GPR), Linear Regression (LR), Decision Tree (DT), Supported Vector Machine (SVM) and Regression Tree Ensemble (RTE), subsequently being assessed systematically. To achieve high generalization, all models were subjected to ‘train-all-test-all’ strategy, 5-fold and 10-fold cross validation. Specifically, GPR model was furnished with high accuracy in ‘train-all-test-all’ strategy, judging from its low RMSE (1.0394) and high R2 (0.9962), which however menaced by the risk of overfitting. In contrast, despite relatively poorer RMSE and R2 (1.7964 and 0.9886) obtained in 5-fold cross validation, GPR model was rendered with highest generalization, while sufficiently preserving its accuracy in development process. Besides, SVM and RTE models were also demonstrated promising R2 (0.9372 and 0.9208), which however shadowed by their high RMSEs (4.2174 and 4.7366). Furthermore, the extraordinary generalization of GPR model was coincidentally verified in 10-fold cross validation. The lowest RMSE (2.1624) and highest R2 (0.9835) obtained with feature number of 36 asserted its sufficiency in both generalization and accuracy prospect. Other models were all rendered with slight lower R2 (> 0.9), plausibly due to the higher RMSE (> 4.0). According to GPR model, optimized POME degradation (52.52%) can be obtained at 70 mL/min of O2, 70.0 g/L of TiO2 and 250 ppm of POME concentration, with only ∼3% error as compared to the actual data.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.