In the present study, a new approach that couples interpolation methods with computation-based techniques (data-mining algorithms and an optimization algorithm) is introduced for modeling and optimizing the removal of Reactive Orange 7 (RO7) dye from synthetic wastewater. To this end, four significant factors (pH, electrolyte concentration, current density, and electrolysis time) are considered as input variables. Modeling of RO7 removal is implemented using eight data-mining algorithms: multivariate linear regression (MLR), ridge regression (RR), multivariate nonlinear regression (MNLR), artificial neural network (ANN), classification and regression tree (CART), k-nearest neighbor (KNN), random forest (RF), and support vector machine (SVM). These algorithms require a large data set to produce reliable results. However, generating a large number of experimental data is costly and time-consuming. Hence, the interpolation methods of kriging (KRG) and inverse distance weighting (IDW) are applied to generate additional data; KRG is more accurate than IDW, improving the MAE, RMSE, and R values by 47.080, 36.914, and 1.77%, respectively. The data-mining algorithms are then used to model the decolorization efficiency (DE) based on the original data and on the new data from KRG. Using the new data significantly increases the accuracy of DE modeling (by 94.47, 96.43, 1.52, and 2.77% for MAE, RMSE, R, and R², respectively). SVM shows the highest accuracy of all the data-mining algorithms (improvements of 97.13, 98.30, and 14.42% in MAE, RMSE, and R² values, respectively). Another challenge in the removal of RO7 from synthetic wastewater is predicting the maximum removal and the corresponding optimal input variables. For this purpose, a hybrid of SVM and the whale optimization algorithm (WOA) is employed.
Finally, SVM-WOA predicts a maximum DE of 91% at optimal values of 4.2, 1.5 g/L, 4.2 mA/cm², and 18 min for pH, electrolyte concentration (C), current density (I), and electrolysis time, respectively. Given its high performance in modeling the removal process and predicting its optimal conditions, this approach can be suggested for the removal of other pollutants from wastewater when the experimental data set is limited.
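
To illustrate the data-generation step, the following is a minimal sketch of inverse distance weighting in plain Python. It is not the study's implementation: the function name `idw_predict`, the power parameter of 2, the use of unscaled Euclidean distance, and the sample rows are all assumptions made for illustration, and the sample values are hypothetical rather than measured data.

```python
import math


def idw_predict(points, values, query, power=2.0):
    """Estimate a value at `query` as a distance-weighted mean of known values.

    points: list of coordinate tuples (e.g. pH, C, I, time); hypothetical layout.
    values: known responses (e.g. DE in %) at those points.
    power:  distance exponent; 2.0 is an assumed, commonly used default.
    """
    weighted = []
    for point, value in zip(points, values):
        distance = math.dist(point, query)
        if distance == 0.0:
            # The query coincides with a known point: return its value directly.
            return value
        weighted.append((1.0 / distance ** power, value))
    total_weight = sum(w for w, _ in weighted)
    return sum(w * v for w, v in weighted) / total_weight


# Hypothetical rows: (pH, C in g/L, I in mA/cm^2, time in min) -> DE in %.
known_points = [(4.0, 1.0, 4.0, 15.0), (5.0, 2.0, 5.0, 20.0), (6.0, 1.5, 3.0, 10.0)]
known_de = [85.0, 80.0, 70.0]
estimate = idw_predict(known_points, known_de, (4.5, 1.5, 4.5, 17.0))
print(f"Interpolated DE: {estimate:.2f}%")
```

In practice the four inputs have different units and ranges, so they would normally be scaled before distances are computed; the sketch omits that step for brevity.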