Maintaining optimal dissolved oxygen levels is essential for aquatic ecosystems, yet industrial and domestic waste has contributed to a global decline in dissolved oxygen. Traditional measurement methods, such as oxygen meters and Winkler titration, are often costly or time-consuming. This study aims to improve the Root Mean Square Error (RMSE), Mean Absolute Error (MAE), and R² values for estimating dissolved oxygen levels. The method uses Multiple Linear Regression with various training and testing data splits, both before and after applying polynomial features. The model is then optimized using a stacking technique, with Random Forest Regressor and Gradient Boosting Regressor as base models. The results show that the best model was obtained using the stacking ensemble technique with a 90:10 data split and polynomial features, yielding an RMSE of 1.206, an MAE of 0.990, and an R² of 0.670. This model also meets the assumptions of linear regression, such as residual normality, homoscedasticity, and no autocorrelation of residuals. The study concludes that the stacking ensemble technique and the addition of polynomial features can improve the model's estimates of dissolved oxygen values. It also contributes an accessible user interface built with the Gradio framework, allowing users to estimate dissolved oxygen levels.
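As an illustration of the three evaluation measures named above (Root Mean Square Error, Mean Absolute Error, and R2), the following is a minimal sketch in plain Python. The function name `regression_metrics` and the sample values are hypothetical and chosen only for illustration; they are not the study's code or data.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (rmse, mae, r2) for two equal-length sequences of numbers."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    # Root Mean Square Error: square root of the mean squared residual.
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)

    # Mean Absolute Error: mean of the absolute residuals.
    mae = sum(abs(e) for e in errors) / n

    # R2: one minus the ratio of residual to total sum of squares.
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all observed values are equal")
    r2 = 1.0 - ss_res / ss_tot

    return rmse, mae, r2


# Made-up dissolved oxygen readings (illustrative only, not the study's data).
observed = [6.0, 7.0, 8.0, 9.0]
predicted = [6.5, 6.5, 8.5, 8.5]
rmse, mae, r2 = regression_metrics(observed, predicted)
```

Lower RMSE and MAE and a higher R2 indicate a better fit, which is the sense in which the study compares data splits and model variants.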