Ultraviolet-visible (UV–Vis) absorption spectroscopy for in situ water quality sensing has garnered increasing attention. However, the selection of characteristic wavelengths for water quality indicators has been underexplored in existing studies, resulting in surrogate monitoring models with low accuracy and high complexity. Using field data from the Maozhou River in Shenzhen, this study investigated seven characteristic wavelength optimisation algorithms and five machine learning models for surrogate monitoring of five water quality indicators: TOC, BOD₅, COD, TN, and NO₃-N. The surrogate model based on wavelength selection improved accuracy by 134.8%, 52.5%, and 13.5% compared with the single-wavelength method, the PCA method, and the full-spectrum method, respectively. Competitive adaptive reweighted sampling (CARS) for wavelength selection, combined with ridge regression as the surrogate monitoring model, achieved the best performance, with coefficients of determination (R²) of 0.80, 0.64, 0.82, 0.97, and 0.96 for the five indicators, respectively. The results show that for watersheds with relatively stable water chemistry, overly complex nonlinear models are unnecessary, and a regression model with characteristic wavelength selection can achieve good prediction results. This study provides detailed technical information on spectral surrogate monitoring of river water quality and offers a practical reference.
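For readers less familiar with the two kinds of figures quoted above, the following minimal Python sketch shows how a coefficient of determination and a relative improvement percentage are conventionally computed. It is illustrative only: the abstract does not state which accuracy metric underlies the reported percentage gains, so applying `relative_improvement` to R² values is an assumption, and the function names and all numbers in the example are hypothetical rather than taken from the study.

```python
from statistics import fmean


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    mean_obs = fmean(observed)
    # Residual sum of squares: error left after the model's predictions.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: spread of the observations around their mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; R2 is undefined")
    return 1.0 - ss_res / ss_tot


def relative_improvement(new_score, baseline_score):
    """Percentage gain of new_score over baseline_score."""
    if baseline_score == 0:
        raise ValueError("baseline score must be non-zero")
    return (new_score - baseline_score) / abs(baseline_score) * 100.0


# Hypothetical laboratory values and surrogate-model predictions (mg/L).
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [2.5, 3.5, 6.5, 7.5]

r2 = r_squared(observed, predicted)

# Hypothetical scores: a model at 0.8 versus a baseline at 0.4.
gain = relative_improvement(0.8, 0.4)

print(f"R2 = {r2:.2f}, relative improvement = {gain:.1f}%")
```

Under this definition a gain above 100% means the new score is more than double the baseline, which is how a figure such as 134.8% against a single-wavelength baseline would be read if the percentages are indeed relative gains.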