Background
As global water pollution grows more severe, accurate models for predicting pollutant concentrations are critical for effective environmental management. However, traditional methods often predict pollutant concentrations poorly when data samples are limited, and they do not adequately address data noise. This study focuses on predicting total phosphorus (TP) concentrations in the Yangtze River Basin by integrating data augmentation and denoising methods with spectral technology and deep learning, using water samples collected from Wuhan to Anhui, China.

Method
The study uses an improved Conditional Generative Adversarial Network (CGAN) for data augmentation, increasing dataset diversity and training effectiveness. Adaptive threshold wavelet denoising is applied to reduce noise and improve data quality. A Convolutional Neural Network (CNN) with a coordinate attention (CA) mechanism is used to extract the key spectral features linked to TP concentration.

Significant Findings
This study introduces an approach that combines CGAN-based data augmentation, adaptive threshold wavelet denoising, and a CNN model incorporating a CA mechanism. The proposed model outperforms traditional methods, achieving R² = 0.9805, RMSE = 0.0019, and MAE = 0.0009. The method is particularly effective in scenarios with limited data samples.
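The abstract reports R², RMSE, and MAE but does not say how they were computed. As a minimal sketch, assuming the standard definitions of these metrics, they could be evaluated as follows. The function name `regression_metrics` and the sample values are hypothetical and are not taken from the study.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (R^2, RMSE, MAE) for paired measured and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]

    # Residual sum of squares and total sum of squares around the mean.
    ss_res = sum(r * r for r in residuals)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all measured values are identical")

    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(r) for r in residuals) / n
    return r2, rmse, mae


# Hypothetical TP concentrations, for illustration only (not data from the study).
measured = [0.10, 0.12, 0.15, 0.20]
predicted = [0.11, 0.12, 0.14, 0.21]
r2, rmse, mae = regression_metrics(measured, predicted)
print(f"R2={r2:.4f} RMSE={rmse:.4f} MAE={mae:.4f}")
```

RMSE and MAE are in the same units as the measured quantity, so values such as those reported above are only meaningful relative to the range of TP concentrations in the dataset; R² is unitless.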