Urban reservoirs are important for drinking water services and urban living. However, potentially toxic cyanobacterial blooms occur frequently because of anthropogenic pollution and can threaten the urban water supply. Conveniently, cyanobacteria can be monitored by remote sensing approaches that exploit the spectral features of C-Phycocyanin (PC). Furthermore, methods leveraging Machine Learning Algorithms (MLA) for PC estimation from hyperspectral data have shown the potential to estimate PC more accurately, even at low concentrations. Since relatively few methodologies for PC retrieval in tropical environments have been developed or validated, this research evaluated PRISMA hyperspectral data processed with three MLA (Random Forest, Extreme Gradient Boost, and Support Vector Machines) to estimate PC concentrations in the Billings reservoir, Brazil. The same MLA were used to generate PC models from simulated Worldview-3 and Landsat-8/OLI data to assess the potential gain of using hyperspectral over multispectral data. A PRISMA image was processed with three atmospheric correction methods and validated with co-located in-situ data; the best atmospherically corrected product was then used to generate synthetic Landsat-8/OLI and Worldview-3 images. The PC models were calibrated and validated through Monte Carlo simulation using field radiometric and biological data (Chlorophyll-a, PC, and phytoplankton taxonomy) collected in eight field campaigns (N = 115). The PRISMA and the synthetic multispectral images were used for a second round of model validation using co-located PC measurements (match-up window of ±4 h). The global PC Mixture Density Network was also applied to the PRISMA data, and its estimates were compared with those of the other MLA. The results showed that the standard PRISMA surface reflectance product provided the best atmospheric correction (MAE < 20% for the 500–700 nm bands), while the errors for ACOLITE and 6SV were two to more than ten times higher.
Cyanobacteria species were abundant in 96% of the taxonomic samples, even though relatively low PC concentrations were found (PC ranged from 0 to 301.81 μg/L, with a median of 2.9 μg/L). The global Mixture Density Network sharply overestimated PC (MAE = 280% and Bias = 280%), potentially due to the Billings reservoir’s low PC:Chlorophyll-a ratio relative to the original training dataset. PRISMA/Random Forest (MAE = 45%) achieved the lowest error for orbital PC estimation, while Extreme Gradient Boost outperformed the other MLA using Worldview-3 (MAE = 49%) and Landsat-8 (MAE = 74%) synthetic imagery. Therefore, the results suggest that hyperspectral and multispectral orbital data combined with MLA are feasible for monitoring PC, even in waters with low PC concentrations and reduced PC:Chlorophyll-a ratios.
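The abstract does not state how the synthetic Landsat-8/OLI and Worldview-3 bands were derived from the PRISMA spectra. One common approach, sketched below purely as an illustration, is to take the mean of the hyperspectral reflectance weighted by the spectral response function (SRF) of each multispectral band. The function names, the boxcar SRF, the band limits, and the spectrum values are all hypothetical; in practice the SRFs published for each sensor would be used.

```python
from typing import List, Sequence


def boxcar_srf(wavelengths_nm: Sequence[float], lo_nm: float, hi_nm: float) -> List[float]:
    """Hypothetical boxcar response: 1 inside [lo_nm, hi_nm], 0 outside.

    Real sensors have smooth, tabulated SRFs; this is only a stand-in.
    """
    return [1.0 if lo_nm <= w <= hi_nm else 0.0 for w in wavelengths_nm]


def simulate_band(reflectance: Sequence[float], srf: Sequence[float]) -> float:
    """SRF-weighted mean of a hyperspectral reflectance spectrum.

    Both sequences must be sampled at the same hyperspectral band centres.
    """
    if len(reflectance) != len(srf):
        raise ValueError("reflectance and srf must have the same length")
    weight_sum = sum(srf)
    if weight_sum <= 0:
        raise ValueError("srf has no positive weight over the sampled wavelengths")
    return sum(r * w for r, w in zip(reflectance, srf)) / weight_sum


# Illustrative inputs only: band centres every 10 nm and a made-up spectrum.
wavelengths = list(range(400, 901, 10))
spectrum = [w / 1000.0 for w in wavelengths]

# Hypothetical "red" band limits, not the official limits of any sensor.
red_srf = boxcar_srf(wavelengths, 640, 670)
red_reflectance = simulate_band(spectrum, red_srf)
```

Applying `simulate_band` once per multispectral band and per pixel would yield a synthetic multispectral image from a hyperspectral one, under the assumptions stated above.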