In recent decades, phytoplankton proliferation and sediment input to rivers, especially urban rivers, have intensified under the combined pressure of climate change and human activities. Because rivers are generally narrow and current high-spatial-resolution satellites are limited by their band settings, bandwidth, and signal-to-noise ratio, unmanned aerial vehicles (UAVs), with their high spatiotemporal resolution, are a useful tool for river environmental monitoring and for assessing inversion uncertainty. In this study, UAV-based hyperspectral (X20P) and multispectral (P4M) images, along with Sentinel-2 MultiSpectral Instrument (MSI), Landsat-8 Operational Land Imager (OLI), and Landsat-9 OLI2 data, were used to assess the uncertainty in retrieving chlorophyll-a (Chla) and suspended sediment (SS) concentrations in rivers. Chla and SS models based on UAV and satellite data were constructed using stepwise multiple regression and typical Chla and SS retrieval algorithms, respectively, and the performance of these models was the focus of our research. In the Chla concentration inversion, the sensors ranked as follows: X20P > P4M > Landsat-9 OLI2 > Sentinel-2 MSI > Landsat-8 OLI. In the SS concentration inversion, the ranking was X20P > Sentinel-2 MSI > P4M > Landsat-9 OLI2 > Landsat-8 OLI. In addition, the uncertainty of high-spatial-resolution satellite retrievals was analyzed with the assistance of the UAV-based model. The results showed that narrow bandwidths and finely tuned band settings are more essential for Chla inversion than for SS inversion. The typical Chla retrieval algorithm, the normalized difference chlorophyll index (NDCI), is effective only when its two bands fall within certain ranges (band 1 from 684 to 724 nm and band 2 from 660 to 680 nm). Landsat-8 and Landsat-9 also lack some key band settings (e.g., the red-edge band at 700–710 nm), which severely limits their practical application to Chla retrieval.
In contrast, differences in band specifications among sensors have a relatively small impact on SS inversion; for example, the correlation between SS and R/B (a typical SS retrieval algorithm) constructed from each sensor ranged from 0.68 to 0.77. Chla monitoring also requires a higher spatial resolution than SS monitoring: accuracy decreased markedly when UAV images were resampled to 10 m and 30 m spatial resolution. Spatial resolution is less crucial for SS inversion: when P4M images at the original spatial resolution (< 30 cm; RMSE = 6.28 mg/L) were resampled to 10 m (RMSE = 5.85 mg/L) and 30 m (RMSE = 4.08 mg/L), the accuracy of the SS inversion increased. Our results highlight options for future monitoring of Chla and SS that exploit the synergy between UAVs and satellites to achieve more precise observations at greater spatial and temporal scales, which will benefit aquatic environment management and protection.
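
The band indices and the resampling comparison described above can be illustrated with a minimal Python sketch. It is not the authors' processing chain and rests on several assumptions: NDCI is taken in its common normalized-difference form with band 1 (684–724 nm) and band 2 (660–680 nm), R/B is read as a red-to-blue reflectance ratio, and coarser resolutions are simulated by simple block averaging, since the abstract does not state the resampling method. The function names `ndci`, `band_ratio`, `block_mean`, and `rmse` are hypothetical.

```python
import numpy as np


def ndci(band1, band2):
    """Normalized difference index: (band1 - band2) / (band1 + band2).

    Assumption: band1 is red-edge reflectance (684-724 nm) and band2 is
    red reflectance (660-680 nm). Pixels with a zero denominator become NaN.
    """
    band1 = np.asarray(band1, dtype=float)
    band2 = np.asarray(band2, dtype=float)
    denom = band1 + band2
    out = np.full(denom.shape, np.nan)
    np.divide(band1 - band2, denom, out=out, where=denom != 0)
    return out


def band_ratio(red, blue):
    """Simple band ratio, read here (as an assumption) as red / blue."""
    red = np.asarray(red, dtype=float)
    blue = np.asarray(blue, dtype=float)
    out = np.full(np.broadcast(red, blue).shape, np.nan)
    np.divide(red, blue, out=out, where=blue != 0)
    return out


def block_mean(image, factor):
    """Coarsen a 2-D image by averaging non-overlapping factor x factor blocks.

    Rows and columns that do not fill a complete block are trimmed.
    Block averaging is an assumed stand-in for the study's resampling.
    """
    image = np.asarray(image, dtype=float)
    rows = (image.shape[0] // factor) * factor
    cols = (image.shape[1] // factor) * factor
    trimmed = image[:rows, :cols]
    blocks = trimmed.reshape(rows // factor, factor, cols // factor, factor)
    return blocks.mean(axis=(1, 3))


def rmse(predicted, observed):
    """Root-mean-square error between predicted and observed values."""
    predicted = np.asarray(predicted, dtype=float)
    observed = np.asarray(observed, dtype=float)
    return float(np.sqrt(np.mean((predicted - observed) ** 2)))
```

In use, an index image would be computed at native resolution, coarsened with `block_mean` by a factor chosen from the native pixel size (for instance, roughly 33 for 30 cm pixels aggregated to about 10 m), and the concentrations predicted at each scale compared with in situ samples via `rmse`.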