Rapid urbanization in China since 1980 has intensified environmental pressure from non-point source pollution (NPSP) and raised wide public concern. Excessive quantities of nitrogen (N) and phosphorus (P) are a significant source of aquatic pollution, despite their roles as essential nutrients for aquatic life. In this study, we present a new framework that uses random forest (RF), a powerful machine learning algorithm, driven by geo-datasets to estimate and map the concentrations of total nitrogen (TN) and total phosphorus (TP) at high spatial resolution for the Wen-Rui Tang River (WRTR) watershed, a typical urban-rural transitional area in the eastern coastal region of China. A comprehensive GIS database of 26 environmental variables built in-house was used to construct the predictive models of TN and TP in open waters across the watershed. The performance of the RF regression models was evaluated against in-situ measurements, and the results indicated that the models can accurately predict the spatiotemporal distribution of N and P concentrations in rivers. Characterizing the variable importance measures of the calibrated RF regression models identified the most significant variables affecting N and P contamination in open waters across the urban-rural transitional area: aquaculture, direct domestic sewage, industrial wastewater discharges, and changing meteorological variables. In addition, the maps of TN and TP concentrations along the continuous river network at high spatiotemporal resolution (daily, 1 km × 1 km) were informative. The results provide valuable data to stakeholders for managing water quality and controlling pollution in similar regions with rapid urbanization and a lack of water quality monitoring datasets.
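
The workflow summarized above (fit an RF regression model on environmental predictors, evaluate it against held-out measurements, then rank the predictors by importance) can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes scikit-learn's `RandomForestRegressor`, and the data are synthetic stand-ins, with 26 random predictors and a hypothetical response driven by the first two columns. The sample size, hyperparameters, and train/test split are likewise assumptions chosen only for the example.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for the geo-dataset: 500 samples x 26 predictors.
# (The real predictors and TN/TP measurements are not reproduced here.)
n_samples, n_features = 500, 26
X = rng.normal(size=(n_samples, n_features))

# Hypothetical response: depends mainly on predictors 0 and 1, plus noise.
y = 3.0 * X[:, 0] + 2.0 * X[:, 1] + rng.normal(scale=0.5, size=n_samples)

# Hold out 30% of the samples to play the role of in-situ validation data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Fit the random forest regression model (hyperparameters are assumptions).
model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

# Evaluate predictions against the held-out measurements.
r2 = r2_score(y_test, model.predict(X_test))

# Rank predictors by impurity-based importance, most important first.
importances = model.feature_importances_
ranking = np.argsort(importances)[::-1]
top_two = set(ranking[:2].tolist())

print(f"held-out R^2: {r2:.3f}")
print("five most important predictor indices:", ranking[:5].tolist())
```

In a real application, the rows of `X` would be the values of the environmental variables at each monitoring site and date, and `y` the measured TN or TP concentration. Predicting on a full grid of predictor values would then yield the concentration maps.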