Groundwater, a pivotal water resource in many regions worldwide, faces severe nitrate pollution. Traditional methods for studying groundwater nitrate contamination often struggle to capture the complexity of the groundwater environment, particularly when the data are highly variable and nonlinear. Machine learning (ML) offers an alternative approach to simulating groundwater dynamics. In this study, six ML algorithms were used to model shallow groundwater nitrate concentrations in the Shaying River Basin. Each model was assessed by the agreement between observed and predicted groundwater nitrate levels, using the coefficient of determination (R²), mean absolute error (MAE), and root mean square error (RMSE). The best-performing model was then used to identify the principal environmental factors influencing NO3-N concentrations. Among the six models, the XGB algorithm, known for its capacity to handle extreme values, performed best (R² = 0.773, MAE = 7.625, RMSE = 11.92). An analysis of groundwater NO3-N across the major cities identified Fuyang city as the most heavily contaminated, with the contamination attributed to potential sources such as domestic sewage and agricultural activities (feature importance of Cl- = 78.64%). Conversely, Zhengzhou city was the least polluted, with notable influences from K+ and NO2- (feature importance = 52.06% and 18.41%, respectively), indicating a more reducing environment than in the other cities. In summary, this study explores a methodology for combining diverse environmental variables in the investigation of groundwater contamination.
These insights can inform the management and mitigation of nitrate contamination in the Shaying River Basin and offer a demonstration for similar work in analogous regions.

PRACTITIONER POINTS:
- Six machine learning models were used to simulate groundwater nitrate contamination.
- The XGB model outperformed the other models in predicting groundwater nitrate pollution.
- The relative importance of environmental variables was identified using the XGB model.
- The impact of the main environmental variables on groundwater nitrate was discussed.
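
For readers unfamiliar with the three evaluation metrics named above, the following is a minimal Python sketch of how R², MAE, and RMSE are commonly computed from paired observed and predicted concentrations. It is not the authors' code: the function name `regression_metrics` and the sample values are hypothetical, and R² is assumed here to be the usual 1 - SS_res/SS_tot form, which may differ from the definition used in the study.

```python
import math


def regression_metrics(observed, predicted):
    """Return R2, MAE, and RMSE for paired observed/predicted values.

    Hypothetical helper for illustration only; assumes R2 = 1 - SS_res / SS_tot.
    """
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("observed and predicted must be non-empty and equal in length")

    n = len(observed)
    residuals = [o - p for o, p in zip(observed, predicted)]

    # Mean absolute error: average magnitude of the residuals.
    mae = sum(abs(r) for r in residuals) / n

    # Root mean square error: penalizes large residuals more heavily than MAE.
    ss_res = sum(r * r for r in residuals)
    rmse = math.sqrt(ss_res / n)

    # Coefficient of determination: share of observed variance explained.
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    return {"r2": r2, "mae": mae, "rmse": rmse}


if __name__ == "__main__":
    # Made-up values, not data from the study.
    obs = [2.0, 4.0, 6.0, 8.0]
    pred = [3.0, 3.0, 7.0, 7.0]
    print(regression_metrics(obs, pred))
```

Because RMSE squares the residuals before averaging, it is never smaller than MAE on the same data; a large gap between the two suggests that a few samples are predicted much worse than the rest.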