Abstract

Nutrient data from catchments discharging to receiving waters are monitored for catchment management. However, nutrient data are often sparse in time and space and respond non-linearly to environmental factors, making it difficult to analyse long- and short-term trends systematically and to construct nutrient budgets. To address these challenges, we developed a hybrid machine learning (ML) framework that first separated baseflow and quickflow from total flow, then generated data for missing nutrient species, and finally used the pre-generated nutrient data as additional variables in a final simulation of tributary water quality. Hybrid random forest (RF) and gradient boosting machine (GBM) models were employed and their performance compared with a linear model, a multivariate weighted regression model, and stand-alone RF and GBM models that did not pre-generate nutrient data. The six models were used to predict six different nutrients discharged from two study sites in Western Australia: Ellen Brook (small and ephemeral) and the Murray River (large and perennial). Our results showed that the hybrid RF and GBM models had significantly higher accuracy and lower prediction uncertainty for almost all nutrient species across the two sites. The pre-generated nutrient and hydrological data were highlighted as the most important components of the hybrid model. The model results also indicated different hydrological transport pathways for total nitrogen (TN) export from the two tributary catchments. We demonstrated that the hybrid model provides a flexible method for combining data of varied resolution and quality, and that it accurately predicts the response of surface water nutrient concentrations to hydrologic variability.
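The first step of the framework, splitting total flow into baseflow and quickflow, can be illustrated with a minimal sketch. The abstract does not state which separation method was used, so the one-parameter recursive digital filter (Lyne-Hollick form) below, the function name `separate_flow` and the default `alpha = 0.925` are assumptions chosen for illustration only.

```python
def separate_flow(total_flow, alpha=0.925):
    """Split total flow into (baseflow, quickflow) lists.

    Illustrative assumption: a single forward pass of a one-parameter
    recursive digital filter (Lyne-Hollick form). The source does not
    specify the separation method or the filter parameter.
    """
    if not total_flow:
        return [], []

    quick = [0.0]  # assume the record starts on pure baseflow
    for t in range(1, len(total_flow)):
        # High-pass filter: quickflow responds to changes in total flow
        q = alpha * quick[-1] + 0.5 * (1.0 + alpha) * (
            total_flow[t] - total_flow[t - 1]
        )
        # Clamp so that 0 <= quickflow <= total flow at every step
        q = min(max(q, 0.0), total_flow[t])
        quick.append(q)

    # Baseflow is whatever remains of the total flow
    base = [total - q for total, q in zip(total_flow, quick)]
    return base, quick


# Hypothetical daily flow series with one storm peak
flow = [1.0, 1.0, 5.0, 3.0, 1.0]
baseflow, quickflow = separate_flow(flow)
```

The two returned series could then serve as predictors, alongside rainfall and seasonal terms, in the intermediate models for missing nutrient species and in the final RF or GBM model.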

Highlights

  • Surface water nutrient concentrations have been significantly increased by human activities (Forio et al., 2015) due to urbanisation, waste discharges and agricultural intensification (Liu et al., 2012; Kaiser et al., 2013; Li et al., 2013)

  • To utilise all available nutrient data and assess the impact of different transport pathways on stream nutrient concentrations, we developed a hybrid machine learning framework for surface water nutrient concentrations (ML-SWAN) that first separated baseflow and quickflow from total flow and built intermediate models to generate missing nutrient species within the total nutrient pool, using relationships with baseflow, quickflow, rainfall and seasonal components

  • The scaled root mean squared error (RMSE) decreased progressively from the linear models (LMs) through WRTDS and stand-alone ML to hybrid ML for all nutrients except NH4, and the same pattern was found for the model efficiency coefficient (MEF) at both Ellen Brook and the Murray River (Fig. 3)
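The two skill scores named in the highlights can be sketched as follows. The source does not define how RMSE was scaled or which efficiency formula was used, so two assumptions are made here for illustration: RMSE is scaled by the population standard deviation of the observations, and MEF takes the Nash-Sutcliffe form. The function names are hypothetical.

```python
import math
import statistics


def rmse(obs, pred):
    """Root mean squared error between observations and predictions."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def scaled_rmse(obs, pred):
    """RMSE divided by the standard deviation of the observations.

    Assumption: the source does not say how RMSE was scaled.
    """
    return rmse(obs, pred) / statistics.pstdev(obs)


def mef(obs, pred):
    """Model efficiency in the Nash-Sutcliffe form (assumed).

    1 is a perfect fit; 0 means the model is no better than
    predicting the mean of the observations.
    """
    mean_obs = statistics.fmean(obs)
    residual = sum((o - p) ** 2 for o, p in zip(obs, pred))
    variance = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - residual / variance


# Hypothetical concentrations, not data from the study
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [2.0, 3.0, 4.0, 5.0]
print(rmse(observed, predicted), mef(observed, predicted))
```

Under these definitions a lower scaled RMSE and a higher MEF both indicate a better model, which is how a ranking of models such as the one in the highlight would be read.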

Introduction

Surface water nutrient concentrations have been significantly increased by human activities (Forio et al., 2015) due to urbanisation, waste discharges and agricultural intensification (Liu et al., 2012; Kaiser et al., 2013; Li et al., 2013). Increased nutrient concentrations and loads in streams alter the biogeochemical functioning and biological community structure of receiving estuaries (Jickells et al., 2014; Staehr et al., 2017), leading to an increased incidence of harmful algal blooms (Domingues et al., 2011), anoxia and hypoxia (Li et al., 2016; Testa et al., 2017) and reduced water availability (Heathwaite, 2010). Analysis of tributary water quality data over time is essential to compute incoming nutrient loads, support policy and plan remediation measures. Nutrients can derive from different sources (point or nonpoint) in the landscape and are transported to receiving waters through different water pathways subject to varied
