Abstract
The errors and uncertainties associated with gap-filling algorithms for water, carbon, and energy flux data have long been one of the main challenges for the global network of microclimatological tower sites that use the eddy covariance (EC) technique. To address these concerns and find more efficient gap-filling algorithms, we reviewed eight algorithms for estimating missing values of environmental drivers and nine algorithms for the three major fluxes typically found in EC time series. We then examined the algorithms' performance under different gap-filling scenarios using data from five EC towers during 2013. The objectives of this research were (a) to evaluate the impact of gap length on the performance of each algorithm and (b) to compare the performance of traditional and new gap-filling techniques for EC data, separately for the fluxes and for their corresponding meteorological drivers. The algorithms' performance was evaluated by generating nine gap windows of different lengths, ranging from 1 d to 365 d. In each scenario, a gap period was chosen randomly, and the data were removed from the dataset accordingly. After running each scenario, a variety of statistical metrics were used to evaluate the algorithms' performance. The algorithms showed different levels of sensitivity to gap length; the Prophet Forecast Model (FBP) was the most sensitive, whilst the performance of artificial neural networks (ANNs), for instance, varied much less with gap length. The algorithms' performance generally decreased with increasing gap length, yet the differences were not significant for windows smaller than 30 d. No significant differences between the algorithms were recognised for the meteorological and environmental drivers.
However, the linear algorithms showed slight superiority over the machine learning (ML) algorithms, except that the random forest (RF) algorithm performed better when estimating the ground heat flux (root mean square errors, RMSEs, of 28.91 and 33.92 for RF and classic linear regression, CLR, respectively). For the major fluxes, by contrast, the ML algorithms and marginal distribution sampling (MDS) outperformed the other algorithms. Even though ANNs, RF, and eXtreme Gradient Boost (XGB) showed comparable performance in gap-filling the major fluxes, RF provided more consistent results with slightly less bias than the other ML algorithms. The results indicated that no single algorithm outperforms the others in all situations, but RF is a potential alternative to the MDS and ANNs for flux gap-filling.
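The evaluation procedure described in the abstract (remove a randomly placed window of known length, fill it, and score the estimate against the withheld data) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the half-hourly sampling, the synthetic sine series, the function names, and the use of linear interpolation as a stand-in filler are all hypothetical choices made here for brevity.

```python
import numpy as np

def make_gap(n, gap_len, rng):
    """Boolean mask marking one randomly placed contiguous gap of gap_len samples."""
    start = rng.integers(0, n - gap_len + 1)  # upper bound is exclusive
    mask = np.zeros(n, dtype=bool)
    mask[start:start + gap_len] = True
    return mask

def fill_linear(t, y, mask):
    """Stand-in gap filler: linear interpolation between observed samples.
    (Assumption: any of the reviewed algorithms could be swapped in here.)"""
    filled = y.copy()
    filled[mask] = np.interp(t[mask], t[~mask], y[~mask])
    return filled

def rmse(obs, est):
    """Root mean square error between withheld observations and estimates."""
    return float(np.sqrt(np.mean((obs - est) ** 2)))

def evaluate(y, gap_lengths, seed=0):
    """Score the filler on one artificial gap per requested gap length."""
    rng = np.random.default_rng(seed)
    t = np.arange(y.size, dtype=float)
    scores = {}
    for g in gap_lengths:
        mask = make_gap(y.size, g, rng)
        est = fill_linear(t, y, mask)
        scores[g] = rmse(y[mask], est[mask])  # scored only inside the gap
    return scores

# Synthetic series with a daily cycle; 48 samples per day is an assumption.
steps_per_day = 48
t = np.arange(60 * steps_per_day)
series = np.sin(2 * np.pi * t / steps_per_day)
scores = evaluate(series, [steps_per_day, 7 * steps_per_day, 30 * steps_per_day])
for gap_len, score in scores.items():
    print(f"gap of {gap_len // steps_per_day} d: RMSE = {score:.3f}")
```

Scoring only the samples inside the gap keeps the metric from being diluted by the untouched observations, which matters most for the short windows.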
Highlights
To address the global challenges of climatological and ecological changes, environmental scientists and policy makers are demanding data that are continuous in time and space
The weaker performance of elastic net regularisation (ELN) compared to classic linear regression (CLR) was unforeseen: by adding two penalty components to the regression, the ELN is supposed to improve long-term prediction relative to traditional linear regression methods
Despite no meaningful difference according to Tukey’s HSD, XGB and random forest (RF) might have performed better than artificial neural networks (ANNs); Kim et al. (2020) recently reported that RF outperformed ANNs, support vector regression (SVR), and marginal distribution sampling (MDS) in gap-filling the methane flux
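For reference, the two penalty components of the elastic net can be written out explicitly. The form below is one common parameterisation (the one used by glmnet and by scikit-learn's `ElasticNet`); the source does not state which form the authors used, so the symbols $\lambda$ (overall penalty strength) and $\alpha$ (mixing weight between the two penalties) are assumptions introduced here.

```latex
\hat{\beta} = \arg\min_{\beta}\;
  \frac{1}{2n}\,\lVert y - X\beta \rVert_2^2
  + \lambda \left( \alpha \lVert \beta \rVert_1
  + \frac{1-\alpha}{2}\,\lVert \beta \rVert_2^2 \right)
```

Setting $\lambda = 0$ recovers ordinary least squares, so the ELN can match classic linear regression in principle; whether it does in practice depends on how $\lambda$ and $\alpha$ are chosen.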
Summary
A. Mahabbati et al.: A comparison of gap-filling algorithms for eddy covariance fluxes and their drivers
To address the global challenges of climatological and ecological changes, environmental scientists and policy makers are demanding data that are continuous in time and space. [...] national and international flux networks as well as global Earth-observing systems. Satellites partially fill this gap: they provide excellent spatial coverage but have limited temporal resolution and do not measure at a point scale. High-quality long-term site observations of ecosystem processes and fluxes are needed that are continuous in time and space. Better gap-filling techniques are much needed to fill data gaps and reduce uncertainties.
More From: Geoscientific Instrumentation, Methods and Data Systems