Continuous, long-term eddy covariance (EC) measurements of CO₂ fluxes (net ecosystem exchange, NEE) across a variety of terrestrial ecosystems are critical for investigating the impacts of climate change on ecosystem carbon cycling. However, for a number of reasons, approximately 30–60% of the annual flux data obtained at EC flux sites around the world are reported as gaps. Because the annual total NEE is determined mostly by variations in NEE on time scales longer than one day, we propose a novel framework for gap filling NEE data that is based on machine learning (ML) and time series decomposition (TSD). The framework combines the strength of ML models in predicting NEE from meteorological and environmental inputs with the strength of TSD methods in extracting the dominant trends in NEE time series. Using NEE data from 25 AmeriFlux sites, we evaluate the proposed framework under four artificial gap scenarios with gap lengths ranging from one hour to two months. The combined approach of moving average and random forest (MA-RF) performs better than the other approaches at filling NEE gaps across the scenarios with different gap lengths. For the scenario with a gap length of seven days, MA-RF improves R² by 34% and reduces the root mean square error (RMSE) by 55% compared to a traditional RF-based model. The improved performance of MA-RF is most likely due to the lower variability and complexity of the extracted low-frequency NEE data. Our results indicate that the proposed MA-RF framework can provide improved gap filling for NEE time series. Such improved continuous NEE data can enhance the accuracy of ecosystem carbon budget estimates.
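To make the decomposition step and the two reported metrics concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a centered moving average that skips missing values, a window of roughly one day of hourly data, and a synthetic series; the abstract specifies none of these, and the function names are hypothetical. The random forest step is omitted: one plausible reading of the abstract is that the ML model is trained on the smoothed, low-frequency NEE rather than on the raw series, but the abstract does not spell out how the two parts are combined.

```python
import math
from typing import List, Optional, Sequence


def moving_average(series: Sequence[Optional[float]], window: int) -> List[Optional[float]]:
    """Centered moving average that skips missing values (None).

    The window covers window // 2 points on each side of the current point.
    A position is None only if every value in its window is missing.
    """
    half = window // 2
    smoothed: List[Optional[float]] = []
    for i in range(len(series)):
        lo, hi = max(0, i - half), min(len(series), i + half + 1)
        valid = [v for v in series[lo:hi] if v is not None]
        smoothed.append(sum(valid) / len(valid) if valid else None)
    return smoothed


def rmse(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error between observed and predicted values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Synthetic hourly series (made-up numbers, not AmeriFlux data):
# a slow trend plus a diurnal cycle, with a six-hour artificial gap.
nee = [0.01 * t + 2.0 * math.sin(2 * math.pi * t / 24) for t in range(72)]
nee_with_gap = [None if 30 <= t < 36 else v for t, v in enumerate(nee)]

# Low-frequency component (assumed window: about one day of hourly data).
low_freq = moving_average(nee_with_gap, window=24)

# High-frequency residual, defined only where an observation exists.
residual = [None if v is None else v - l for v, l in zip(nee_with_gap, low_freq)]
```

In an evaluation like the one described above, `rmse` and `r_squared` would be applied to the values withheld in an artificial gap and the values a model predicts for them.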