Abstract

With the growing number of datasets describing greenhouse gas (GHG) emissions, there is an opportunity to develop novel predictive models that avoid the expense and time of direct field measurements. This study evaluates the potential of machine learning (ML) approaches to predict soil GHG emissions without the biogeochemical expertise that many current soil GHG simulation models require. Ample field measurement data are now publicly available to test new modeling approaches. The objective of this paper was to develop and evaluate ML models that simulate soil CO2 fluxes under different fertilization methods, using field data (soil temperature, soil moisture, soil classification, crop type, fertilization type, and air temperature) available in the Greenhouse gas Reduction through Agricultural Carbon Enhancement network (GRACEnet) database. Four ML algorithms were used to develop the models: K nearest neighbor (KNN) regression, support vector regression (SVR), random forest (RF) regression, and gradient boosted (GB) regression. The GB regression model outperformed all the other models on the training dataset, with R2 = 0.88, MAE = 2177.89 g C ha−1 day−1, and RMSE = 4405.43 g C ha−1 day−1. On the unseen test dataset, however, the RF and GB regression models performed best, each with R2 = 0.82. When a large database like GRACEnet is available, ML tools were useful for developing predictors based on soil classification, soil temperature, and air temperature, although correlation analysis did not identify these as highly predictive variables. This study demonstrates the suitability of tree-based ML algorithms for predictive modeling of CO2 fluxes, but such models cannot describe biogeochemical processes.
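The evaluation workflow described in the abstract (fit a regressor, then report R2, MAE, and RMSE on held-out data) can be illustrated with a minimal sketch. It is not the authors' implementation: it covers only KNN regression, one of the four algorithms, written in plain Python, and it uses synthetic data rather than GRACEnet records. The function names, the two features (soil temperature and soil moisture), and the linear flux relationship in `make_synthetic_data` are all assumptions made for illustration.

```python
import math
import random


def knn_predict(train_X, train_y, x, k=3):
    """Predict the target for x as the mean target of its k nearest training points."""
    nearest = sorted(
        zip(train_X, train_y),
        key=lambda pair: math.dist(pair[0], x),  # Euclidean distance to x
    )[:k]
    return sum(y for _, y in nearest) / len(nearest)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )


def make_synthetic_data(n=200, seed=0):
    """Hypothetical stand-in for field data; NOT drawn from GRACEnet."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        soil_temp = rng.uniform(0.0, 30.0)       # assumed range, degrees C
        soil_moisture = rng.uniform(10.0, 50.0)  # assumed range, percent
        # Assumed linear relationship plus noise, for illustration only.
        flux = 300.0 * soil_temp + 50.0 * soil_moisture + rng.gauss(0.0, 500.0)
        X.append((soil_temp, soil_moisture))
        y.append(flux)
    return X, y


def evaluate(k=5, test_fraction=0.25, seed=0):
    """Hold out the first part of the (already random) data as a test set and score KNN."""
    X, y = make_synthetic_data(seed=seed)
    n_test = int(len(X) * test_fraction)
    X_test, y_test = X[:n_test], y[:n_test]
    X_train, y_train = X[n_test:], y[n_test:]
    preds = [knn_predict(X_train, y_train, x, k) for x in X_test]
    return {
        "r2": r2_score(y_test, preds),
        "mae": mae(y_test, preds),
        "rmse": rmse(y_test, preds),
    }


if __name__ == "__main__":
    print(evaluate())
```

Because MAE weights all errors equally while RMSE penalizes large errors more heavily, RMSE is never smaller than MAE on the same predictions. A wide gap between the two, as in the training results reported above, suggests that a few large errors contribute disproportionately to RMSE.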

Highlights

  • According to the U.S. Environmental Protection Agency, CO2 is the primary anthropogenic greenhouse gas (GHG) emitted in the US, with a 30% atmospheric increase since the pre-industrial era [1], and it accounted for 80% of total GHG emissions into the atmosphere in 2019 [2]

  • With the availability of more data comes the need for methods to use these data to understand how soil properties influence the emission of GHGs

  • We demonstrated the application of four popular machine learning (ML) algorithms (KNN regression, support vector regression, random forest regression, and gradient boosted regression) to simulate soil CO2 fluxes with available data from the GRACEnet database


Summary

Introduction

According to the U.S. Environmental Protection Agency, CO2 is the primary anthropogenic greenhouse gas (GHG) emitted in the US, with a 30% atmospheric increase since the pre-industrial era [1], and it accounted for 80% of total GHG emissions into the atmosphere in 2019 [2]. Though the greatest sources of CO2 in the US are the transportation, electricity, and industrial sectors [2], agriculture also accounts for substantial CO2 emissions. An increase of even a few percentage points in soil carbon uptake reduces the CO2 entering the atmosphere [6] and thus lowers GHG emissions. Estimating soil CO2 emissions is essential for understanding the feedbacks between climate change and terrestrial ecosystems [8]. A major challenge is that direct measurement of CO2 fluxes from soil can be costly and time-consuming and requires some level of expertise, which makes it difficult to quantify the CO2 emissions from different crop management systems, especially where soil conditions are highly variable.

