ABSTRACT Autoclaved-citrate extractable soil protein (ACE protein, hereafter referred to as “soil protein”) is a novel biological soil health indicator that can indirectly capture a soil’s capacity to supply nitrogen (N) but is relatively expensive to assess. To explore cost-saving options, a dataset of 4,171 soil samples with texture, total carbon (C) and N, carbon-to-nitrogen ratio (C/N), soil protein, permanganate-oxidizable carbon (POXC), pH, and extractable magnesium (Mg) and iron (Fe) was used to develop three pedotransfer functions for soil protein. These were a full random forest (RF) model using all variables, and a reduced RF model and a multiple linear regression model employing a subset of the variables. Models were validated using a US subset of the North American Project to Evaluate Soil Health Measurements dataset that contained 1,406 samples. The full RF model for soil protein reduced the root mean square error (RMSE) by 41.7% and 53.4% compared to the reduced RF and multiple linear regression models, respectively. Total C was a more important variable in the full RF model than total N. Additionally, POXC, sand, clay, and extractable Mg and Fe were found to be important in the model. Soil protein was sensitive to management at 36 of 57 long-term experiments. The full RF model was able to replicate 92% of those significant effects of management on soil protein. The new RF pedotransfer function for soil protein can improve prediction compared to traditional regression techniques and reduce the cost of comprehensive soil health assessment.
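The model comparison above is stated as a percent reduction in RMSE. The minimal Python sketch below shows one common way such a figure can be computed, assuming the reduction is expressed relative to the baseline model's RMSE; the abstract does not state the exact formula used. The function names `rmse` and `rmse_reduction_pct` and all numeric values are hypothetical illustrations, not data or code from the study.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between paired observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def rmse_reduction_pct(rmse_model, rmse_baseline):
    """Percent by which rmse_model is lower than rmse_baseline.

    Assumption: the reduction is taken relative to the baseline RMSE.
    """
    return 100.0 * (rmse_baseline - rmse_model) / rmse_baseline


# Hypothetical soil protein values, invented for illustration only.
observed = [4.0, 6.0, 8.0, 10.0]
full_model_pred = [5.0, 5.0, 9.0, 9.0]    # every error has magnitude 1
baseline_pred = [6.0, 4.0, 10.0, 8.0]     # every error has magnitude 2

rmse_full = rmse(observed, full_model_pred)
rmse_base = rmse(observed, baseline_pred)
print(rmse_full, rmse_base, rmse_reduction_pct(rmse_full, rmse_base))
```

In this toy case the full model halves the baseline error, so the computed reduction is 50%. The study's reported 41.7% and 53.4% would follow from the same kind of ratio applied to the validation-set RMSE of each model, if the authors used this definition.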