Soil organic carbon (SOC) is an important indicator of soil health and helps to sustain soil fertility. Determining its content and the factors that influence it is therefore critical for long-term soil nutrient management, especially under controlled conditions such as greenhouses. This study uses machine learning to classify SOC content in greenhouses built on pyroclastic deposits in the Isparta region. A dataset of 276 samples was used to model SOC values from eight variables: clay (%), silt (%), sand (%), soil electrical conductivity (EC), pH, elevation, slope, and aspect. SOC content was classified into five classes: very low (2.3%). Five machine learning models, namely Logistic Regression (LR), K-Nearest Neighbors (KNN), Support Vector Machine (SVM), Decision Tree (DT), and Random Forest (RF), were evaluated using cross-validation to determine their classification accuracy, precision, recall, F-score, and ROC area (AUC). RF and DT outperformed the other models: RF achieved the highest overall accuracy (76.4%), precision (77.3%), and AUC (0.904), followed by DT with an accuracy of 75.4% and an AUC of 0.874. These results show that machine learning models are a practical way to categorize SOC content and can support long-term soil health and fertility management in greenhouse conditions. To improve model performance, future studies should include more auxiliary variables, such as soil physical and chemical properties and lithological data, as well as a wider range of soil types.
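The model comparison described in the abstract could be set up roughly as in the following sketch, which assumes scikit-learn. It is an illustration only, not the authors' pipeline: the data are synthetic stand-ins generated with `make_classification` (276 samples, eight features, five classes), and the fold count, hyperparameters, feature scaling, and weighted averaging of the multiclass metrics are all assumptions, since the abstract does not state them.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the study's data (NOT the real measurements):
# 276 samples, 8 predictors, 5 SOC classes.
X, y = make_classification(
    n_samples=276, n_features=8, n_informative=5, n_redundant=0,
    n_classes=5, n_clusters_per_class=1, random_state=0,
)

# The five model families named in the abstract; hyperparameters are assumed.
# Scale-sensitive models are wrapped in a pipeline with standardization.
models = {
    "LR": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier()),
    "SVM": make_pipeline(StandardScaler(), SVC(probability=True, random_state=0)),
    "DT": DecisionTreeClassifier(random_state=0),
    "RF": RandomForestClassifier(n_estimators=200, random_state=0),
}

# Metrics listed in the abstract; weighted averaging is an assumption.
scoring = {
    "accuracy": "accuracy",
    "precision": "precision_weighted",
    "recall": "recall_weighted",
    "f1": "f1_weighted",
    "auc": "roc_auc_ovr_weighted",
}

# Stratified 5-fold cross-validation (the fold count is an assumption).
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)


def evaluate(models, X, y):
    """Return the mean cross-validated score per model and metric."""
    results = {}
    for name, model in models.items():
        scores = cross_validate(model, X, y, cv=cv, scoring=scoring)
        results[name] = {m: float(np.mean(scores[f"test_{m}"])) for m in scoring}
    return results


results = evaluate(models, X, y)
for name, metrics in results.items():
    print(name, {m: round(v, 3) for m, v in metrics.items()})
```

Scores obtained on the synthetic data say nothing about the reported results; the sketch only shows how the five classifiers and five metrics fit into one cross-validation loop.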