Abstract
Coffee is a global commodity, and Brazil is a major producer, particularly the state of Minas Gerais. This study applied machine learning to predict Arabica coffee yield in the region, analyzing two groups of cultivars (G1 and G2) using data from 1993 to 2020. Factor Analysis of Mixed Data (FAMD) was employed to explore the relationships between climatic factors, management practices, and coffee yield. Four machine learning models, namely Multiple Linear Regression (MLR), Random Forest (RF), XGBoost (XGB), and Support Vector Machines (SVM), were calibrated and evaluated for yield prediction. The FAMD revealed complex interactions among variables, requiring four principal components to explain approximately 64.6% of the total variance. Management practices, such as planting density and pruning, had a stronger influence on G1 cultivars, while G2 cultivars were more sensitive to climatic conditions, particularly air temperature. Among the machine learning models, RF and XGB performed best in yield estimation, whereas MLR and SVM were less effective, particularly for yields above 60 bags ha−1 (1 bag = 60 kg). These findings underscore the variability in yield across cultivars and demonstrate the potential of machine learning to guide management strategies tailored to different coffee cultivars.
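The yield unit and the 60 bags ha−1 threshold mentioned above can be made concrete with a short sketch. The conversion factor (1 bag = 60 kg) comes from the abstract. The helper names and the observed and predicted values are hypothetical, chosen only to illustrate how a model's error might be reported separately below and above the threshold. This is not the study's evaluation code or data.

```python
import math

KG_PER_BAG = 60              # from the abstract: 1 bag = 60 kg
HIGH_YIELD_BAGS_PER_HA = 60  # threshold mentioned in the abstract


def bags_to_kg_per_ha(bags_per_ha):
    """Convert a yield in bags per hectare to kilograms per hectare."""
    return bags_per_ha * KG_PER_BAG


def rmse(observed, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def rmse_by_yield_range(observed, predicted, threshold=HIGH_YIELD_BAGS_PER_HA):
    """Report RMSE separately for observed yields at/below and above the threshold."""
    pairs = list(zip(observed, predicted))
    groups = {
        "at_or_below": [(o, p) for o, p in pairs if o <= threshold],
        "above": [(o, p) for o, p in pairs if o > threshold],
    }
    result = {}
    for name, group in groups.items():
        if group:
            obs, pred = zip(*group)
            result[name] = rmse(obs, pred)
        else:
            result[name] = None  # no samples in this range
    return result


# Made-up yields in bags per hectare, for illustration only (not study data).
observed = [20.0, 35.0, 50.0, 70.0, 85.0]
predicted = [22.0, 33.0, 48.0, 60.0, 70.0]

print(bags_to_kg_per_ha(60))  # 60 bags/ha expressed in kg/ha
print(rmse_by_yield_range(observed, predicted))
```

Splitting the error by yield range, rather than reporting one overall figure, is one way to expose the kind of degradation at high yields that the abstract attributes to MLR and SVM.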