The classification of agricultural products is important for quality control, marketing, logistics, research, consumer satisfaction, and sustainability. Dragon fruit has many varieties that must be identified quickly and accurately for packaging and marketing. Given the increasing demand for dragon fruit, an automated classification system has significant commercial and scientific value because it can increase sorting efficiency and reduce manual labor costs. This study aimed to classify four commonly produced dragon fruit varieties according to their color, mechanical, and physical properties using machine learning models. Data were collected from 224 dragon fruits (53 American beauty, 57 Dark star, 65 Vietnamese white, and 49 Pepino dulce). Classification was based on measurable physical and mechanical properties obtained through digital image processing, colorimetry, electronic weighing, and stress–strain testing. These methods provided objective and reproducible data for the models. Three models (Random Forest, Gradient Boosting, and Support Vector Classification) were implemented, and their performance was evaluated using accuracy, precision, recall, Matthews correlation coefficient, Cohen’s kappa, and F1-score. The Random Forest model performed best on all metrics, achieving 98.66% accuracy, while the Support Vector Classification model performed worst. The superior performance of the Random Forest model can be attributed to its ability to handle complex, nonlinear relationships among multiple variables while limiting overfitting through ensemble learning. However, potential challenges in dragon fruit classification include variation due to environmental factors, genetic variation, and hybridization. Future research could focus on incorporating biochemical or genetic markers and improving real-time classification for industrial applications.
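The abstract names its evaluation metrics without giving their formulas. As a minimal sketch, the snippet below computes accuracy, Cohen's kappa, and the multiclass Matthews correlation coefficient from a confusion matrix using only the Python standard library. The matrix itself is hypothetical: its row totals match the reported sample counts (53, 57, 65, 49), and it contains three errors because 221 correct out of 224 would give the reported 98.66%, assuming the score was computed over all samples. The placement of those errors is invented for illustration and is not taken from the study.

```python
from math import sqrt

# HYPOTHETICAL confusion matrix (rows = true variety, columns = predicted).
# Row totals follow the abstract's sample counts; the three off-diagonal
# errors are placed arbitrarily for illustration only.
VARIETIES = ["American beauty", "Dark star", "Vietnamese white", "Pepino dulce"]
cm = [
    [52, 1, 0, 0],
    [1, 56, 0, 0],
    [0, 0, 65, 0],
    [0, 0, 1, 48],
]


def _totals(cm):
    """Return (n, correct, true_totals, predicted_totals) for a square matrix."""
    k = len(cm)
    true_tot = [sum(row) for row in cm]
    pred_tot = [sum(cm[i][j] for i in range(k)) for j in range(k)]
    n = sum(true_tot)
    correct = sum(cm[i][i] for i in range(k))
    return n, correct, true_tot, pred_tot


def accuracy(cm):
    """Fraction of samples on the diagonal."""
    n, correct, _, _ = _totals(cm)
    return correct / n


def cohen_kappa(cm):
    """Observed agreement corrected for agreement expected by chance."""
    n, correct, true_tot, pred_tot = _totals(cm)
    p_observed = correct / n
    p_expected = sum(t * p for t, p in zip(true_tot, pred_tot)) / (n * n)
    return (p_observed - p_expected) / (1 - p_expected)


def matthews_corrcoef(cm):
    """Multiclass generalization of the Matthews correlation coefficient."""
    n, correct, true_tot, pred_tot = _totals(cm)
    numerator = correct * n - sum(t * p for t, p in zip(true_tot, pred_tot))
    denominator = sqrt(
        (n * n - sum(p * p for p in pred_tot))
        * (n * n - sum(t * t for t in true_tot))
    )
    # Convention: return 0.0 when the denominator vanishes.
    return numerator / denominator if denominator else 0.0


print(f"accuracy: {100 * accuracy(cm):.2f}%")
print(f"Cohen's kappa: {cohen_kappa(cm):.4f}")
print(f"MCC: {matthews_corrcoef(cm):.4f}")
```

With balanced-ish classes and very few errors, kappa and MCC land close to accuracy; they diverge more visibly when class sizes are skewed, which is one reason to report them alongside accuracy.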