This study aims to improve the efficiency of automatic classification and quality control of fruits and vegetables through image recognition, enabling efficient and accurate intelligent sorting in agricultural production, reducing labor costs, and improving market competitiveness. It uses a high-quality image dataset of 36 fruit and vegetable categories from Kaggle. The images were preprocessed so that the data are suitable for the classification task and support efficient training and evaluation of the model. Logistic regression was first trained as a baseline against which to compare the Support Vector Machine (SVM) model. Hyperparameter tuning was then performed to maximize cross-validation accuracy, the SVM model was trained with the selected hyperparameters, and the training time was recorded. Model performance was evaluated in detail with a confusion matrix and classification reports, and the test set was used for final validation to check that the model also performs well on unseen data. The SVM model achieved an accuracy of 96% on both the validation and test sets. The hyperparameters selected by GridSearchCV (C = 10, gamma = 0.1, kernel = rbf) improved the model's performance, indicating that careful hyperparameter selection is important for the SVM model. These results suggest that the model generalizes well and has potential for real-time classification tasks.
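The workflow described above (logistic regression baseline, grid search over SVM hyperparameters, timed training, then confusion matrix and classification report) can be sketched with scikit-learn roughly as follows. This is a minimal illustration and not the study's code. It assumes scikit-learn's built-in digits images as a stand-in for the Kaggle fruit and vegetable dataset, and the 60/20/20 split, the `StandardScaler` step, the parameter grid, and all variable names are hypothetical choices. The selected hyperparameters and the accuracies it prints will therefore differ from those reported.

```python
import time

from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Stand-in data (assumption): 8x8 digit images, not the Kaggle fruit/vegetable set.
X, y = load_digits(return_X_y=True)

# Hypothetical 60/20/20 train/validation/test split, stratified by class.
X_train, X_hold, y_train, y_hold = train_test_split(
    X, y, test_size=0.4, stratify=y, random_state=0
)
X_val, X_test, y_val, y_test = train_test_split(
    X_hold, y_hold, test_size=0.5, stratify=y_hold, random_state=0
)

# Baseline: logistic regression on standardized features.
baseline = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
baseline.fit(X_train, y_train)
baseline_val_acc = accuracy_score(y_val, baseline.predict(X_val))

# Grid search over SVM hyperparameters, scored by cross-validation accuracy.
# The grid is illustrative; it merely includes the values named in the abstract.
param_grid = {
    "svc__C": [1, 10],
    "svc__gamma": [0.001, 0.01, 0.1],
    "svc__kernel": ["rbf"],
}
search = GridSearchCV(
    make_pipeline(StandardScaler(), SVC()), param_grid, cv=3, scoring="accuracy"
)
search.fit(X_train, y_train)

# Retrain an SVM with the selected hyperparameters and record the training time.
best_params = {k.replace("svc__", ""): v for k, v in search.best_params_.items()}
svm = make_pipeline(StandardScaler(), SVC(**best_params))
start = time.perf_counter()
svm.fit(X_train, y_train)
train_seconds = time.perf_counter() - start

# Evaluate on the validation set, then on the held-out test set.
val_acc = accuracy_score(y_val, svm.predict(X_val))
test_pred = svm.predict(X_test)
test_acc = accuracy_score(y_test, test_pred)
cm = confusion_matrix(y_test, test_pred)
report = classification_report(y_test, test_pred)

print(f"baseline validation accuracy: {baseline_val_acc:.3f}")
print(f"selected hyperparameters: {best_params}")
print(f"SVM training time: {train_seconds:.3f} s")
print(f"SVM validation accuracy: {val_acc:.3f}, test accuracy: {test_acc:.3f}")
print(report)
```

Keeping the test set out of both the grid search and the baseline comparison is what allows the final test accuracy to be read as an estimate of performance on unseen data.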