The assessment of food and industrial crops during harvesting is important to determine the quality and downstream processing requirements, which in turn affect their market value. While machine learning models have been developed for this purpose, their deployment is hindered by the high cost of labelling the crop images to provide data for model training. This study examines the capabilities of semi-supervised and active learning to minimise effort when labelling cotton lint samples while maintaining high classification accuracy. Random forest classification models were developed using supervised learning, semi-supervised learning, and active learning to determine Egyptian cotton grade. Compared to supervised learning (80.20–82.66%) and semi-supervised learning (81.39–85.26%), active learning models were able to achieve higher accuracy (82.85–85.33%) with up to 46.4% reduction in the volume of labelled data required. The primary obstacle when using machine learning for Egyptian cotton grading is the time required for labelling cotton lint samples. However, by applying active learning, this study successfully decreased the time needed from 422.5 to 177.5 min. The findings of this study demonstrate that active learning is a promising approach for developing accurate and efficient machine learning models for grading food and industrial crops.