Abstract

With the development of deep learning techniques, research on intelligent pattern recognition has focused mainly on model optimization and has long overlooked the impact of data quality. In our view, data quality is critical to the sustainability of intelligent applications in terms of task performance, power consumption, and training time. In this paper, we propose a novel data quality assessment method, called k-nearest neighbors distance entropy (KNN-DE), to evaluate crop pest images and select highly informative data for data-efficient pest recognition. Comparative experiments show that the highly informative data selected by the proposed KNN-DE method yield better results than data selected by related methods. Specifically, under a data budget of 100 samples per category, the performance of the model trained with highly informative data is 17 % higher than that of the model trained with low-informative data. In addition, taking the accuracy of a model trained with 100 low-informative samples per class as the reference, selecting highly informative data instead improves data efficiency and training time efficiency by a factor of 2.5. This paper thus offers a data-centric research perspective, lays a foundation for data quality assessment, and takes a step toward data-efficient, sustainable learning.
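The abstract names KNN-DE but does not define it, so the following is only a minimal sketch of one plausible reading of the name, not the paper's formulation. It assumes each image is already represented by a feature vector, normalizes each sample's k nearest-neighbor distances into a distribution, and takes the Shannon entropy of that distribution as a score. The function names `knn_distance_entropy` and `select_by_score`, and the choice to keep the highest-scoring samples, are hypothetical.

```python
import math


def knn_distance_entropy(features, k):
    """Hypothetical score: Shannon entropy of each sample's normalized k-NN distances.

    This is an illustrative assumption, not the definition used in the paper.
    """
    scores = []
    for i, x in enumerate(features):
        # Euclidean distances from sample i to every other sample, k smallest kept.
        dists = sorted(
            math.dist(x, y) for j, y in enumerate(features) if j != i
        )[:k]
        total = sum(dists)
        if total == 0.0:
            # All k neighbours coincide with the sample; define the entropy as 0.
            scores.append(0.0)
            continue
        # Normalize distances into a distribution; zero terms contribute nothing.
        probs = [d / total for d in dists if d > 0.0]
        scores.append(-sum(p * math.log(p) for p in probs))
    return scores


def select_by_score(scores, budget):
    """Return indices of the `budget` highest-scoring samples.

    Treating a higher score as "more informative" is an assumption.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:budget]


# Toy feature vectors standing in for image embeddings (made-up values).
toy_features = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
toy_scores = knn_distance_entropy(toy_features, k=2)
toy_selection = select_by_score(toy_scores, budget=2)
```

In a per-category budget setting such as the 100 samples per category mentioned above, a selection step of this kind would be run separately within each class.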
