Abstract

In image classification, few-shot learning aims to recognize visual categories from only a few labeled examples. In this setting, the expressiveness of the encoded features is a central concern for the models being trained. Recent deep learning approaches have achieved encouraging results in few-shot classification, but designing an architecture that is both competitive and simple remains challenging, even though many practical applications require one. This work proposes an improved few-shot model based on a multi-layer feature fusion (FMLF) method. The approach adds extended feature extraction and fusion mechanisms to the Convolutional Neural Network (CNN) backbone, together with an effective metric for computing divergences at the final stage. To evaluate the proposed method, we address a challenging visual classification problem, maize crop insect classification with specific pest and beneficial categories, which serves both as a test of the model and as the basis for a novel dataset. Experiments compared the FMLF method against ResNet50, VGG16, and MobileNetV2 used as feature extraction backbones, and the FMLF method achieved higher accuracy with fewer parameters. Compared to a traditional backbone that uses only global image features, the FMLF method improved accuracy by up to 3.62% in one-shot and 2.82% in five-shot classification tasks.
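To make the idea of fusing features from several backbone layers concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not specify the fusion operator or the metric, so global average pooling, concatenation, L2 normalization, class prototypes, and squared Euclidean distance are all assumptions chosen for illustration, and every function name is hypothetical.

```python
import math

def global_avg_pool(feature_map):
    """Collapse a C x H x W feature map (nested lists) into a length-C vector."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
            for ch in feature_map]

def fuse_layers(feature_maps):
    """Assumed fusion: pool each layer's map, concatenate, then L2-normalize."""
    fused = []
    for fmap in feature_maps:
        fused.extend(global_avg_pool(fmap))
    norm = math.sqrt(sum(v * v for v in fused)) or 1.0  # avoid division by zero
    return [v / norm for v in fused]

def prototype(vectors):
    """Mean of the support vectors of one class (one vector in one-shot)."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def classify(query, prototypes):
    """Return the label whose prototype is nearest to the query.

    Squared Euclidean distance is an assumed stand-in for the paper's metric.
    """
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(prototypes, key=lambda label: dist(query, prototypes[label]))

# Toy example: one shallow layer (1 channel, 2x2) and one deep layer
# (2 channels, 1x1) for a single query image.
shallow = [[[1.0, 1.0], [1.0, 1.0]]]
deep = [[[0.0]], [[2.0]]]
query = fuse_layers([shallow, deep])

# Hypothetical class prototypes in the same 3-dimensional fused space.
prototypes = {
    "pest": prototype([[1.0, 0.0, 0.0]]),
    "beneficial": prototype([[0.0, 0.0, 1.0]]),
}
predicted = classify(query, prototypes)
```

In a real model the nested lists would be tensors taken from intermediate CNN layers, and the fused vector would replace the single global descriptor that a traditional backbone produces.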
