Abstract

Most conventional fine-grained image recognition methods are based on a two-stream model of object-level and part-level CNNs, where the part-level CNN is responsible for learning the object-parts and their spatial relationships. To train the part-level CNN, the parts must first be separated from the object. However, some sub-level objects have no distinctive and separable parts. In this paper, a multi-scale CNN with a baseline Object-level CNN and multiple Part-level CNNs is proposed for fine-grained image recognition with no separable object-parts. The basic idea for training the different CNNs of the multi-scale CNN is to resize the training images at different scales. That is, the training images are resized such that as much of the entire object as possible appears for the Object-level CNN, while only a local part of the object is included for each Part-level CNN. This scale-specific image resizing approach requires a scale-controllable parameter in the image resizing process. In this paper, a scale-controllable parameter is introduced for the linear-scaling and random-cropping method. In addition, a line-based image resizing method with a scale-controllable parameter is employed for the Part-level CNNs. The proposed multi-scale CNN is applied to food image classification, a fine-grained classification problem with no separable object-parts. Experimental results on public food image datasets show that the classification accuracy improves substantially when the predicted scores of the multi-scale CNN are fused together. This indicates that the object-level and part-level CNNs complement each other in capturing the subtle differences between sub-level objects.
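The scale-specific resizing idea can be illustrated with a minimal sketch. Nothing below is taken from the paper: the function names, the `scale` parameter (defined here as the ratio of the resized shorter side to the crop size), and the use of plain averaging for score fusion are all assumptions made for illustration. Under these assumptions, `scale = 1.0` keeps almost the whole object inside the crop, as for an object-level network, while a larger `scale` leaves only a local region inside the crop, as for a part-level network.

```python
import random


def scaled_size(width, height, crop, scale):
    """Return the (width, height) after linear scaling.

    Assumption: the shorter side is resized to scale * crop while the
    aspect ratio is kept, so a crop of side `crop` covers about 1/scale
    of the shorter side.
    """
    if scale < 1.0:
        raise ValueError("scale must be >= 1 so the crop fits in the image")
    factor = scale * crop / min(width, height)
    # max(...) guards against rounding below the crop size.
    return max(crop, round(width * factor)), max(crop, round(height * factor))


def random_crop_box(width, height, crop, rng):
    """Pick a random crop x crop box (left, top, right, bottom)."""
    left = rng.randint(0, width - crop)
    top = rng.randint(0, height - crop)
    return left, top, left + crop, top + crop


def fuse_scores(score_lists):
    """Fuse per-network class scores by simple averaging (an assumption)."""
    count = len(score_lists)
    return [sum(column) / count for column in zip(*score_lists)]


if __name__ == "__main__":
    rng = random.Random(0)
    for scale in (1.0, 2.0):  # hypothetical object-level and part-level scales
        w, h = scaled_size(640, 480, crop=224, scale=scale)
        print(scale, (w, h), random_crop_box(w, h, 224, rng))
    print(fuse_scores([[0.6, 0.4], [0.2, 0.8]]))
```

The resulting crop box could be passed to any image library to extract the training patch; the fusion rule and the exact definition of the scale parameter used in the paper may differ from this sketch.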

Highlights

  • Convolutional neural networks (CNNs) with deep layers have contributed significantly to performance improvements in object-wise image classification problems

  • Not all domain-specific datasets have localizable and separable common parts. We focus on this kind of weakly supervised fine-grained image classification problem, which has neither object nor part annotations

  • A multi-scale CNN with an Object-level CNN and multiple Part-level CNNs is proposed for fine-grained image classification with no explicitly separable object-parts

Introduction

Convolutional neural networks (CNNs) with deep layers have contributed significantly to performance improvements in object-wise image classification problems. This success has encouraged researchers to tackle more challenging problems with CNNs, such as fine-grained image classification, which recognizes the sub-level classes under an upper-level class. The main difficulty in fine-grained image classification comes from the nature of domain-specific sub-level images, which have large intra-class and small inter-class variances [1], [2]. Labeling the objects in domain-specific images often requires expert knowledge, which makes it an expensive task [2]. Crowdsourcing [3] can be an alternative, but it often causes a noisy labeling problem.
