Abstract

Previous work on multi-label image classification has been largely limited by the representational power of hand-crafted features. The convolutional neural network (CNN) has achieved success in many computer vision tasks. In this work, we adapt the CNN to multi-label image classification using three approaches: end-to-end training on the target dataset, pre-training on ImageNet followed by fine-tuning on the target dataset, and feeding features from an ImageNet-pretrained CNN to an AdaBoost.MH classifier. Experimental results on two datasets show that, compared with methods based on hand-crafted features, the CNN model improves performance by a large margin on the object dataset, reaching up to 98% on MSRC versus 92% for state-of-the-art algorithms, but yields only a small benefit on the scene dataset. The primary difference between the object and scene classification tasks is that the former needs to focus only on the foreground of the image, whereas the latter requires attention to the entire image. Source code built on the Torch7 toolkit to reproduce the experiments in the paper is publicly available.

