Abstract

In this paper, we address the problem of object classification and counting in natural scenes. Existing instance detection methods can be used to classify and count objects, but they depend on bounding box annotations, which require time-consuming human labeling effort. Meanwhile, traditional object counting methods are mostly based on class-insensitive density maps and are therefore only able to count one specific category. To overcome these drawbacks, we propose a unified framework for object classification and counting. We employ a two-stream convolutional neural network (CNN) that consists of a classification branch and a counting branch. Considering the correlation between the classification and counting tasks, we propose a novel correlation loss function to coordinate the representations learned by the two branches of the network. Without hand-labeled bounding box annotations as supervision, our end-to-end model can recognize the categories of objects in an image and count the number of objects per category. The proposed method is evaluated on the PASCAL VOC 2007 dataset on multi-label classification and object counting tasks. We achieve improved performance on both tasks, which demonstrates the effectiveness of the proposed method.
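
The abstract does not give the form of the correlation loss, so the following is only an illustrative sketch of one way the outputs of a classification branch and a counting branch could be tied together. It assumes per-category classification logits and per-category predicted counts, and penalizes disagreement between the presence probability from the classifier and a presence indicator derived from the count. The function name `correlation_penalty` and the clipping rule are hypothetical and are not taken from the paper.

```python
import math


def sigmoid(x):
    """Logistic function mapping a logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def correlation_penalty(class_logits, predicted_counts):
    """Hypothetical consistency term between the two branches.

    This is an assumption for illustration, NOT the loss from the paper.
    For each category it compares:
      - the presence probability from the classification branch, and
      - a presence indicator from the counting branch, i.e. the
        predicted count clipped to the range [0, 1].
    It returns the mean squared difference over categories.
    """
    if len(class_logits) != len(predicted_counts):
        raise ValueError("both branches must cover the same categories")
    if not class_logits:
        raise ValueError("at least one category is required")

    total = 0.0
    for logit, count in zip(class_logits, predicted_counts):
        presence_from_classifier = sigmoid(logit)
        presence_from_count = min(max(count, 0.0), 1.0)
        total += (presence_from_classifier - presence_from_count) ** 2
    return total / len(class_logits)


# Two categories: the branches agree in the first call and disagree in the second.
agree = correlation_penalty([10.0, -10.0], [3.0, 0.0])
disagree = correlation_penalty([10.0, -10.0], [0.0, 3.0])
```

In this sketch the penalty is close to zero when a category the classifier marks as present also has a count of at least one, and grows when the two branches contradict each other. In training, such a term would typically be added to the classification and counting losses with a weighting factor, which is again an assumption here.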
