Abstract

Scene classification is increasingly popular because of its wide use in real-world applications such as object detection and image retrieval. Traditionally, low-level hand-crafted image representations have been used to describe scene images. However, they usually fail to capture the semantic features of visual concepts, especially in complex scenes. In this paper, we propose a novel high-level image representation that uses image attributes as features for scene classification. More specifically, the attributes of each image are first extracted by a deep convolutional neural network (CNN), which is trained as a multi-label classifier by minimizing an element-wise logistic loss function. Generating attributes in this way reduces the “semantic gap” between the low-level feature representation and the high-level scene meaning. Based on the attributes, we then build a system to discover semantically meaningful descriptions of the scene classes. Extensive experiments on four large-scale scene classification datasets show that our proposed algorithm considerably outperforms other state-of-the-art methods.
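
The element-wise logistic loss used to train the multi-label attribute classifier can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes the loss is the standard sigmoid cross-entropy applied independently to each attribute and averaged over attributes, and the names `elementwise_logistic_loss`, `predict_attributes`, and the 0.5 threshold are hypothetical choices made for this example.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def elementwise_logistic_loss(logits, labels):
    """Mean per-attribute logistic loss for one image.

    logits: raw scores, one per attribute (assumed CNN outputs)
    labels: 0/1 indicators, one per attribute
    """
    if len(logits) != len(labels):
        raise ValueError("logits and labels must have the same length")
    total = 0.0
    for z, y in zip(logits, labels):
        # Stable form of -[y*log(sigmoid(z)) + (1-y)*log(1-sigmoid(z))]
        total += max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))
    return total / len(logits)


def predict_attributes(logits, threshold=0.5):
    """Mark an attribute as present when its probability reaches the threshold."""
    return [1 if sigmoid(z) >= threshold else 0 for z in logits]
```

Because each attribute is scored independently, an image can carry several attributes at once, which is what distinguishes this multi-label setup from a single-label softmax classifier.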
