Abstract

Convolutional neural networks (CNNs) have achieved great success in visual recognition, owing largely to the availability of large-scale image datasets such as ImageNet. However, transferring convolutional features to challenging scene recognition tasks remains an open problem. On the one hand, multiple non-linear transforms endow the convolutional features with abundant information. On the other hand, CNNs are adept at capturing the holistic appearance of scenes, but the lack of some critical local details may reduce recognition accuracy. To address these problems, we propose a novel hierarchical coding algorithm to learn effective representations. To adapt to scale variations, patches of various scales are sampled from the whole image to provide sufficient local detail. A non-negative sparse decomposition (NNSD) model based on convolutional features is proposed to learn the sharable components for each scale and to produce global signatures. Based on these global signatures, inter-class linear coding (ICLC) is proposed to learn the discriminative components and the final image representations. Experimental results indicate that our approach significantly improves recognition accuracy compared with general CNN models and achieves excellent performance on five standard benchmarks.
