Abstract

Deep convolutional neural networks (CNNs) have proven to be a powerful tool for extracting discriminative local descriptors for image description. Much related work suggests that an effective aggregation scheme for deep convolutional features is particularly important in forming robust and compact image representations. In this paper, a new robust global descriptor for image retrieval is proposed, built on an effective method that aggregates local deep convolutional features by sum-pooling over multiple regions, weighted by regional significance and channel sensitivity. The proposed aggregation method exploits multiple scales and accounts for both the varying significance of regional visual content and the sparsity and intensity of response values within each channel, which improves the descriptive and discriminative power of the deep features. Experimental results on six benchmark datasets demonstrate that our method achieves retrieval performance comparable to that of popular deep feature aggregation approaches, without fine-tuning strategies or multiple image inputs.
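
As a concrete illustration of the aggregation pipeline described above, the following is a minimal NumPy sketch of regionally and channel-weighted sum-pooling. The uniform-grid regions, the CroW-style sparsity-based channel weights, and the activation-energy regional weights are illustrative assumptions, not the paper's exact formulations.

```python
import numpy as np

def aggregate_features(feats, scales=(1, 2, 3), eps=1e-8):
    """Aggregate a CNN feature map (C, H, W) into a global descriptor.

    Sketch of weighted regional sum-pooling: channel weights follow a
    CroW-style sparsity measure; regional weights follow each region's
    share of the total activation energy. Both weightings are assumed
    here for illustration and may differ from the paper's definitions.
    """
    C, H, W = feats.shape

    # Channel sensitivity: channels whose responses are sparse are
    # boosted, in the spirit of inverse-document-frequency weighting.
    density = (feats > 0).reshape(C, -1).mean(axis=1)   # per-channel response density
    channel_w = np.log((density.sum() + eps) / (density + eps))

    descriptor = np.zeros(C)
    for s in scales:                      # uniform s x s grids as multi-scale regions
        hs, ws = H // s, W // s
        for i in range(s):
            for j in range(s):
                region = feats[:, i * hs:(i + 1) * hs, j * ws:(j + 1) * ws]
                pooled = region.sum(axis=(1, 2))        # per-channel sum-pooling
                # Regional significance: share of total activation energy.
                region_w = region.sum() / (feats.sum() + eps)
                descriptor += region_w * channel_w * pooled

    return descriptor / (np.linalg.norm(descriptor) + eps)

# Example: activations shaped like the last conv layer of a CNN.
feats = np.maximum(np.random.randn(512, 37, 50), 0)  # ReLU-like feature map
global_desc = aggregate_features(feats)              # unit-norm 512-D descriptor
```

In practice, the input feature map would come from the last convolutional layer of a pretrained network (e.g., VGG16), and the resulting unit-norm descriptors would be compared by cosine similarity for retrieval.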
