Abstract

Owing to the success of convolutional neural networks (CNNs) in computer vision, existing methods for near-duplicate image detection typically match global or local CNN features between images. However, global CNN features are not robust to background clutter and partial occlusion, while local CNN features incur high computational cost in the feature matching step. To achieve high efficiency while maintaining good accuracy, we propose a coarse-to-fine feature matching scheme that uses both global and local CNN features for real-time near-duplicate image detection. In the coarse matching stage, we apply sum-pooling to the convolutional feature maps (CFMs) to generate global CNN features, and match these features between a given query image and the database images to efficiently filter out most images that are irrelevant to the query. In the fine matching stage, local CNN features are extracted using the maximum values of the CFMs and the saliency map generated by the graph-based visual saliency detection (GBVS) algorithm. These local CNN features are then matched between images to detect the near-duplicate versions of the query. Experimental results demonstrate that the proposed method not only achieves real-time detection but also provides higher accuracy than state-of-the-art methods.
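
The coarse matching stage can be illustrated with a minimal sketch. It assumes the CFMs are already available as nested lists (one 2D map per channel), and it uses L2 normalization, cosine similarity, and a fixed threshold as illustrative choices; the abstract does not specify the similarity measure or the filtering rule, and the function names `sum_pool`, `cosine`, and `coarse_filter` are hypothetical.

```python
import math

def sum_pool(cfms):
    """Sum-pool each feature map into one value, then L2-normalize.

    cfms: list of 2D lists, one per channel (assumed input format).
    The normalization step is an assumption, not taken from the abstract.
    """
    vec = [sum(sum(row) for row in fmap) for fmap in cfms]
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else vec

def cosine(a, b):
    """Dot product of two L2-normalized vectors, i.e. cosine similarity."""
    return sum(x * y for x, y in zip(a, b))

def coarse_filter(query_cfms, database, threshold=0.8):
    """Keep database images whose global feature is close to the query's.

    database: dict mapping image name -> CFMs.
    threshold: hypothetical cut-off chosen only for illustration.
    """
    q = sum_pool(query_cfms)
    return [name for name, cfms in database.items()
            if cosine(q, sum_pool(cfms)) >= threshold]

if __name__ == "__main__":
    query = [[[1, 0], [0, 1]], [[0, 0], [0, 0]]]
    database = {
        "candidate": [[[2, 0], [0, 2]], [[0, 1], [0, 0]]],
        "unrelated": [[[0, 0], [0, 0]], [[3, 3], [3, 3]]],
    }
    print(coarse_filter(query, database))
```

Only the images that survive this filter would be passed to the more expensive local feature matching. A practical implementation would likely use an array library and precompute the database features offline.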

Highlights

  • With the rapid development of Internet technology and the increasing popularity of mobile devices, it is very easy for users to capture, transmit, and share images over networks

  • To exploit the advantages of both global and local features, we propose a coarse-to-fine feature matching scheme that uses both global and local convolutional neural network (CNN) features for near-duplicate image detection

  • In the fine matching stage, we extract and match the local CNN features between images to find the near-duplicate versions of the query

Summary

Introduction

With the rapid development of Internet technology and the increasing popularity of mobile devices, it is very easy for users to capture, transmit, and share images over networks. In the fine matching stage, we extract and match the local CNN features between images to find the near-duplicate versions of the query. The proposed coarse-to-fine feature matching scheme allows real-time and accurate near-duplicate image detection, which is significant for practical applications of content-based image detection/retrieval. In facial expression recognition and image classification tasks, the introduction of attention mechanisms has led to significant improvements [48,49,50,51]. Motivated by these works, after the global CNN feature matching, we detect the saliency map with the graph-based visual saliency detection (GBVS) algorithm [52] and extract the local CNN features from the local regions surrounding the maximum values of the saliency map.
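
The idea of extracting local features around the maxima of the saliency map can be sketched as follows. This is an illustration under stated assumptions, not the paper's procedure: the saliency map is assumed to be precomputed (for example by GBVS) and resized to the spatial size of the CFMs, and the number of regions `k`, the window `radius`, and the use of sum-pooling inside each window are hypothetical choices. All function names are invented for this example.

```python
def top_k_peaks(saliency, k):
    """Return the (row, col) positions of the k largest saliency values."""
    cells = [(val, r, c)
             for r, row in enumerate(saliency)
             for c, val in enumerate(row)]
    cells.sort(key=lambda t: -t[0])  # stable sort: ties keep row-major order
    return [(r, c) for _, r, c in cells[:k]]

def local_descriptor(cfms, center, radius=1):
    """Sum each feature map over a square window clipped to the map borders."""
    r0, c0 = center
    height, width = len(cfms[0]), len(cfms[0][0])
    rows = range(max(0, r0 - radius), min(height, r0 + radius + 1))
    cols = range(max(0, c0 - radius), min(width, c0 + radius + 1))
    return [sum(fmap[r][c] for r in rows for c in cols) for fmap in cfms]

def extract_local_features(cfms, saliency, k=2, radius=1):
    """One local descriptor per salient peak (k and radius are assumptions)."""
    return [local_descriptor(cfms, peak, radius)
            for peak in top_k_peaks(saliency, k)]

if __name__ == "__main__":
    saliency = [[0.1, 0.2, 0.1],
                [0.2, 0.9, 0.3],
                [0.1, 0.3, 0.8]]
    cfms = [[[1, 1, 1], [1, 1, 1], [1, 1, 1]],
            [[0, 0, 0], [0, 0, 0], [0, 0, 5]]]
    print(extract_local_features(cfms, saliency))
```

Restricting local features to a few salient regions keeps the number of descriptors per image small, which is consistent with the stated goal of making fine matching fast enough for real-time use.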

Related Works
The Proposed Method
CFM Generation
Coarse Matching Stage
The Extraction of Global CNN Feature
Global Feature
Fine Matching Stage
Central Cropping
Local Region Detection
Local Features Extraction and Matching
Experiments
Datasets and Evaluation Criteria
Parameter Determination
Performance When Using Different Pre-Trained Networks
Performances
Performance Comparison
Methods
Findings
Conclusions
