Partial-duplicate Image Retrieval Research Articles

Existing Masked Image Modeling methods apply fixed mask patterns to guide self-supervised training. Because those mask patterns rely on different criteria to depict image contents, sticking to a single fixed pattern limits the model's capability to learn visual cues. This paper introduces an evolved hierarchical masking method to pursue general visual cue modeling in self-supervised learning. The proposed method leverages the vision model being trained to parse the input visual cues into a hierarchical structure, which is then used to generate masks. The accuracy of the hierarchy is on par with the capability of the model being trained, so the mask patterns evolve across training stages. Initially, the generated masks focus on low-level visual cues to grasp basic textures; they then gradually evolve to depict higher-level cues, reinforcing the learning of more complicated object semantics and contexts. Our method does not require extra pre-trained models or annotations and ensures training efficiency by evolving the training difficulty. We conduct extensive experiments on seven downstream tasks, including partial-duplicate image retrieval, which relies on low-level details, as well as image classification and semantic segmentation, which require semantic parsing capability. Experimental results demonstrate that the method substantially boosts performance across these tasks. For instance, it surpasses the recent MAE by 1.1% in ImageNet-1K classification and 1.4% in ADE20K segmentation with the same number of training epochs. We also align the proposed method with the current research focus on LLMs. The proposed approach bridges the gap with large-scale pre-training on semantically demanding tasks and enhances the perception of intricate details in tasks requiring low-level feature recognition.

Most recent partial-duplicate image retrieval approaches build on the bag-of-visual-words (BOW) model, in which local image features are quantized into a bag of compact visual words, i.e., the BOW representation, for fast image matching. However, due to the quantization errors in visual words, the BOW representation shows low discriminability, which negatively affects retrieval accuracy. Encoding contextual clues into the BOW representation is a popular technique for improving its discriminability. Unfortunately, the captured contextual clues are generally not stable or informative enough, resulting in limited improvement in discriminability. To address these issues, we propose a multiple contextual clue encoding approach for partial-duplicate image retrieval. Treating each visual word of any given query or database image as a center, we first propose an asymmetrical context selection strategy that selects the contextual visual words for the query and database images differently. Then, we capture multiple contextual clues: the geometric relationships, the visual relationships, and the spatial configurations between the center and its contextual visual words. These contextual clues are compressed to generate multi-contextual descriptors, which are further integrated with the center visual word to improve the discriminability of the BOW representation. Experiments conducted on a large-scale partial-duplicate image dataset demonstrate that the proposed approach provides higher retrieval accuracy than state-of-the-art methods while achieving comparable time and space efficiency.
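To make the BOW idea in the abstract above more concrete, the following is a minimal sketch, not the authors' method. It quantizes local descriptors to their nearest visual word and then requires two features to share both the center word and at least one contextual word before counting as a match. All names (`quantize`, `contextual_words`, `is_match`), the toy 2-D descriptors, and the choice of "k spatially nearest keypoints" as context are assumptions made for illustration; the paper's asymmetrical selection strategy and its multi-contextual descriptors are not specified in the abstract and are only loosely mimicked here by using a different `k` for query and database features.

```python
import math
from collections import Counter


def quantize(descriptor, vocabulary):
    """Return the index of the nearest visual word (Euclidean distance)."""
    return min(range(len(vocabulary)),
               key=lambda i: math.dist(descriptor, vocabulary[i]))


def bow_histogram(descriptors, vocabulary):
    """Count how often each visual word occurs in one image."""
    return Counter(quantize(d, vocabulary) for d in descriptors)


def contextual_words(center_idx, keypoints, words, k):
    """Visual words of the k keypoints spatially nearest to the center.

    Assumption: 'context' here is plain spatial proximity; the paper's
    selection strategy is not described in the abstract.
    """
    center = keypoints[center_idx]
    others = [i for i in range(len(keypoints)) if i != center_idx]
    others.sort(key=lambda i: math.dist(center, keypoints[i]))
    return frozenset(words[i] for i in others[:k])


def is_match(word_q, ctx_q, word_d, ctx_d, min_shared=1):
    """Match only if center words agree and the contexts overlap enough."""
    return word_q == word_d and len(ctx_q & ctx_d) >= min_shared


# Toy vocabulary of three visual words and four 2-D "descriptors".
vocabulary = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
descriptors = [(0.1, 0.2), (0.9, 1.1), (4.8, 5.3), (0.0, 0.1)]
keypoints = [(10, 10), (12, 10), (50, 50), (11, 14)]

words = [quantize(d, vocabulary) for d in descriptors]
histogram = bow_histogram(descriptors, vocabulary)

# Hypothetical asymmetry: a smaller context for the query (k=1) than for
# the database side (k=2).
ctx_query = contextual_words(0, keypoints, words, k=1)
ctx_database = contextual_words(0, keypoints, words, k=2)
```

Requiring contextual overlap in addition to an identical center word is one simple way to reject false matches caused by quantization error, at the cost of storing a small amount of extra information per indexed feature.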

Articles published on Partial-duplicate Image Retrieval

Evolved Hierarchical Masking for Self-Supervised Learning

Partial-duplicate image retrieval using spatial and visual contextual clues

Partial-duplicate image retrieval based on HSV colour space for coverless information hiding

Visual vocabulary tree-based partial-duplicate image retrieval for coverless image steganography

Encoding multiple contextual clues for partial-duplicate image retrieval

Fusing multi-cues description for partial-duplicate image retrieval

Cross-Indexing of Binary SIFT Codes for Large-Scale Image Search

Robust Spatial Consistency Graph Model for Partial Duplicate Image Retrieval
