Abstract

Automatic image annotation is a key technology in image understanding and pattern recognition, and is becoming increasingly important for annotating large-scale image collections. In the past decade, nearest neighbor-based AIA (automatic image annotation) methods have proved to be the most successful among the classical models. This family of models faces four major challenges: the semantic gap, label-imbalance, a wide range of labels, and weak-labeling. In this paper, we propose a novel annotation model based on a three-pass KNN (k-nearest neighbor) scheme to address these challenges. The key idea is to identify appropriate neighbors at each KNN pass. In the first pass, we identify the most relevant categories based on label features rather than the visual features used in traditional models. In the second pass, we determine the relevant images based on multi-modal (visual and textual label) embedding features. Since the test image has not yet been annotated with any label, we propose a pre-annotation strategy before image annotation to raise its semantic level. In the third pass, we capture relevant labels from semantically and visually similar images and propagate them to the given unlabeled image. In contrast to traditional nearest neighbor-based methods, our method inherently alleviates the problems of the semantic gap, label-imbalance, and the wide range of labels. In addition, to alleviate the issue of weak-labeling, we propose label refinement for the training images. Extensive experiments on three classical benchmark datasets and MS-COCO demonstrate that the proposed method significantly outperforms the state of the art in terms of per-label and per-image metrics.
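The three passes can be summarized in code. Below is a minimal NumPy sketch under stated assumptions: the test image has already received a pre-annotated label feature `x_lab`, each category is represented by the mean label feature of its images (`cat_means`), and training images live in a common multi-modal (KCCA) embedding `train_emb`. All names, the Euclidean distance, and the voting rule are illustrative choices, not the paper's implementation.

```python
import numpy as np

def l2_knn(query, pool, k):
    """Return indices of the k rows of `pool` closest to `query` (Euclidean)."""
    dists = np.linalg.norm(pool - query, axis=1)
    return np.argsort(dists)[:k]

def three_pass_annotate(x_lab, x_emb, cat_means, cat_members,
                        train_emb, train_labels, k1=3, k2=10, n_out=5):
    """Annotate one test image with its top n_out labels (hypothetical sketch).

    x_lab        : pre-annotated label feature of the test image
    x_emb        : multi-modal (KCCA) embedding of the test image
    cat_means    : (C, d_lab) per-category mean label features
    cat_members  : list of index arrays, training images per category
    train_emb    : (N, d_emb) multi-modal embeddings of training images
    train_labels : (N, L) binary label matrix of training images
    """
    # Pass 1: the k1 categories whose mean label feature is closest
    # to the test image's pre-annotated label feature.
    cats = l2_knn(x_lab, cat_means, k1)

    # Pass 2: the k2 training images inside those categories that are
    # nearest in the common multi-modal embedding space.
    candidates = np.concatenate([cat_members[c] for c in cats])
    neighbors = candidates[l2_knn(x_emb, train_emb[candidates], k2)]

    # Pass 3: propagate the labels most frequent among the neighbors.
    votes = train_labels[neighbors].sum(axis=0)
    return np.argsort(votes)[::-1][:n_out]
```

Restricting pass 2 to the categories selected in pass 1 is what lets the method favor semantically relevant neighbors over merely visually similar ones, which is how it sidesteps the semantic gap.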

Highlights

  • With the prevalence of digital photography and social networks in our daily lives, billions of images are generated and shared on the Internet

  • Significant advances have been achieved on large-scale image recognition tasks [8] with deep learning models such as the Convolutional Neural Network (CNN) and the Generative Adversarial Network (GAN)

  • To resolve the problems of weak-labeling and label-imbalance, we propose a novel image annotation method based on nearest neighbors

Summary

INTRODUCTION

With the prevalence of digital photography and social networks in our daily lives, billions of images are generated and shared on the Internet. To resolve the problems of weak-labeling and label-imbalance, we propose a novel image annotation method based on nearest neighbors. Rather than working in the traditional visual feature space, our method refines the labels of all training images in the label feature space, which inherently addresses the problem of the semantic gap. The method maps visual feature vectors extracted by a deep learning architecture (pre-trained VGG-16) and refined label vectors into a common feature space via the KCCA model.

LABEL REFINEMENT

To alleviate the shortcoming of weak-labeling, most methods devise sophisticated models that incur expensive time and space costs in the annotation process. Tang et al. proposed a tri-clustered tensor completion framework that collaboratively explores three kinds of information to improve the performance of social image tag refinement [32], and later a Social anchor-Unit GrAph Regularized Tensor Completion (SUGAR-TC) method that efficiently refines the tags of social images and is insensitive to the scale of the data [33]. The label feature of a category $k$ is defined as the mean of the label features of all images in that category:

$$\mathbf{c}_k = \frac{1}{|\mathcal{I}_k|} \sum_{i \in \mathcal{I}_k} \mathbf{y}_i,$$

where $\mathcal{I}_k$ denotes the set of training images in category $k$ and $\mathbf{y}_i$ is the label feature vector of image $i$.
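As a concrete reading of this definition, the per-category means can be computed as below (a minimal NumPy sketch; the function and variable names are illustrative assumptions, not the paper's code):

```python
import numpy as np

def category_label_means(label_feats, cat_members):
    """c_k = (1/|I_k|) * sum over i in I_k of y_i, for every category k.

    label_feats : (N, d) refined label feature vectors y_i of training images
    cat_members : list of index arrays I_k, one per category
    """
    return np.stack([label_feats[idx].mean(axis=0) for idx in cat_members])
```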

FEATURE EXTRACTION AND REPRESENTATION
LABEL PROPAGATION BASED ON MULTI-LEVEL SEMANTIC NEIGHBORHOODS
EVALUATION METRICS
IMPLEMENTATION DETAILS
Findings
CONCLUSION AND FUTURE WORK