An Efficient and Effective Model Based on Mean Positive Examples for Social Image Annotation

Haiyu Song, Hailin Lv, Jinxing Yao, Houjie Li, Anqi Fang, Jian Yun, Mingxiao Zheng

doi:10.1109/access.2020.3039625

Abstract

Nowadays, with the rapid growth of imaging and social networks, huge volumes of image data are produced and shared on social media. Social image annotation, which facilitates large-scale image retrieval, indexing, and management, has become an important and challenging task in computer vision and machine learning. Its four main challenges are the semantic gap, tag refinement, label imbalance, and annotation efficiency. To address these issues, we propose an efficient and effective annotation method based on the Mean of Positive Examples (MPE) corresponding to each label. First, we refine user-provided noisy tags with our proposed local smoothing process and treat the refined tags as key features, in contrast to previous methods that treat them as side information; this significantly improves annotation performance. Second, we propose a weighted trans-media similarity measure that fuses all modality information when identifying proper neighbors, which raises the semantic level and eases image annotation. Third, our MPE model gives equal importance to all labels, thus improving the annotation performance of infrequent labels without sacrificing that of frequent labels. Fourth, our MPE model dramatically decreases space and time overheads, since the time cost of annotating an image is unaffected by the size of the training image dataset and depends instead on the size of the label vocabulary. Therefore, our proposed method can be applied to real-world large-scale online social image repositories. Extensive experiments on two benchmark datasets demonstrate the effectiveness and efficiency of our MPE model.
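The core idea in the abstract, one representative vector per label built from the mean of that label's positive examples, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy 2-D features, and the use of Euclidean distance are assumptions made for illustration, and the features are assumed to be already embedded in a joint space (the paper uses a CCA space).

```python
from math import sqrt


def mean_positive_examples(features, label_sets, vocabulary):
    """Build one prototype per label: the mean of its positive examples.

    Every label gets exactly one prototype, regardless of how many
    training images carry it (hypothetical helper, for illustration).
    """
    dim = len(features[0])
    means = {}
    for label in vocabulary:
        positives = [f for f, labels in zip(features, label_sets) if label in labels]
        if not positives:
            continue  # label never observed in training; no prototype
        means[label] = [sum(f[d] for f in positives) / len(positives) for d in range(dim)]
    return means


def euclidean(a, b):
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def annotate(query, means, top_k=2):
    """Rank labels by distance from the query to each label prototype.

    The loop runs over the label vocabulary only, so the cost per image
    does not grow with the number of training images.
    """
    ranked = sorted(means, key=lambda label: euclidean(query, means[label]))
    return ranked[:top_k]


# Toy data (assumed): 2-D features and small label sets.
train_features = [[0.0, 0.0], [0.2, 0.0], [1.0, 1.0], [0.8, 1.0], [0.0, 1.0]]
train_labels = [{"sky"}, {"sky", "sea"}, {"dog"}, {"dog"}, {"sea"}]
vocab = ["sky", "sea", "dog"]

means = mean_positive_examples(train_features, train_labels, vocab)
print(annotate([0.1, 0.1], means, top_k=2))  # ['sky', 'sea']
```

Once the prototypes are computed, the training features are no longer needed at annotation time, which is consistent with the space and time savings claimed in the abstract.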

Highlights

With the ever-growing popularity of digital photography and social media (e.g., Facebook, YouTube, Twitter, and WeChat), billions of social images associated with user-provided tags are generated and shared on social networks, which has made social image annotation one of the most important problems in the fields of computer vision and machine learning [1].
To tackle the above problems, we propose a novel annotation approach based on representative features of labels, which are generated as the mean vectors of positive examples in a canonical correlation analysis (CCA) space jointly mapped from visual features and textual tag features.
Canonical correlation analysis: several models exploit the direct correlations of multi-modal information and embed the visual features and tag features into a common space based on canonical correlation analysis (CCA) or kernel CCA (KCCA), aiming to reduce the semantic gap and improve annotation performance [6], [34].

Summary

INTRODUCTION

With the ever-growing popularity of digital photography and social media (e.g., Facebook, YouTube, Twitter, and WeChat), billions of social images associated with user-provided tags are generated and shared on social networks, which has made social image annotation one of the most important problems in the fields of computer vision and machine learning [1].

CANONICAL CORRELATION ANALYSIS

Several models exploit the direct correlations of multi-modal information and embed the visual features and tag features into a common space based on canonical correlation analysis (CCA) or kernel CCA (KCCA), aiming to reduce the semantic gap and improve annotation performance [6], [34]. These approaches demonstrate the benefit of exploiting side information and multi-modal correlations between visual features and labels, but they rely only on ground truth annotations [11].

Where V(I) and V(J) represent the visual feature vectors of image I and image J, respectively, K1 and K2 are transform factors, and T(I) and T(J) represent the tag feature vectors of image I and image J, respectively.
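The symbols defined above (visual vectors V, tag vectors T, and factors K1 and K2) point to a measure that combines visual and tag information when comparing two images. The equation itself does not appear in this summary, so the sketch below assumes a simple weighted sum of Euclidean distances purely for illustration; the combination rule, the function names, and the toy values are all hypothetical and may differ from the paper's definition.

```python
from math import sqrt


def euclidean(a, b):
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def fused_distance(img_i, img_j, k1, k2):
    """Assumed form: k1 * d(V(I), V(J)) + k2 * d(T(I), T(J)).

    This weighted sum is an illustrative guess, not the paper's equation.
    """
    visual_part = euclidean(img_i["visual"], img_j["visual"])
    tag_part = euclidean(img_i["tags"], img_j["tags"])
    return k1 * visual_part + k2 * tag_part


def nearest(query, candidates, k1, k2):
    """Return the name of the candidate with the smallest fused distance."""
    return min(candidates, key=lambda name: fused_distance(query, candidates[name], k1, k2))


# Toy images (assumed): A is visually close but tagged differently,
# B is visually farther but shares the query's tag vector.
query = {"visual": [0.0, 0.0], "tags": [1.0, 0.0]}
candidates = {
    "A": {"visual": [0.0, 0.1], "tags": [0.0, 1.0]},
    "B": {"visual": [0.6, 0.8], "tags": [1.0, 0.0]},
}

print(nearest(query, candidates, k1=1.0, k2=0.0))  # visual only: A
print(nearest(query, candidates, k1=0.5, k2=0.5))  # visual + tags: B
```

With visual features alone the visually similar image wins; once tag information carries weight, the neighbor that agrees on tags is preferred, which is the kind of effect a fused similarity is meant to produce.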

TAG REFINEMENT BASED ON LOCAL SMOOTHING

EXPERIMENT

DATASETS

EVALUATION METRICS

EFFECTIVENESS EVALUATION FOR MODEL COMPONENTS

Findings

CONCLUSION AND FUTURE WORK

Journal: IEEE Access	Publication Date: Jan 1, 2020
Citations: 35	License type: CC BY 4.0

Similar Papers

Featured correspondence topic model for semantic search on social image collections
Nguyen Anh Tu ... Young-Koo Lee
Expert Systems With Applications | VOL. 77
31 Jan 2017

A Weighted Topic Model Learned From Local Semantic Space for Automatic Image Annotation
Haiyu Song ... Gang Wu
IEEE Access | VOL. 8
01 Jan 2020

SIRE
Steven C.H Hoi ... Pengcheng Wu
28 Nov 2011

The social media image
Nadav Hochman
Big Data & Society | VOL. 1
01 Jul 2014
