Abstract

Recently, image attributes containing high-level semantic information have been widely used in computer vision tasks, including visual recognition and image captioning. Existing attribute extraction methods map visual concepts to the probabilities of frequently used words by directly applying Convolutional Neural Networks (CNNs). Two main problems exist in these methods. First, words of different parts of speech (POSs) are handled in the same way, yet non-nominal words can hardly be mapped to visual regions through CNNs alone. Second, synonymous nominal words are treated as independent, distinct words, so the similarities between them are ignored. In this paper, a novel Refined Universal Detection (RUDet) method is proposed to solve these two problems. Specifically, a Refinement (RF) module is designed to extract refined attributes of non-nominal words based on the attributes of nominal words and visual features. In addition, a Word Tree (WT) module is constructed to integrate synonymous nouns, which ensures that similar words hold similar and more accurate probabilities. Moreover, a Feature Enhancement (FE) module is adopted to enhance the ability to mine different visual concepts at different scales. Experiments conducted on the large-scale Microsoft (MS) COCO dataset illustrate the effectiveness of the proposed method.

Highlights

  • Attribute extraction is an important process in various computer vision tasks

  • Quantitative Results: Referring to Reference [3], the metric Average Precision (AP) for multi-label classification problems is used in the evaluation

  • The precision and recall values are computed from the numbers of true positive, false positive, and false negative instances at different probability thresholds
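The highlights above describe Average Precision (AP) for multi-label attribute evaluation: precision and recall are traced out by sweeping a probability threshold, and AP summarizes the resulting curve. A minimal sketch of this computation for a single label (not the paper's exact evaluation code, which is not given here) could look as follows:

```python
import numpy as np

def average_precision(y_true, y_score):
    """Average precision for one label: precision averaged over the
    positions at which each positive instance is recalled, obtained
    by sweeping a descending score threshold."""
    order = np.argsort(-np.asarray(y_score))   # sort by descending score
    y_true = np.asarray(y_true)[order]
    tp = np.cumsum(y_true)                     # true positives at each threshold
    fp = np.cumsum(1 - y_true)                 # false positives at each threshold
    precision = tp / (tp + fp)
    n_pos = max(int(y_true.sum()), 1)          # guard against labels with no positives
    # Sum precision at each newly recalled positive, then normalize
    return float(np.sum(precision * y_true) / n_pos)

# Toy example: 4 instances, 2 of which carry the attribute
print(average_precision([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.2]))  # -> 0.8333...
```

For a multi-label problem, the per-label AP values are typically averaged over all attribute words to obtain a single mean score.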


Summary

Introduction

Attribute extraction is an important process in various computer vision tasks. For example, high probabilities for the words “man”, “food”, “eating”, and “delicious” indicate that the image probably shows a man eating delicious food; this illustrates that attributes carry high-level semantic information. The application of attributes is of paramount importance in image captioning, the task of generating natural sentence descriptions for a given image based on the objects in the image together with their actions and relationships. Recent work shows that attributes containing high-level semantic information can significantly improve the performance of caption generation [3,4].
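To make the notion of attributes as word probabilities concrete, the following sketch shows a multi-label prediction head that maps a visual feature vector to independent per-word probabilities. The vocabulary, feature dimension, and linear head here are illustrative assumptions, not the paper's actual RUDet architecture:

```python
import numpy as np

# Hypothetical attribute vocabulary (illustrative, not the paper's word list)
VOCAB = ["man", "food", "eating", "delicious"]

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def attribute_probabilities(features, weights, bias):
    """Map a visual feature vector to per-word probabilities.
    Each word gets an independent sigmoid score (multi-label),
    so several attributes can be active in the same image."""
    return sigmoid(features @ weights + bias)

rng = np.random.default_rng(0)
feats = rng.standard_normal(512)                    # stand-in for CNN features
W = rng.standard_normal((512, len(VOCAB))) * 0.05   # toy, untrained weights
b = np.zeros(len(VOCAB))

probs = attribute_probabilities(feats, W, b)
for word, p in zip(VOCAB, probs):
    print(f"{word}: {p:.2f}")
```

Because sigmoids are applied per word rather than a softmax over the vocabulary, the probabilities need not sum to one, which matches the multi-label nature of attribute extraction.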


