Abstract

Zero-shot recognition (ZSR) aims to classify objects from categories for which no training samples are available. Most traditional ZSR models use semantic knowledge about familiar categories to represent unfamiliar ones, relying only on the visual appearance of the unseen object. In this research, we consider not only visual information but also context, to enhance the classifier’s cognitive ability in multi-object scenes. We propose a novel method, contextual inference, that uses external resources such as knowledge graphs and semantic embedding spaces to obtain similarity measures between an unseen object and its surrounding objects. Following the intuition that nearby context carries more relevant associations than distant context, each piece of surrounding information is weighted by distance using a newly defined distance calculation formula. We integrated contextual inference into traditional ZSR models to calibrate their visual predictions, and performed extensive experiments on two different datasets for comparative evaluation. The experimental results demonstrate the effectiveness of our method through significant enhancements in performance.
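The calibration idea described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the authors' formulation: the inverse-distance weight stands in for the paper's newly defined distance formula (which is not given in this text), cosine similarity in a toy two-dimensional embedding space stands in for the knowledge-graph or embedding-based similarity measure, and the linear blend controlled by the hypothetical parameter `alpha` stands in for the calibration step. The function names are hypothetical as well.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def distance_weight(target_center, context_center, scale=1.0):
    """Assumed inverse-distance weight: closer context objects count more.

    This is a placeholder, not the distance formula defined in the paper.
    """
    d = math.dist(target_center, context_center)
    return 1.0 / (1.0 + d / scale)


def contextual_scores(candidates, context, embeddings):
    """Distance-weighted mean similarity of each candidate label to the context.

    candidates: unseen class labels
    context:    list of (label, weight) pairs for surrounding objects
    embeddings: dict mapping label -> semantic vector
    """
    total = sum(w for _, w in context)
    scores = {}
    for cand in candidates:
        if total == 0:
            scores[cand] = 0.0
            continue
        weighted = sum(
            w * cosine(embeddings[cand], embeddings[label]) for label, w in context
        )
        scores[cand] = weighted / total
    return scores


def calibrate(visual_probs, ctx_scores, alpha=0.5):
    """Blend visual predictions with contextual scores (assumed linear mix)."""
    return {
        c: (1 - alpha) * visual_probs[c] + alpha * ctx_scores[c]
        for c in visual_probs
    }


# Toy example: the visual model slightly prefers "keyboard", but the nearest
# surrounding object is a giraffe, which is semantically close to "zebra".
embeddings = {
    "zebra": [1.0, 0.0],
    "keyboard": [0.0, 1.0],
    "giraffe": [0.9, 0.1],
    "monitor": [0.1, 0.9],
}
visual_probs = {"zebra": 0.45, "keyboard": 0.55}
target_center = (0.0, 0.0)
surrounding = [("giraffe", (1.0, 0.0)), ("monitor", (10.0, 0.0))]

context = [
    (label, distance_weight(target_center, center)) for label, center in surrounding
]
ctx = contextual_scores(list(visual_probs), context, embeddings)
calibrated = calibrate(visual_probs, ctx)
print(max(calibrated, key=calibrated.get))
```

In this toy scene the nearby giraffe outweighs the distant monitor, so the calibrated ranking favors "zebra" even though the visual score alone favored "keyboard". The choice of `alpha` and of the weighting function would need to follow the paper's actual definitions in any real use.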

Highlights

  • Demand for a wider range of categories in object recognition has grown with the rapid increase in the size and variety of image data and the recent success of large-scale recognition systems [1]

  • A human can recognize an unseen zebra based on visual experiences with a horse and a watermelon, if it is known that a zebra looks like a horse with stripes on its body. In the same way, zero-shot learning (ZSL) methods aim to increase the cognitive capability of a visual classifier: they use annotated training sets of seen class labels, together with external knowledge about the semantic relations between seen and unseen categories, so that the classifier can infer the class labels of novel objects

  • Compared with the existing method, the proposed method offers a fair improvement in predicting the appropriate category, and it more often gives a high ranking to categories related to the category type of the target object, as shown by the first experimental results

Summary

Introduction

Demand for a wider range of categories in object recognition has grown with the rapid increase in the size and variety of image data and the recent success of large-scale recognition systems [1]. A human can recognize an unseen zebra based on visual experiences with a horse and a watermelon, if it is known that a zebra looks like a horse with stripes on its body. In the same way, ZSL methods aim to increase the cognitive capability of a visual classifier: they use annotated training sets of seen class labels, together with external knowledge about the semantic relations between seen and unseen categories, so that the classifier can infer the class labels of novel objects. Most existing ZSL methods focus on learning to recognize inherent visual features (e.g., color, shape, and texture) and on providing a mapping between the visual and semantic representations.

