Abstract

The explosion of images on the Web has led to a number of efforts to organize images semantically and compile collections of visual knowledge. While there has been enormous progress on categorizing entire images or bounding boxes, only a few studies have targeted fine-grained image understanding at the level of specific shape contours. For example, given an image of a cat, we would like a system not merely to recognize the existence of a cat, but also to distinguish between the cat’s legs, head, tail, and so on. In this paper, we present ShapeLearner, a system that acquires such visual knowledge about object shapes and their parts. ShapeLearner jointly learns this knowledge from sets of segmented images. The space of label and segmentation hypotheses is pruned and then evaluated using Integer Linear Programming. ShapeLearner places the resulting knowledge in a semantic taxonomy based on WordNet and is able to exploit this hierarchy in order to analyze new kinds of objects that it has not observed before. We conduct experiments using a variety of shape classes from several representative categories and demonstrate the accuracy and robustness of our method.

Highlights

  • Over the last decade, we have observed an explosion in the number of images that are uploaded to the Internet.

  • Current object recognition systems mostly operate at the coarse-grained level of entire images or of rectangular image bounding boxes, while segmentation algorithms tend to consider abstract distinctions such as between foreground and background.

  • We consider the level of image understanding, aiming at a more fine-grained understanding of images by automatically identifying specific shape contours and the parts of objects that they portray.

Summary

INTRODUCTION

Motivation. Over the last decade, we have observed an explosion in the number of images that are uploaded to the Internet. Sharing platforms such as Flickr and Facebook have long been driving forces in turning previously undistributed digital images into an abundant resource with tens of billions of images. Given a relatively small object part, humans can recognize the object when the part is sufficiently unique (Binford, 1971; Biederman, 1987). Such finer-grained image understanding has remained an open problem in computing, as it requires considerable background knowledge about the objects. The core operation consists of a joint shape classification, segmentation, and annotation procedure. To solve this challenging central task, ShapeLearner automatically transfers visual knowledge of seen shapes to unseen images, accounting for both geometric and semantic similarity. This hierarchical organization is critical when jointly analyzing families of objects, due to the high degree of geometric variability of shapes at different levels of granularity.
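
The idea of transferring part knowledge from seen to unseen classes through a semantic hierarchy can be illustrated with a small sketch. This is not ShapeLearner's implementation: the toy taxonomy, the part lists, and the function names (`ancestors`, `semantic_distance`, `nearest_known_class`, `transfer_parts`) are all hypothetical, a hand-written dictionary stands in for WordNet, and geometric similarity is ignored entirely. It only shows how an unobserved class might borrow part labels from its closest known relative in a taxonomy.

```python
# Toy hypernym taxonomy (child -> parent); a hypothetical stand-in for WordNet.
HYPERNYM = {
    "cat": "feline",
    "lion": "feline",
    "feline": "mammal",
    "dog": "canine",
    "canine": "mammal",
    "mammal": "animal",
}

# Hypothetical part knowledge already acquired for some classes.
KNOWN_PARTS = {
    "cat": ["head", "legs", "tail"],
    "dog": ["head", "legs", "tail"],
}


def ancestors(cls):
    """Return cls followed by its hypernyms, nearest first."""
    chain = [cls]
    while chain[-1] in HYPERNYM:
        chain.append(HYPERNYM[chain[-1]])
    return chain


def semantic_distance(a, b):
    """Path length between a and b via their lowest common ancestor.

    Returns None when the two classes share no ancestor in the toy taxonomy.
    """
    chain_a, chain_b = ancestors(a), ancestors(b)
    for steps_up_a, node in enumerate(chain_a):
        if node in chain_b:
            return steps_up_a + chain_b.index(node)
    return None


def nearest_known_class(unseen):
    """Pick the known class closest to `unseen` in the taxonomy, if any."""
    scored = [(semantic_distance(unseen, known), known) for known in KNOWN_PARTS]
    scored = [(dist, known) for dist, known in scored if dist is not None]
    return min(scored)[1] if scored else None


def transfer_parts(unseen):
    """Borrow the part labels of the semantically nearest known class."""
    source = nearest_known_class(unseen)
    return KNOWN_PARTS[source] if source else []


if __name__ == "__main__":
    print(nearest_known_class("lion"))
    print(transfer_parts("lion"))
```

In this toy setting, a class with no relative in the taxonomy simply receives no part labels. A fuller treatment would weigh this semantic distance against a geometric similarity score between contours, as the summary describes.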

Image Knowledge Harvesting
Segmentations and Semantic Relationships
High-Level Perspective
ShapeLearner’s Knowledge
SHAPE ANALYSIS
Shape Segmentation Hypotheses
Shape Inference
Shape-Class and Part-Label Hypotheses
Baselines
RESULTS
Labeling Accuracy
Evaluation and Comparison
Method