Abstract

Given a large dataset of images, we seek to automatically determine the visually similar object and scene classes together with their image segmentation. To achieve this we combine two ideas: (i) that a set of segmented objects can be partitioned into visual object classes using topic discovery models from statistical text analysis; and (ii) that visual object classes can be used to assess the accuracy of a segmentation. To tie these ideas together we compute multiple segmentations of each image and then: (i) learn the object classes; and (ii) choose the correct segmentations. We demonstrate that such an algorithm succeeds in automatically discovering many familiar objects in a variety of image datasets, including those from Caltech, MSRC and LabelMe.
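The second idea, using discovered classes to assess candidate segmentations, can be illustrated with a small sketch. It assumes each candidate segment is summarized by a histogram of visual-word counts and each discovered class by a distribution over the same vocabulary. The use of KL divergence as the score, the smoothing constant, and all function names below are illustrative assumptions; the abstract does not specify them.

```python
import math

def normalize(counts, smoothing=1e-6):
    """Turn raw visual-word counts into a smoothed probability distribution."""
    total = sum(counts) + smoothing * len(counts)
    return [(c + smoothing) / total for c in counts]

def kl_divergence(p, q):
    """KL(p || q) for two distributions over the same vocabulary."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def best_segment_for_topic(segment_histograms, topic_distribution):
    """Score every candidate segment against one discovered class (topic).

    Lower KL divergence means the segment's word distribution is closer
    to the topic's. This scoring rule is an assumption of this sketch.
    """
    scores = [kl_divergence(normalize(h), topic_distribution)
              for h in segment_histograms]
    best = min(range(len(scores)), key=scores.__getitem__)
    return best, scores

# Hypothetical toy data: one topic dominated by visual word 0,
# and three candidate segments drawn from different segmentations.
topic = normalize([8, 1, 1])
segments = [[4, 4, 4], [9, 1, 0], [0, 5, 5]]
best, scores = best_segment_for_topic(segments, topic)
print(best, scores)
```

In this toy setting the second segment, whose counts concentrate on the same visual word as the topic, receives the lowest divergence and would be the one kept.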

Highlights

  • In [21] we posed the question, given a (Gargantuan) number of images, “Is it possible to learn visual object classes from looking at images?”

  • Images are treated as documents, with each image being represented by a histogram of visual words

  • In this paper we propose to use image segmentation as a way to utilize visual grouping cues to produce groups of related visual words
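As a rough illustration of the "histogram of visual words" representation mentioned above, the sketch below assigns each local descriptor to its nearest vocabulary centre and counts the assignments. The two-dimensional descriptors, the three-word vocabulary, and the function names are hypothetical; a real system would use high-dimensional descriptors and a vocabulary learned by clustering.

```python
from collections import Counter

def nearest_word(descriptor, vocabulary):
    """Return the index of the vocabulary centre closest to the descriptor."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(vocabulary)),
               key=lambda k: sq_dist(descriptor, vocabulary[k]))

def visual_word_histogram(descriptors, vocabulary):
    """Count how many descriptors fall on each visual word."""
    counts = Counter(nearest_word(d, vocabulary) for d in descriptors)
    return [counts.get(k, 0) for k in range(len(vocabulary))]

# Hypothetical toy vocabulary and descriptors for one image (or one segment).
vocabulary = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
descriptors = [(0.1, -0.2), (0.9, 1.2), (1.1, 0.8), (4.7, 5.3)]
histogram = visual_word_histogram(descriptors, vocabulary)
print(histogram)  # [1, 2, 1]
```

The same function can be applied to the descriptors inside a single segment rather than the whole image, which is how a segment would become a "document" for topic discovery.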


Summary

Introduction

In [21] we posed the question, given a (Gargantuan) number of images, “Is it possible to learn visual object classes from looking at images?” Some success has been reported in discovering object and scene categories [7, 17, 21] by borrowing tools from the statistical text analysis community. These tools, such as probabilistic Latent Semantic Analysis (pLSA) [12] and Latent Dirichlet Allocation (LDA) [2], use an unordered “bag of words” representation of documents to automatically discover topics in a large text corpus. To map these techniques onto the visual domain, an equivalent notion of a text word needs to be defined. Applying topic discovery to such a representation is successful in classifying the image, but the resulting object segmentations are “soft” – the discovered objects (or scenes) are shown by highlighting the visual

