Disambiguating multi-modal scene representations using perceptual grouping constraints.

Nicolas Pugeault, Florentin Wörgötter, Norbert Krüger

doi:10.1371/journal.pone.0010663

Abstract

In its early stages, the visual system suffers from considerable ambiguity and noise, which severely limit the performance of early vision algorithms. This article presents feedback mechanisms between early visual processes, such as perceptual grouping, stereopsis and depth reconstruction, that allow the system to reduce this ambiguity and improve the early representation of visual information. In the first part, the article proposes a local perceptual grouping algorithm that, in addition to commonly used geometric information, makes use of a novel multi-modal measure between local edge/line features. The grouping information is then used to: 1) disambiguate stereopsis by enforcing that stereo matches preserve groups; and 2) correct the reconstruction error due to image pixel sampling using a linear interpolation over the groups. The integration of mutual feedback between early vision processes is shown to considerably reduce ambiguity and noise without the need for global constraints.
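To make the idea of a grouping measure that mixes geometric and multi-modal cues concrete, here is a minimal sketch. It is not the measure defined in the article: the `Primitive` fields, the collinearity proxy, the colour distance and the weight `w_geom` are all assumptions chosen for illustration only.

```python
import math
from dataclasses import dataclass


@dataclass
class Primitive:
    """Hypothetical local edge/line feature (field names are assumptions)."""
    x: float
    y: float
    theta: float   # edge orientation in radians, defined modulo pi
    color: tuple   # mean (r, g, b) near the edge, each channel in [0, 1]


def geometric_affinity(a: Primitive, b: Primitive) -> float:
    """Close to 1 when both orientations align with the line joining a and b."""
    link = math.atan2(b.y - a.y, b.x - a.x)
    # abs(cos) treats orientations that differ by pi as identical.
    return abs(math.cos(a.theta - link)) * abs(math.cos(b.theta - link))


def color_affinity(a: Primitive, b: Primitive) -> float:
    """1 for identical colours, 0 for opposite corners of the RGB cube."""
    return 1.0 - math.dist(a.color, b.color) / math.sqrt(3.0)


def affinity(a: Primitive, b: Primitive, w_geom: float = 0.5) -> float:
    """Weighted mix of the geometric and colour cues (the weight is an assumption)."""
    return w_geom * geometric_affinity(a, b) + (1.0 - w_geom) * color_affinity(a, b)
```

Under these assumptions, two collinear primitives with the same colour score close to 1, while a primitive that is perpendicular to the linking line and has a very different colour scores close to 0. A grouping step could then link pairs whose affinity exceeds a chosen threshold.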

Highlights

Both human and machine perception involve a progressive abstraction of visual information, from the raw signal provided by the eyes or the cameras towards symbolic, object-centric representations [1].
A large amount of work on signal processing and invariant feature descriptors [3] has led to significant progress on tasks such as navigation [4] and object recognition [5].
The contributions of this paper are threefold: first, we propose a local perceptual grouping mechanism that makes full use of the multi-modal and semantic information carried by the visual primitives; second, we propose a stereo matching scheme for primitives, allowing for the reconstruction of the 3D equivalent of 2D primitives; third, we investigate how perceptual grouping reduces ambiguities in the reconstructed 3D representation.
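The idea that stereo matches should preserve groups can be illustrated with a small sketch. The voting heuristic below is an assumption made for illustration and is not the matching scheme of the paper: each left-image group votes for the right-image group that most of its candidate matches fall into, and candidates landing elsewhere are discarded. The function name and the dictionary-based inputs are hypothetical.

```python
from collections import Counter


def filter_matches_by_group(candidates, left_group, right_group):
    """Keep only stereo candidates consistent with the dominant group pairing.

    candidates:  dict mapping a left primitive id to a list of right primitive ids
    left_group:  dict mapping a left primitive id to its group label
    right_group: dict mapping a right primitive id to its group label
    """
    # Each left group accumulates votes for the right groups its candidates hit.
    votes = {}
    for left_id, right_ids in candidates.items():
        counter = votes.setdefault(left_group[left_id], Counter())
        for right_id in right_ids:
            counter[right_group[right_id]] += 1

    kept = {}
    for left_id, right_ids in candidates.items():
        counter = votes[left_group[left_id]]
        if not counter:
            kept[left_id] = []  # no candidate at all for this left group
            continue
        winner = counter.most_common(1)[0][0]
        kept[left_id] = [r for r in right_ids if right_group[r] == winner]
    return kept
```

For example, if three primitives of one left group have candidates mostly in the same right group, a stray candidate in a different right group is dropped, which removes an ambiguous match using only local grouping information.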

Summary

Introduction

Both human and machine perception involve a progressive abstraction of visual information, from the raw signal provided by the eyes or the cameras towards symbolic, object-centric representations [1]. One notable attempt, by Nevatia and colleagues [6,7], makes use of a feature hierarchy for stereo reconstruction. Another notable class of systems is model-based vision, in which a large amount of world knowledge is available and is used to disambiguate and interpret the visual signal. One problem with the latter approach is that the large amount of ambiguity and noise present in images can cause an early extraction of symbolic features to fail, and such failures are difficult to correct. The use of sophisticated models in vision introduces more bias into the system, whereas signal-based approaches lead to more variance.

Methods

Results

Conclusion

Journal: PLoS ONE
Publication Date: Jun 9, 2010
Citations: 56
License type: CC BY 4.0
