The MOPED framework: Object recognition and pose estimation for manipulation

Alvaro Collet, Siddhartha S Srinivasa, Manuel Martinez

doi:10.1177/0278364911401765

Open Access

Abstract

We present MOPED, a framework for Multiple Object Pose Estimation and Detection that seamlessly integrates single-image and multi-image object recognition and pose estimation in one optimized, robust, and scalable framework. We address two main challenges in computer vision for robotics: robust performance in complex scenes, and low latency for real-time operation. We achieve robust performance with Iterative Clustering Estimation (ICE), a novel algorithm that iteratively combines feature clustering with robust pose estimation. Feature clustering quickly partitions the scene and produces object hypotheses. The hypotheses are used to further refine the feature clusters, and the two steps iterate until convergence. ICE is easy to parallelize, and easily integrates single- and multi-camera object recognition and pose estimation. We also introduce a novel object hypothesis scoring function based on M-estimator theory, and a novel pose clustering algorithm that robustly handles recognition outliers. We achieve scalability and low latency with an improved feature matching algorithm for large databases, a GPU/CPU hybrid architecture that exploits parallelism at all levels, and an optimized resource scheduler. We provide extensive experimental results demonstrating state-of-the-art performance in terms of recognition, scalability, and latency in real-world robotic applications.
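As a rough illustration of the cluster-then-estimate loop the abstract describes for ICE, the following Python sketch alternates between grouping features and estimating a hypothesis per group until the grouping stops changing. It is a toy under stated assumptions, not MOPED's implementation: the "pose" is reduced to a 2D translation estimated with a median, the image-space clustering is a naive greedy radius grouping, and every name and parameter here (`estimate_translation`, `cluster_by_proximity`, `merge_consistent`, `ice`, `radius`, `tol`, `max_iters`) is hypothetical.

```python
import numpy as np

def estimate_translation(model_pts, image_pts):
    """Toy robust 'pose': median offset between matched model and image points."""
    return np.median(image_pts - model_pts, axis=0)

def cluster_by_proximity(image_pts, radius):
    """Naive greedy image-space clustering; returns a list of index arrays."""
    unassigned = list(range(len(image_pts)))
    clusters = []
    while unassigned:
        seed = unassigned[0]
        dists = np.linalg.norm(image_pts[unassigned] - image_pts[seed], axis=1)
        members = {i for i, d in zip(unassigned, dists) if d <= radius}
        clusters.append(np.array(sorted(members)))
        unassigned = [i for i in unassigned if i not in members]
    return clusters

def merge_consistent(clusters, poses, tol):
    """Merge clusters whose hypotheses agree with a seed cluster within tol."""
    merged, used = [], set()
    for i in range(len(clusters)):
        if i in used:
            continue
        group = [clusters[i]]
        for j in range(i + 1, len(clusters)):
            if j not in used and np.linalg.norm(poses[i] - poses[j]) <= tol:
                group.append(clusters[j])
                used.add(j)
        merged.append(np.concatenate(group))
    return merged

def ice(model_pts, image_pts, radius=1.5, tol=0.5, max_iters=10):
    """Alternate hypothesis estimation and cluster refinement until stable.

    model_pts[k] is the model point matched to image feature image_pts[k].
    """
    clusters = cluster_by_proximity(image_pts, radius)
    for _ in range(max_iters):
        poses = [estimate_translation(model_pts[c], image_pts[c]) for c in clusters]
        merged = merge_consistent(clusters, poses, tol)
        if len(merged) == len(clusters):  # no cluster changed: converged
            break
        clusters = merged
    poses = [estimate_translation(model_pts[c], image_pts[c]) for c in clusters]
    return clusters, poses

# Usage: two instances of the same six-point model, at different translations.
model = np.array([[float(x), 0.0] for x in range(6)])
matched_model = np.vstack([model, model])
features = np.vstack([model + [10.0, 0.0], model + [0.0, 20.0]])
clusters, poses = ice(matched_model, features)
```

In this toy scene the initial clustering splits each instance into several small groups; because the groups belonging to one instance yield the same hypothesis, the merge step joins them, which mirrors how hypotheses can be used to refine the feature clusters. A real system would estimate a full 6-DOF pose and would need a far more careful clustering and scoring step.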

Highlights

The task of estimating the pose of a rigid object model from a single image is a well-studied problem in the literature.
We have presented and validated MOPED, an optimized framework for the recognition and registration of objects that addresses the problems of high scene complexity, scalability, and latency that hamper object recognition systems when working in real-world scenes.
The multiple architectural improvements in MOPED provide over 30x improvement in latency and throughput, allowing MOPED to perform in real-time robotic applications.

Summary

Introduction

The task of estimating the pose of a rigid object model from a single image is a well-studied problem in the literature. A related issue is that of repeated objects: the matching ambiguity introduced by repeated instances of an object presents an enormous challenge for robust estimators, as the matched features might belong to different object instances despite being correct. Solutions such as grouping (Lowe, 1987), interpretation trees (Grimson, 1991), or image space clustering (Collet et al., 2009) are often used, but false positives often arise when algorithms cannot handle unexpected scene complexity. A novel scheduling scheme enables the efficient use of symmetric multiprocessing (SMP) architectures, utilizing all available cores on modern multi-core CPUs. Our contributions are validated through extensive experimental results demonstrating state-of-the-art performance in terms of recognition, pose estimation accuracy, scalability, throughput, and latency. Additional information, videos, and the full source code of MOPED are available online at http://personalrobotics.intel-research.net/projects/moped

Input: object models

Iterative Clustering-Estimation

ICE as Expectation-Maximization

The MOPED Framework

Image Space Clustering

Hypothesis Quality Score

Cluster Clustering

Mean Shift clustering on pose space

Projection clustering

Performance Comparison

Benchmarks

Addressing Scalability and Latency

Baseline system

The Rotation Benchmark

The Zoom Benchmark

The Simple Movie Benchmark

The Complex Movie Benchmark

Feature Matching

Brute Force on GPU

Performance comparison

Architecture Optimizations

GPU and Embarrassingly Parallel Problems

Intra-core optimizations

Symmetric Multiprocessing

Multi-Frame Scheduling

Performance evaluation

Recognition and Accuracy

Generalized Camera

Pose averaging

Pose estimation accuracy

Robustness against modeling noise

Conclusion

Reprojection error

Findings

Backprojection Error

Journal: The International Journal of Robotics Research
Publication Date: Apr 1, 2011
Citations: 470
License type: cc-by
