Abstract
We present MOPED, a framework for Multiple Object Pose Estimation and Detection that seamlessly integrates single-image and multi-image object recognition and pose estimation in one optimized, robust, and scalable framework. We address two main challenges in computer vision for robotics: robust performance in complex scenes, and low latency for real-time operation. We achieve robust performance with Iterative Clustering Estimation (ICE), a novel algorithm that iteratively combines feature clustering with robust pose estimation. Feature clustering quickly partitions the scene and produces object hypotheses. The hypotheses are used to further refine the feature clusters, and the two steps iterate until convergence. ICE is easy to parallelize, and easily integrates single- and multi-camera object recognition and pose estimation. We also introduce a novel object hypothesis scoring function based on M-estimator theory, and a novel pose clustering algorithm that robustly handles recognition outliers. We achieve scalability and low latency with an improved feature matching algorithm for large databases, a GPU/CPU hybrid architecture that exploits parallelism at all levels, and an optimized resource scheduler. We provide extensive experimental results demonstrating state-of-the-art performance in terms of recognition, scalability, and latency in real-world robotic applications.
Highlights
The task of estimating the pose of a rigid object model from a single image is a well-studied problem in the literature
We have presented and validated MOPED, an optimized framework for the recognition and registration of objects that addresses the problems of high scene complexity, scalability and latency that hamper object recognition systems when working in real-world scenes
The multiple architectural improvements in MOPED provide over 30x improvement in latency and throughput, allowing MOPED to perform in real-time robotic applications
Summary
The task of estimating the pose of a rigid object model from a single image is a well-studied problem in the literature. Related is the issue of repeated objects: the matching ambiguity introduced by repeated instances of an object presents an enormous challenge for robust estimators, as the matched features might belong to different object instances despite being correct. Solutions such as grouping (Lowe, 1987), interpretation trees (Grimson, 1991), or image space clustering (Collet et al., 2009) are often used, but false positives often arise when algorithms cannot handle unexpected scene complexity. A novel scheduling scheme enables the efficient use of symmetric multiprocessing (SMP) architectures, utilizing all available cores on modern multi-core CPUs. Our contributions are validated through extensive experimental results demonstrating state-of-the-art performance in terms of recognition, pose estimation accuracy, scalability, throughput, and latency. Additional information, videos, and the full source code of MOPED are available online at http://personalrobotics.intel-research.net/projects/moped
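To make the repeated-object ambiguity concrete, the following is a minimal toy sketch of alternating between clustering and robust estimation, loosely in the spirit of the iteration the abstract describes for ICE. It is not MOPED's implementation. It assumes a 2D setting where a "pose" is just a translation, uses a per-axis median as a stand-in for robust pose estimation, and uses a greedy image-space grouping as a stand-in for feature clustering. All function names, parameters, and thresholds here are hypothetical and chosen for illustration only.

```python
from statistics import median

# A match pairs a model point with an image point: ((mx, my), (ix, iy)).
# ASSUMPTION: a "pose" is a 2D translation (dx, dy), not a full 6-DOF pose.

def estimate_translation(matches):
    """Stand-in for robust pose estimation: per-axis median of the offsets."""
    dx = median(ix - mx for (mx, my), (ix, iy) in matches)
    dy = median(iy - my for (mx, my), (ix, iy) in matches)
    return (dx, dy)

def residual(match, pose):
    """Distance between the translated model point and the observed image point."""
    (mx, my), (ix, iy) = match
    return ((mx + pose[0] - ix) ** 2 + (my + pose[1] - iy) ** 2) ** 0.5

def initial_clusters(matches, radius):
    """Greedy image-space grouping: join the first cluster whose seed is nearby."""
    clusters = []
    for m in matches:
        for c in clusters:
            sx, sy = c[0][1]  # image point of the cluster's seed match
            if abs(m[1][0] - sx) <= radius and abs(m[1][1] - sy) <= radius:
                c.append(m)
                break
        else:
            clusters.append([m])
    return clusters

def ice_sketch(matches, radius=50.0, inlier_tol=3.0, min_support=3, max_iters=10):
    """Alternate hypothesis estimation and re-clustering until poses stop changing."""
    clusters = initial_clusters(matches, radius)
    poses = []
    for _ in range(max_iters):
        # Hypothesis step: one robust estimate per sufficiently supported cluster.
        candidates = [estimate_translation(c) for c in clusters if len(c) >= min_support]
        # Merge hypotheses that agree to within the inlier tolerance.
        merged = []
        for p in candidates:
            if not any(abs(p[0] - q[0]) <= inlier_tol and abs(p[1] - q[1]) <= inlier_tol
                       for q in merged):
                merged.append(p)
        if merged == poses:
            break  # hypotheses are stable, so the clusters are too
        poses = merged
        # Re-clustering step: assign each match to the hypothesis that explains it
        # best, and drop matches that no hypothesis explains.
        clusters = [[] for _ in poses]
        for m in matches:
            best = min(range(len(poses)), key=lambda i: residual(m, poses[i]), default=None)
            if best is not None and residual(m, poses[best]) <= inlier_tol:
                clusters[best].append(m)
    return poses

def demo_matches():
    """Two instances of the same 4-point model, plus one spurious match."""
    model = [(0, 0), (10, 0), (0, 10), (10, 10)]
    instance_a = [(p, (p[0] + 100, p[1] + 100)) for p in model]
    instance_b = [(p, (p[0] + 300, p[1] + 120)) for p in model]
    outlier = [((0, 0), (120, 105))]  # lands near instance A but fits neither pose
    return instance_a + outlier + instance_b

if __name__ == "__main__":
    print(ice_sketch(demo_matches()))
```

In this toy scene, a single global fit over all matches would blend the two instances, whereas grouping first lets each instance receive its own hypothesis, and the re-assignment step discards the spurious match. The real system works with 3D models and full camera poses, which this sketch does not attempt to reproduce.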