Bridging the Robot Perception Gap with Mid-Level Vision

Chi Li, Jonathan Bohren, Gregory D Hager

doi:10.1007/978-3-319-60916-4_1

Abstract

The practical application of machine perception to support physical manipulation in unstructured environments remains a barrier to the development of intelligent robotic systems. Recently, great progress has been made by the large-scale machine perception community, but these methods have made few contributions to applied robotic perception. This is in part because such large-scale systems are designed to recognize category labels of large numbers of objects from a single image, rather than to perform highly accurate, efficient, and robust pose estimation in environments for which a robot has reliable prior knowledge. In this paper, we illustrate the potential for synergistic integration of modern computer vision methods into robotics by augmenting a RANSAC-based registration method with a state-of-the-art semantic segmentation algorithm. We detail a convolutional architecture for semantic labeling of the scene, modified to operate efficiently using integral images. We combine this labeling with two novel scene-parsing variants of RANSAC and show, on a new RGB-D dataset that contains complex configurations of textureless and highly specular objects, that our method improves pose estimation performance over the unaugmented algorithms.

Full Text