Abstract

3D object detection and recognition is increasingly used for manipulation and navigation tasks in service robots. It involves segmenting the objects present in a scene, estimating a feature descriptor for the object view and, finally, recognizing the object view by comparing it to the known object categories. This paper presents an efficient approach capable of learning and recognizing object categories in an interactive and open-ended manner. In this paper, “open-ended” implies that the set of object categories to be learned is not known in advance. The training instances are extracted from on-line experiences of a robot, and thus become gradually available over time, rather than at the beginning of the learning process. This paper focuses on two state-of-the-art questions: (1) How to automatically detect, conceptualize and recognize objects in 3D scenes in an open-ended manner? (2) How to acquire and use high-level knowledge obtained from the interaction with human users, namely when they provide category labels, in order to improve the system performance? This approach starts with a pre-processing step to remove irrelevant data and prepare a suitable point cloud for the subsequent processing. Clustering is then applied to detect object candidates, and object views are described based on a 3D shape descriptor called spin-image. Finally, a nearest-neighbor classification rule is used to predict the categories of the detected objects. A leave-one-out cross validation algorithm is used to compute precision and recall, in a classical off-line evaluation setting, for different system parameters. Also, an on-line evaluation protocol is used to assess the performance of the system in an open-ended setting. Results show that the proposed system is able to interact with human users, learning new object categories continuously over time.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call