CLUBS: An RGB-D dataset with cluttered box scenes containing household objects

Tonci Novkovic, Roland Siegwart, Marko Panjek, Margarita Grinvald, Juan Nieto, Fadri Furrer

doi:10.1177/0278364919875221

Abstract

With the progress of machine learning, the demand for realistic data with high-quality annotations has been growing. Learning-based methods need considerable amounts of data to generalize well, especially realistic ground-truth data for tasks such as object detection and scene segmentation. Such data can be difficult, time-consuming, and expensive to collect. This article presents a dataset of household objects and box scenes commonly found in warehouse environments. The dataset was obtained using a robotic setup with four different cameras. It contains reconstructed objects and scenes, as well as raw RGB and depth images, camera poses, pixel-wise labels of objects directly in the RGB images, and 3D bounding boxes with poses in the world frame. Furthermore, raw calibration data are provided, together with the intrinsic and extrinsic parameters for all the sensors. By providing object labels as pixel-wise masks and as 2D and 3D object bounding boxes, this dataset is useful for both object recognition and instance segmentation. The realistic scenes are intended to support learning-based algorithms in scenarios where boxes of objects are common, such as the logistics sector. Both the dataset and the tools for data processing are published and available online.

Full Text