Abstract

Object detection has become an essential capability in robotic applications, especially in industrial settings where robots and vehicles interact closely with humans and objects and a high level of safety for workers and machines is therefore required. This paper proposes an object detection framework suitable for automated vehicles in the factory of the future. It uses only point cloud information captured by LiDAR sensors. The system divides the point cloud into voxels and learns features from the computed local patches. The aggregated feature samples are then used to iteratively train a classifier to recognize object classes. The framework is evaluated on a new synthetic 3D LiDAR dataset of objects that simulates large indoor point cloud scans of a factory model, and it is compared with other methods on the SUN RGB-D benchmark dataset. The evaluations show that the framework achieves promising object recognition and detection results, which we report as a baseline.
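The first stage of the pipeline described above divides the point cloud into voxels before features are learned from local patches. As a minimal illustration of that voxelization step (not the paper's actual implementation; the function name and voxel size are assumptions for this sketch):

```python
import numpy as np

def voxelize(points, voxel_size=0.1):
    """Group 3D points into a grid of voxels.

    Illustrative sketch only: returns a dict mapping each occupied
    voxel's integer grid index to the array of points it contains.
    """
    indices = np.floor(points / voxel_size).astype(int)
    voxels = {}
    for idx, p in zip(map(tuple, indices), points):
        voxels.setdefault(idx, []).append(p)
    return {k: np.array(v) for k, v in voxels.items()}

# Example: two nearby points share a voxel, a distant point gets its own.
pts = np.array([[0.05, 0.05, 0.05],
                [0.06, 0.04, 0.02],
                [0.95, 0.95, 0.95]])
grid = voxelize(pts)
```

Local feature extraction would then operate on each voxel's point set (`grid[(0, 0, 0)]` here) rather than on the full cloud at once.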

Highlights

  • Interpreting point cloud data is an essential step in developing the perceptual component of most recent robotic applications

  • Such methods either extend 2D RGB image detectors to detect objects in 3D, or generate 2D images from point clouds in order to feed them to the detection network

  • VoteNet (Qi et al., 2019) is the current state-of-the-art method that uses Hough voting for 3D object detection using only point cloud information


Summary

INTRODUCTION

Interpreting point cloud data is an essential step in developing the perceptual component of most recent robotic applications. A straightforward approach to object detection is to represent the whole point cloud with such architectures (similar to 2D detectors) and produce object proposals directly from the learned features. These approaches work in the two-dimensional case because the object center is a visible pixel in the image. Methods such as VoteNet (Qi et al., 2019) instead propose a Hough voting mechanism that generates points to estimate object centers, which are later aggregated to produce object proposals. This method is efficient and well adapted to indoor scenarios, where there are many occlusions and methods based on the bird's-eye view fail. The double-blind peer review was conducted on the basis of the full paper.
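The vote-aggregation idea sketched above can be illustrated with a toy example: each seed point casts a vote for an object center, and nearby votes are grouped and averaged into one proposal. This is a simplified greedy clustering, not VoteNet's learned aggregation module; the function name and radius are assumptions for this sketch.

```python
import numpy as np

def aggregate_votes(votes, radius=0.5):
    """Greedily cluster center votes into object-center proposals.

    Toy illustration: votes within `radius` of a seed vote are grouped
    and averaged into a single proposed center. VoteNet learns this
    aggregation; here it is approximated with a fixed-radius grouping.
    """
    votes = [np.asarray(v, dtype=float) for v in votes]
    centers = []
    while votes:
        seed = votes[0]
        group = [v for v in votes if np.linalg.norm(v - seed) <= radius]
        votes = [v for v in votes if np.linalg.norm(v - seed) > radius]
        centers.append(np.mean(group, axis=0))
    return centers

# Two clusters of votes yield two center proposals.
proposals = aggregate_votes([[0.0, 0, 0], [0.1, 0, 0],
                             [5.0, 5, 5], [5.1, 5, 5]])
```

Because votes target the (possibly occluded) object center rather than visible surface points, this scheme avoids the 2D assumption that the center is an observable pixel.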

RELATED WORK
Learning Point Cloud Representations for Detection and Segmentation
PROPOSED FRAMEWORK
Pre-processing
Feature Extraction
Segmentation into Local Patches
Feature Aggregation
Classification
Post-processing
Synthetic Dataset
SUN RGB-D
EXPERIMENTS
Method
Findings
CONCLUSION

