Abstract

A point cloud is a set of points defined in a 3D metric space. Point clouds have become one of the most significant data formats for 3D representation, and they are growing in popularity thanks to the increasing availability of acquisition devices and their widening application in areas such as robotics, autonomous driving, and augmented and virtual reality. Deep learning is now the most powerful tool for data processing in computer vision and is becoming the preferred technique for tasks such as classification, segmentation, and detection. Deep learning techniques, however, are mainly applied to data with a structured grid, whereas point clouds are unstructured, which makes their direct processing with deep learning very challenging. This paper reviews recent state-of-the-art deep learning techniques, focusing mainly on those that operate on raw point cloud data. The initial work on deep learning directly with raw point cloud data did not model local regions; subsequent approaches therefore model local regions through sampling and grouping. More recently, several approaches have been proposed that not only model the local regions but also exploit the correlation between points within them. From the survey, we conclude that approaches that model local regions and take into account the correlation between points in those regions perform better. Unlike existing reviews, this paper provides a general structure for learning with raw point clouds and compares the various methods against that structure. This work also introduces the popular 3D point cloud benchmark datasets and discusses the application of deep learning in popular 3D vision tasks, including classification, segmentation, and detection.
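The ordering problem the abstract describes can be illustrated with a minimal sketch of the PointNet idea: apply a shared per-point transformation, then aggregate with a symmetric function (here max pooling), so the global feature is unaffected by the order of the points. The function name, feature width, and weights below are illustrative, not from the paper.

```python
import numpy as np

def pointnet_style_features(points, weight, bias):
    """Shared per-point MLP followed by symmetric max pooling.

    points: (N, 3) array of 3D coordinates
    weight: (3, F) shared weights; bias: (F,)
    The max over the point axis is order-invariant, so shuffling
    the input rows leaves the global feature unchanged.
    """
    per_point = np.maximum(points @ weight + bias, 0.0)  # shared ReLU layer
    return per_point.max(axis=0)                         # symmetric aggregation

rng = np.random.default_rng(0)
cloud = rng.normal(size=(128, 3))          # a toy point cloud
W, b = rng.normal(size=(3, 16)), np.zeros(16)

feat = pointnet_style_features(cloud, W, b)
shuffled = pointnet_style_features(rng.permutation(cloud), W, b)
assert np.allclose(feat, shuffled)         # permutation invariance holds
```

Approaches that model local regions extend this scheme by applying the same symmetric aggregation within sampled neighbourhoods rather than over the whole cloud.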

Highlights

  • We live in a three-dimensional world; since the invention of the camera, visual information of the 3D world has been projected onto 2D images

  • The performance of the methods is reviewed on popular benchmark datasets: the ModelNet40 dataset [48] for classification; ShapeNet [87] and the Stanford 3D Indoor Semantics Dataset (S3DIS) [101] for part and semantic segmentation, respectively; the ScanNet [64] benchmark for 3D semantic instance segmentation; and the KITTI dataset [111,112] for object detection

  • The increasing availability of point clouds, driven by the evolution of scanning devices and coupled with their growing application in autonomous vehicles, robotics, augmented reality (AR), virtual reality (VR), etc., demands fast and efficient algorithms for processing them to achieve improved visual perception, such as recognition, segmentation, and detection

Introduction

We live in a three-dimensional world; since the invention of the camera, visual information of the 3D world has been projected onto 2D images. 3D point cloud data have become popular as a result of the increasing availability of sensing devices, especially light detection and ranging (LiDAR)-based devices such as the Tele-15 [8], Leica BLK360 [9], and Kinect V2 [10], and, more recently, mobile phones with a time-of-flight (ToF) depth camera. These sensing devices allow the easy capture of the 3D world as point clouds.
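As a brief illustration of how the depth cameras mentioned above yield point clouds, the standard pinhole back-projection converts each depth pixel into a 3D point. The intrinsics (`fx`, `fy`, `cx`, `cy`) and the toy depth image below are illustrative values, not taken from any specific device.

```python
import numpy as np

def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project a depth image (H, W), in metres, into an (H*W, 3)
    point cloud using the pinhole camera model:
        x = (u - cx) * z / fx,  y = (v - cy) * z / fy
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=-1).reshape(-1, 3)

# Toy example: a flat wall 2 m in front of the camera
depth = np.full((4, 4), 2.0)
pts = depth_to_point_cloud(depth, fx=500.0, fy=500.0, cx=2.0, cy=2.0)
```

The resulting `pts` array, one XYZ row per pixel, is exactly the unstructured N x 3 representation that the methods surveyed here consume directly.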

Methodology
Challenges of Deep Learning with Point Clouds
Structured Grid-Based Learning
Voxel-Based Approach
Multi-View-Based Approach
Higher-Dimensional Lattices
Deep Learning Directly with a Raw Point Cloud
PointNet
Approaches with Local Structure Computation
Approaches That Do Not Explore Local Correlation
Approaches That Explore Local Correlation
Graph-Based Approaches
Summary
Method
Benchmark Datasets
ModelNet
ShapeNet
Augmenting ShapeNet
Shape2Motion
ScanObjectNN
NYUDv2
SceneNN
ScanNet
Matterport3D
Multisensor Indoor Mapping and Positioning Dataset
ASL Dataset
Oxford Robotcar
Semantic3D
Apollo
WHU-TLS
Application of Deep Learning in 3D Vision Tasks
Classification
Segmentation
Object Detection
Findings
Summary and Conclusions