Masked Autoencoder for Self-Supervised Pre-training on Lidar Point Clouds

Georg Hess, David Hagerman, Christoffer Petersson, Elias Svensson, Lennart Svensson, Johan Jaxing

doi:10.1109/wacvw58289.2023.00039

Abstract

Masked autoencoding has become a successful pre-training paradigm for Transformer models for text, images, and, recently, point clouds. Raw automotive datasets are suitable candidates for self-supervised pre-training, as they are generally cheap to collect compared to annotations for tasks like 3D object detection (OD). However, the development of masked autoencoders for point clouds has focused solely on synthetic and indoor data. Consequently, existing methods have tailored their representations and models toward small and dense point clouds with homogeneous point densities. In this work, we study masked autoencoding for point clouds in an automotive setting, where the point clouds are sparse and the point density can vary drastically among objects in the same scene. To this end, we propose Voxel-MAE, a simple masked autoencoding pre-training scheme designed for voxel representations. We pre-train the backbone of a Transformer-based 3D object detector to reconstruct masked voxels and to distinguish between empty and non-empty voxels. Our method improves 3D OD performance by 1.75 mAP points and 1.05 NDS on the challenging nuScenes dataset. Further, we show that by pre-training with Voxel-MAE, we require only 40% of the annotated data to outperform a randomly initialized equivalent. Code is available at https://github.com/georghess/voxel-mae
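
To make the pre-training task concrete, the following is a minimal NumPy sketch of the data preparation the abstract implies: voxelize a point cloud, hide a random subset of the occupied voxels, and build the two kinds of targets (points to reconstruct, and empty versus non-empty labels). It is an illustration under stated assumptions, not the authors' implementation. The function names `voxelize` and `mask_voxels`, the 4 x 4 x 4 grid, the voxel size, and the 0.7 mask ratio are all hypothetical choices made for this example.

```python
import numpy as np


def voxelize(points, voxel_size, grid_min):
    """Assign each point to an integer voxel index.

    Returns the unique occupied voxel indices and, for every point,
    the row of its voxel in that array.
    """
    idx = np.floor((points - grid_min) / voxel_size).astype(np.int64)
    occupied, inverse = np.unique(idx, axis=0, return_inverse=True)
    # reshape(-1) keeps this robust across NumPy versions.
    return occupied, inverse.reshape(-1)


def mask_voxels(num_voxels, mask_ratio, rng):
    """Return a boolean array marking which occupied voxels are hidden."""
    num_masked = int(round(mask_ratio * num_voxels))
    masked = np.zeros(num_voxels, dtype=bool)
    masked[rng.permutation(num_voxels)[:num_masked]] = True
    return masked


# Toy stand-in for a lidar sweep (assumption: random points in a 4 m cube).
rng = np.random.default_rng(0)
points = rng.uniform(0.0, 4.0, size=(200, 3))

occupied, inverse = voxelize(points, voxel_size=1.0, grid_min=np.zeros(3))
masked = mask_voxels(len(occupied), mask_ratio=0.7, rng=rng)

# The encoder would only see the visible (unmasked) voxels.
visible_voxels = occupied[~masked]

# Reconstruction target: the points that fall inside masked voxels.
target_points = points[masked[inverse]]

# Occupancy target: a dense grid labelling each voxel empty or non-empty.
occupancy = np.zeros((4, 4, 4), dtype=bool)
occupancy[tuple(occupied.T)] = True
```

In this sketch a model would receive `visible_voxels` as input and be trained to predict `target_points` for the hidden voxels and `occupancy` for the grid. The losses and network architecture are left out because the abstract does not specify them.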

