Self-Supervised Point Cloud Representation Learning via Separating Mixed Shapes

Chao Sun, Xiaohan Wang, Zhedong Zheng, Mingliang Xu, Yi Yang

doi:10.1109/tmm.2022.3206664

Abstract

Manual annotation of large-scale point clouds is time-consuming and often unavailable in harsh real-world scenarios. Inspired by the success of the pre-training and fine-tuning paradigm in both vision and language tasks, we argue that pre-training is also a potential solution for obtaining a scalable model for 3D point cloud downstream tasks. In this paper, we therefore explore a new self-supervised learning method, called Mixing and Disentangling (MD), for 3D point cloud representation learning. As the name implies, we mix two input shapes and require the model to learn to separate the inputs from the mixed shape. We use this reconstruction task as the pretext optimization objective for self-supervised learning. There are two primary advantages: 1) Compared to prevailing image datasets, e.g., ImageNet, point cloud datasets are de facto small, and the mixing process can provide a much larger online training sample pool. 2) The disentangling process motivates the model to mine geometric prior knowledge, e.g., key points. To verify the effectiveness of the proposed pretext task, we build a baseline network composed of one encoder and one decoder. During pre-training, we mix two original shapes and obtain a geometry-aware embedding from the encoder; an instance-adaptive decoder is then applied to recover the original shapes from the embedding. Albeit simple, the pre-trained encoder can capture the key points of an unseen point cloud and surpasses the encoder trained from scratch on downstream tasks. The proposed method improves empirical performance on both the ModelNet-40 and ShapeNet-Part datasets for point cloud classification and segmentation. We further conduct ablation studies to explore the effect of each component and verify the generalization of the proposed strategy with different backbones.
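
The mixing step and the reconstruction objective described in the abstract can be illustrated with a small sketch. The abstract does not specify how the two shapes are mixed or which reconstruction loss is used, so everything here is an assumption made for illustration: `mix_shapes` draws half of the points from each shape, and `chamfer_distance` (a common point-set distance) stands in for whatever loss compares a recovered shape with its original. The encoder and decoder are omitted, and all function names are hypothetical.

```python
import random


def mix_shapes(a, b, rng):
    """Mix two point clouds of equal size (assumed scheme, not from the paper):
    take half of the points from each shape and shuffle them together."""
    n = len(a)
    half = n // 2
    mixed = rng.sample(a, half) + rng.sample(b, n - half)
    rng.shuffle(mixed)
    return mixed


def sq_dist(p, q):
    """Squared Euclidean distance between two 3D points."""
    return sum((x - y) ** 2 for x, y in zip(p, q))


def chamfer_distance(p, q):
    """Symmetric Chamfer distance between two point sets (assumed loss).
    Each point is matched to its nearest neighbour in the other set."""
    p_to_q = sum(min(sq_dist(x, y) for y in q) for x in p) / len(p)
    q_to_p = sum(min(sq_dist(y, x) for x in p) for y in q) / len(q)
    return p_to_q + q_to_p


def random_shape(n, offset, rng):
    """Toy stand-in for a dataset shape: n Gaussian points around `offset`."""
    return [tuple(rng.gauss(offset, 1.0) for _ in range(3)) for _ in range(n)]


rng = random.Random(0)
shape_a = random_shape(64, 0.0, rng)
shape_b = random_shape(64, 5.0, rng)

# Input to the (omitted) encoder: one mixed cloud with as many points as each original.
mixed = mix_shapes(shape_a, shape_b, rng)

# A hypothetical decoder would output two clouds; the pretext loss would then be
# chamfer_distance(recovered_a, shape_a) + chamfer_distance(recovered_b, shape_b).
```

Because a new mixed sample can be drawn for every pair of shapes at every training step, a sketch like this also shows why mixing enlarges the effective training pool relative to the raw dataset size.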

Journal: IEEE Transactions on Multimedia	Publication Date: Jan 1, 2023
Citations: 16

