Abstract

This paper proposes a three-branch convolutional neural network (CNN) for 3D shape recognition based on the three-view drawing principle and depth panoramas. The three-view drawing principle provides three key views of a 3D shape, and a depth panorama contains the complete 2.5D information of each view. The proposed system, 3V-DepthPano CNN, feeds the depth panoramas generated from the three key views into a three-branch CNN that aggregates them into a compact 3D shape descriptor used for 3D shape classification. Furthermore, we fine-tune 3V-DepthPano CNN and extract shape features from it to support 3D shape retrieval. The method achieves a good trade-off between accuracy and training time: experiments show that 3V-DepthPano CNN with only 3 views reaches accuracy comparable to MVCNN with 12 or 80 views while taking far less time to generate depth panoramas and to train the network, and it outperforms other existing advanced methods in both classification and shape retrieval.
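As a rough sketch of this pipeline, the code below builds three weight-sharing CNN branches, pools their per-view features element-wise into a single compact descriptor, and attaches a classification head. The layer sizes, the weight sharing, and the max pooling over views are illustrative assumptions inspired by MVCNN-style aggregation rather than the authors' exact configuration.

```python
# Illustrative three-branch CNN that aggregates three depth panoramas into a
# compact shape descriptor. Layer sizes, weight sharing across branches, and
# max pooling over views are assumptions, not the paper's exact configuration.
import torch
import torch.nn as nn


class ThreeViewDepthPanoNet(nn.Module):
    def __init__(self, num_classes: int, descriptor_dim: int = 512):
        super().__init__()
        # One convolutional trunk applied to each of the three depth panoramas
        # (the branches share weights here, as in MVCNN-style architectures).
        self.branch = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=5, stride=2, padding=2), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.embed = nn.Linear(64, descriptor_dim)
        self.classifier = nn.Linear(descriptor_dim, num_classes)

    def forward(self, panoramas: torch.Tensor):
        # panoramas: (batch, 3 views, 1 channel, height, width)
        per_view = [self.branch(panoramas[:, v]).flatten(1) for v in range(3)]
        # Aggregate the three per-view feature vectors into one descriptor.
        pooled = torch.stack(per_view, dim=1).max(dim=1).values
        descriptor = self.embed(pooled)
        return self.classifier(descriptor), descriptor
```

The second return value is the compact descriptor; after fine-tuning, features of this kind are what the paper reuses for shape retrieval.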

Highlights

  • As a typical deep learning method, DeepPano [12] shows a clear improvement over traditional 2D view-based methods

  • For a certain class of 3D objects, the 3D shape can be uniquely reconstructed from its three views, and the process of obtaining depth panoramas for the three views is fast. The proposed 3V-DepthPano convolutional neural network (CNN) method adopts this principle and selects the three key views to provide maximal projective representation with the minimal number of 2D views

  • A three-branch convolutional neural network, similar to MVCNN [13], is designed and trained to generate a high-precision descriptor that can be used for shape classification and 3D shape retrieval (a retrieval sketch follows these highlights)
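As a concrete illustration of descriptor-based retrieval, the sketch below ranks a gallery of shapes by cosine similarity to a query descriptor. The similarity measure and the function name are assumptions for illustration; the paper's exact distance metric is not specified in this summary.

```python
# Minimal retrieval sketch: rank gallery shapes by cosine similarity of their
# descriptors to the query descriptor. Cosine similarity is an assumed choice.
import numpy as np


def retrieve(query_desc: np.ndarray, gallery_descs: np.ndarray, top_k: int = 10):
    """Return indices of the top_k gallery shapes most similar to the query."""
    q = query_desc / np.linalg.norm(query_desc)
    g = gallery_descs / np.linalg.norm(gallery_descs, axis=1, keepdims=True)
    similarities = g @ q                      # one score per gallery shape
    return np.argsort(-similarities)[:top_k]  # highest similarity first
```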

Summary

View and Panorama Generation

To generate the depth panorama of a 3D shape, we project its depth information onto a cylinder surface whose central axis is parallel to the principal axis of the 3D object. The cylinder surface grid is then unfolded into a 2D panorama carrying the depth information: because the cylinder surface is a ruled surface, it is convenient to unfold. We cut it open along θ = 0° and obtain an (M + 1) × (M + 1) matrix whose elements are the values of Dp at the corresponding grid points.

Before the image features are extracted, the parameters of the pretrained CNN model are fine-tuned on the target image set. The parameter settings for fine-tuning are similar to those of the second step of the pretraining process, except that different learning rates are used for different network layers: a small learning rate is set for the first 7 layers to avoid destroying the parameters obtained from the pretrained CNN model. The learning rate is updated once every 20 epochs and is reduced to 0.1 times its value.
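A minimal sketch of the unfolding step, assuming the shape is given as a point cloud already aligned so that its principal axis is the z-axis: each point is binned by its angle θ around the axis and by its height, and the outermost radial distance in each grid cell is kept as the depth value. This binning rule is an illustrative simplification of Dp, not the paper's exact projection.

```python
# Illustrative unfolding of depth values on a cylinder into a 2D depth panorama.
# Assumes the point cloud is centered with its principal axis along z; the
# per-cell "outermost radius" rule is a simplification of the paper's Dp.
import numpy as np


def depth_panorama(points: np.ndarray, grid: int = 64) -> np.ndarray:
    """points: (N, 3) surface samples; returns a (grid + 1, grid + 1) panorama."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    theta = np.mod(np.arctan2(y, x), 2 * np.pi)  # angle around the axis, in [0, 2*pi)
    radius = np.sqrt(x ** 2 + y ** 2)            # distance from the principal axis
    # Cut the cylinder open along theta = 0 and fill a (grid + 1) x (grid + 1) matrix.
    col = np.clip((theta / (2 * np.pi) * grid).astype(int), 0, grid)
    row = np.clip(((z - z.min()) / (z.max() - z.min() + 1e-9) * grid).astype(int), 0, grid)
    pano = np.zeros((grid + 1, grid + 1))
    np.maximum.at(pano, (row, col), radius)      # keep the outermost depth per grid cell
    return pano
```

The fine-tuning schedule described above (a small learning rate for the first 7 pretrained layers and a reduction of the learning rate to 0.1 times its value every 20 epochs) maps naturally onto per-parameter-group learning rates with a step decay. The concrete rates and the placeholder modules below are assumptions, not the paper's exact settings.

```python
# Sketch of the fine-tuning schedule: a small learning rate for the pretrained
# layers and a 0.1x step decay every 20 epochs. Rates and modules are placeholders.
import torch
import torch.nn as nn

pretrained_layers = nn.Sequential(nn.Conv2d(1, 32, 3), nn.ReLU())  # stands in for the first 7 layers
new_layers = nn.Linear(32, 40)                                     # stands in for the re-initialized layers

optimizer = torch.optim.SGD(
    [
        {"params": pretrained_layers.parameters(), "lr": 1e-4},  # small LR to preserve pretrained weights
        {"params": new_layers.parameters(), "lr": 1e-3},
    ],
    lr=1e-3,       # default learning rate, overridden by the per-group values above
    momentum=0.9,
)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=20, gamma=0.1)
```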
