Multiview 3D reconstruction and human point cloud classification

Sarah Ershadi Nasab, Ali Ossia, Majid Mobini, Shohreh Kasaei, Esmaeil Sanaei

doi:10.1109/iraniancee.2014.6999703

Abstract

An efficient method for classifying a human point cloud into semantic parts is presented. From multiview frames, the 3D point cloud is extracted using 3D reconstruction and structure-from-motion methods. Bundle adjustment is used to obtain the camera positions and the 3D point cloud by minimizing the reprojection error. To semantically classify this point cloud into human limbs, a conditional random field (CRF) with the mean field approximation is used. To reduce the computational complexity of the message passing stage (caused by the huge number of nodes in the 3D point cloud), an over-segmentation method, voxel cloud connectivity segmentation (VCCS), is used to voxelize the 3D point cloud into over-segmented parts. The fully connected CRF graph is then built on these voxels instead of on individual points. The pairwise potentials of this CRF are combinations of Gaussian kernels over normals, positions, and colors. The kernels are appearance, shape, smoothness, and geodesic distance. The appearance kernel is inspired by the observation that nearby pixels with similar color are likely to be in the same class. The smoothness kernel removes small isolated regions. The shape kernel is a Gaussian kernel of normal differences. The geodesic kernel is based on the shortest path between meshes, computed with Dijkstra's algorithm. The inference function is a weighted combination of Gaussians. The unary potentials are the prior probabilities of each limb label. 6D pose-invariant features such as FPFH, which provide discriminative features over the whole body, are used for the unary potentials in the CRF model. The experimental results show the efficiency of the proposed method.
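The mean field step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a Potts label compatibility, uses only two of the kernels (position and normal), and builds dense N x N affinity matrices, which is only practical because the nodes are supervoxels rather than raw points. The function names, bandwidths, weights, and toy data are all hypothetical.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax, shifted for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=1, keepdims=True)

def gaussian_kernel(feats, bandwidth):
    """Dense Gaussian affinity between all pairs of feature rows."""
    diff = feats[:, None, :] - feats[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    k = np.exp(-sq_dist / (2.0 * bandwidth ** 2))
    np.fill_diagonal(k, 0.0)  # a node sends no message to itself
    return k

def mean_field(prior, kernels, weights, n_iters=10):
    """Approximate marginals of a fully connected CRF (Potts model assumed).

    prior:   (N, L) prior label probabilities per supervoxel (unary term)
    kernels: list of (N, N) pairwise affinities
    weights: one scalar weight per kernel
    """
    log_prior = np.log(prior)
    # The pairwise term is a weighted combination of Gaussian kernels.
    combined = sum(w * k for w, k in zip(weights, kernels))
    q = prior.copy()
    for _ in range(n_iters):
        # Message passing: support each label receives from all other nodes.
        msg = combined @ q
        # With a Potts penalty, disagreeing labels are penalized, which is
        # equivalent (up to a per-node constant) to rewarding agreement.
        q = softmax(log_prior + msg)
    return q

# Toy data (hypothetical): two groups of three supervoxels each.
positions = np.array([[0.0, 0.0, 0.0], [0.1, 0.0, 0.0], [0.0, 0.1, 0.0],
                      [5.0, 0.0, 0.0], [5.1, 0.0, 0.0], [5.0, 0.1, 0.0]])
normals = np.array([[0.0, 0.0, 1.0]] * 3 + [[1.0, 0.0, 0.0]] * 3)
# The third supervoxel has a misleading prior that favors the wrong label.
prior = np.array([[0.8, 0.2], [0.8, 0.2], [0.4, 0.6],
                  [0.2, 0.8], [0.2, 0.8], [0.2, 0.8]])

kernels = [gaussian_kernel(positions, 1.0), gaussian_kernel(normals, 0.5)]
q = mean_field(prior, kernels, weights=[1.0, 1.0])
labels = q.argmax(axis=1)
print(labels)
```

In this toy setup the supervoxel with the misleading prior is pulled toward the label of its close, similarly oriented neighbors. The color and geodesic kernels mentioned in the abstract would enter the same way, as additional matrices in the `kernels` list with their own weights.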

Full Text