Abstract

Estimating 3D human meshes is appealing for a variety of application scenarios. Current mainstream solutions predict the meshes either from images or from human-reflected RF signals. In this paper, instead of investigating which approach is better, we propose a multi-modality fusion framework, namely MI-Mesh, which estimates 3D meshes by fusing image and mmWave data. To realize this, we design a deep neural network model. It first automatically correlates mmWave point clouds with specific human joints and extracts useful fused features from the two modalities. The features are then refined by predicting 2D joints and a silhouette. Finally, we regress pose and shape parameters and feed them to the SMPL model to generate 3D human meshes. We build a prototype using a commercial mmWave radar and a camera. The experimental results demonstrate that, by integrating the strengths of multiple modalities, MI-Mesh can effectively recover human meshes under dynamic motions and across different conditions.
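
To make the three-stage pipeline concrete, the following is a minimal PyTorch sketch under stated assumptions: the class name MIMeshSketch, all layer sizes, the five-channel radar-point features, and the use of cross-attention to correlate points with joints are illustrative stand-ins, not the paper's actual network.

```python
import torch
import torch.nn as nn

class MIMeshSketch(nn.Module):
    """Hypothetical sketch of the fusion pipeline described in the abstract.

    Stage 1: cross-attention correlates mmWave points with joint queries.
    Stage 2: fused features are refined by auxiliary 2D-joint/silhouette heads.
    Stage 3: a regressor predicts SMPL pose (72-D) and shape (10-D) parameters,
    which a separate SMPL layer (not shown) would turn into a mesh.
    """

    def __init__(self, num_joints=24, img_dim=256, pc_dim=64, feat_dim=256):
        super().__init__()
        # Per-point encoder for radar returns (assumed x, y, z, Doppler, intensity).
        self.point_mlp = nn.Sequential(nn.Linear(5, pc_dim), nn.ReLU(),
                                       nn.Linear(pc_dim, feat_dim))
        # Learnable joint queries attend over point features (stage 1).
        self.joint_queries = nn.Parameter(torch.randn(num_joints, feat_dim))
        self.attn = nn.MultiheadAttention(feat_dim, num_heads=4, batch_first=True)
        # Image branch stand-in: assume features come from any CNN backbone.
        self.img_proj = nn.Linear(img_dim, feat_dim)
        # Auxiliary heads that refine the fused features (stage 2).
        self.joint2d_head = nn.Linear(feat_dim, num_joints * 2)
        self.silhouette_head = nn.Linear(feat_dim, 64 * 64)
        # SMPL parameter regressor (stage 3): 72 pose + 10 shape values.
        self.smpl_head = nn.Linear(feat_dim, 72 + 10)

    def forward(self, points, img_feat):
        # points: (B, N, 5) radar returns; img_feat: (B, img_dim) image features.
        pt = self.point_mlp(points)                                # (B, N, F)
        q = self.joint_queries.unsqueeze(0).expand(pt.size(0), -1, -1)
        joint_feat, _ = self.attn(q, pt, pt)                       # points -> joints
        fused = joint_feat.mean(dim=1) + self.img_proj(img_feat)   # (B, F)
        joints2d = self.joint2d_head(fused)      # auxiliary 2D-joint prediction
        sil = self.silhouette_head(fused)        # auxiliary silhouette prediction
        pose_shape = self.smpl_head(fused)       # parameters for the SMPL model
        return pose_shape[:, :72], pose_shape[:, 72:], joints2d, sil


# Usage: a batch of 2 frames, 128 radar points each, 256-D image features.
model = MIMeshSketch()
pose, shape, j2d, sil = model(torch.randn(2, 128, 5), torch.randn(2, 256))
print(pose.shape, shape.shape)  # torch.Size([2, 72]) torch.Size([2, 10])
```

The auxiliary 2D-joint and silhouette outputs would be supervised during training so that the fused features stay grounded in the image evidence, while the pose and shape vectors are what the abstract feeds to the SMPL model to produce the mesh.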
