Abstract
Estimating 3D human meshes is appealing for various application scenarios. Current mainstream solutions predict the meshes either from images or from human-reflected RF signals. In this paper, instead of investigating which approach is better, we propose a multi-modality fusion framework, namely MI-Mesh, which estimates 3D meshes by fusing images and mmWave signals. To realize this, we design a deep neural network model. It first automatically correlates mmWave point clouds with specific human joints and extracts useful fused features from the two modalities. Then, the features are refined by predicting 2D joints and silhouettes. Finally, we regress pose and shape parameters and feed them into the SMPL model to generate the 3D human meshes. We build a prototype on a commercial mmWave radar and camera. The experimental results demonstrate that, by integrating the strengths of both modalities, MI-Mesh can effectively recover human meshes on dynamic motions and across different conditions.
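The pipeline described above (correlate mmWave points with joints, fuse with image features, regress SMPL pose and shape) can be sketched in miniature. This is a hedged NumPy illustration only: all dimensions, the attention-style correlation, and the linear regressor are hypothetical stand-ins for the paper's trained deep network, and only the SMPL parameter sizes (72 pose, 10 shape) follow the standard SMPL convention.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions, not taken from the paper.
IMG_FEAT, MMWAVE_FEAT = 256, 64   # per-modality feature sizes
N_POINTS = 32                     # mmWave point cloud size
N_JOINTS = 24                     # SMPL joint count
POSE_DIM, SHAPE_DIM = 72, 10      # SMPL pose (axis-angle) and shape params

def attention_correlate(points, joint_queries):
    """Softmax attention correlating each joint query with mmWave points.

    A stand-in for the paper's learned point-to-joint correlation module.
    """
    scores = joint_queries @ points.T                         # (N_JOINTS, N_POINTS)
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ points                                   # per-joint point features

# Toy inputs standing in for backbone outputs.
points = rng.normal(size=(N_POINTS, MMWAVE_FEAT))
joint_queries = rng.normal(size=(N_JOINTS, MMWAVE_FEAT))
img_feat = rng.normal(size=(IMG_FEAT,))

# Correlate mmWave points to joints, then fuse with image features.
joint_feat = attention_correlate(points, joint_queries).reshape(-1)
fused = np.concatenate([img_feat, joint_feat])

# Linear map standing in for the trained pose/shape regressor.
W = rng.normal(size=(POSE_DIM + SHAPE_DIM, fused.size)) * 0.01
params = W @ fused
pose, shape = params[:POSE_DIM], params[POSE_DIM:]
print(pose.shape, shape.shape)   # (72,) (10,)
```

In a real system the `pose` and `shape` vectors would be passed to an SMPL layer to produce the final mesh vertices; that step is omitted here since it requires the SMPL model files.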