Abstract
Although multi-view human pose and shape regression methods can draw on complementary information from other views to correct their predictions, existing approaches do not fully exploit the multi-view setup and fall short of efficiently aligning and merging features across views. To address these problems, we propose a multi-view framework in which features from all views are aligned and merged through multi-view voxel aggregation with inverse projection. Our framework has three major characteristics. First, a multi-view volumetric aggregation module improves prediction by exploiting information from feature maps at different scales. Second, instead of using all voxels, a mesh-aligned voxel selection module eliminates redundant background voxels for more effective prediction. Third, the framework adopts a dual-branch strategy, with one branch predicting the parametric human model and the other predicting 3D keypoints; their mutual influence is critical to improving both tasks. Additionally, since the scarcity of suitable datasets also hinders the development of multi-view methods, we propose an approach for creating occlusion datasets specifically for the multi-view occlusion case. Experimental results verify the effectiveness of the proposed framework on two benchmarks, Human3.6M and MPI-INF-3DHP.
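To illustrate the inverse-projection idea behind multi-view voxel aggregation, the sketch below unprojects per-view 2D feature maps into a shared voxel volume by projecting each voxel center into every calibrated camera, sampling the feature at the projected pixel, and averaging across the views that see it. All names, shapes, the nearest-neighbour sampling, and the mean-pooling choice are illustrative assumptions, not the paper's actual implementation.

```python
# Minimal sketch of multi-view voxel aggregation via inverse projection.
# Shapes, names, and the mean-pooling fusion are assumptions for clarity.
import numpy as np

def unproject_to_voxels(feature_maps, projections, grid_min, grid_max, resolution):
    """Aggregate per-view 2D feature maps into a shared voxel volume.

    feature_maps: (V, C, H, W) 2D features from V calibrated views.
    projections:  (V, 3, 4) camera matrices mapping world points to pixels.
    grid_min/max: (3,) corners of the axis-aligned volume around the subject.
    resolution:   number of voxels per axis (D).
    Returns:      (C, D, D, D) volume of view-averaged features.
    """
    V, C, H, W = feature_maps.shape
    D = resolution

    # Voxel centers in homogeneous world coordinates.
    axes = [np.linspace(grid_min[i], grid_max[i], D) for i in range(3)]
    xs, ys, zs = np.meshgrid(*axes, indexing="ij")
    centers = np.stack([xs, ys, zs, np.ones_like(xs)], axis=-1).reshape(-1, 4)  # (D^3, 4)

    volume = np.zeros((C, D * D * D), dtype=feature_maps.dtype)
    weight = np.zeros((D * D * D,), dtype=feature_maps.dtype)

    for v in range(V):
        # Project every voxel center into view v and sample nearest-neighbour features.
        pix = centers @ projections[v].T            # (D^3, 3)
        u = pix[:, 0] / pix[:, 2]
        w = pix[:, 1] / pix[:, 2]
        valid = (pix[:, 2] > 0) & (u >= 0) & (u < W) & (w >= 0) & (w < H)
        ui = np.clip(np.round(u).astype(int), 0, W - 1)
        wi = np.clip(np.round(w).astype(int), 0, H - 1)
        sampled = feature_maps[v][:, wi, ui]        # (C, D^3)
        volume[:, valid] += sampled[:, valid]
        weight[valid] += 1.0

    # Average over the views in which each voxel is visible.
    volume /= np.maximum(weight, 1.0)
    return volume.reshape(C, D, D, D)
```

A mesh-aligned voxel selection step, as described above, would then keep only voxels near the current body estimate before further processing, discarding the background voxels this dense grid still contains.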