Abstract

We introduce HUMAN4D, a large and multimodal 4D dataset that contains a variety of human activities simultaneously captured by a professional marker-based MoCap system, a volumetric capture setup and an audio recording system. By capturing 2 female and 2 male professional actors performing various full-body movements and expressions, HUMAN4D provides a diverse set of motions and poses encountered in single- and multi-person daily, physical and social activities (jumping, dancing, etc.), along with multi-RGBD (mRGBD), volumetric and audio data. Despite the existence of multi-view color datasets captured with hardware (HW) synchronization, to the best of our knowledge, HUMAN4D is the first and only public resource that provides volumetric depth maps with high synchronization precision, owing to the use of intra- and inter-sensor HW-SYNC. Moreover, a spatio-temporally aligned, scanned and rigged 3D character complements HUMAN4D to enable joint research on time-varying and high-quality dynamic meshes. We provide evaluation baselines by benchmarking HUMAN4D with state-of-the-art human pose estimation and 3D compression methods. For the former, we apply 2D and 3D pose estimation algorithms on both single- and multi-view data cues. For the latter, we benchmark open-source 3D codecs on volumetric data with respect to online volumetric video encoding and steady bit rates. Furthermore, qualitative and quantitative visual comparison between mesh-based volumetric data reconstructed at different qualities showcases the available options with respect to 4D representations. HUMAN4D is introduced to the computer vision and graphics research communities to enable joint research on spatio-temporally aligned pose, volumetric, mRGBD and audio data cues. The dataset and its code are available at https://tofis.github.io/myurls/human4d.
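To make the dataset's structure concrete, here is a minimal sketch of loading one HW-synchronized mRGBD frame set. The directory layout, sequence name, view count and depth encoding below are assumptions for illustration only; the actual file organization and loaders are documented at the project page linked above.

```python
import cv2  # pip install opencv-python

# Hypothetical layout -- HUMAN4D's real directory structure and file naming
# are documented at the project page; everything below is an assumption.
SEQUENCE_DIR = "human4d/S1_jumping"
NUM_VIEWS = 4      # number of RGBD views (assumed)
FRAME_ID = 120     # frame index within the sequence

def load_mrgbd_frame(seq_dir: str, frame_id: int, num_views: int) -> list:
    """Load one HW-synchronized multi-RGBD frame set (per-view color + depth)."""
    views = []
    for v in range(num_views):
        color = cv2.imread(f"{seq_dir}/view{v}/color/{frame_id:06d}.png")
        # Depth maps are commonly stored as 16-bit PNGs in millimeters
        # (an assumption here); IMREAD_UNCHANGED preserves the 16-bit values.
        depth = cv2.imread(f"{seq_dir}/view{v}/depth/{frame_id:06d}.png",
                           cv2.IMREAD_UNCHANGED)
        views.append({"color": color, "depth": depth})
    return views

frame_set = load_mrgbd_frame(SEQUENCE_DIR, FRAME_ID, NUM_VIEWS)
```

Because the sensors are HW-synchronized both intra- and inter-device, a single frame index can address temporally aligned color and depth across all views, which is what makes per-frame volumetric reconstruction feasible.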

Highlights

  • Inhabitance in a 4D world of moving 3D objects of various shapes and colors increases the need to capture and extensively study, analyze and exploit the 4D data around us, especially with the massive development of low-cost sensing devices [1]

  • Consistent with the outcomes on other public datasets, AlphaPose outperforms OpenPose, showing higher accuracy on both the single- and multi-person benchmarking sets of HUMAN4D (see the metric sketch after this list). Even though both methods showcase lower accuracy on the multi-person data of H4D2, which is much more challenging due to the occlusions between the subjects, it is worth noting that the difference between the single- and multi-person results of OpenPose is low (∼1.5%), while AlphaPose presents a higher drop of approximately 9%

  • In order to provide extra information to the reader, along with the results on HUMAN4D, we indicate the related outcomes of the methods on other datasets, i.e., MPII [42] and COCO [56]
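The accuracy percentages cited above belong to the percentage-of-correct-keypoints family of metrics used by MPII-style 2D pose benchmarks. As a hedged illustration of how such a score is computed (the joint count, threshold alpha and reference scale below are assumptions for the example, not the paper's exact protocol):

```python
import numpy as np

def pck(pred: np.ndarray, gt: np.ndarray, ref_scale: float,
        alpha: float = 0.5) -> float:
    """Percentage of Correct Keypoints: a joint counts as correct when its
    distance to the ground truth is below alpha * ref_scale.

    pred, gt:  (J, 2) arrays of 2D joint positions for one person.
    ref_scale: per-person reference length (head-segment size for PCKh).
    """
    dists = np.linalg.norm(pred - gt, axis=1)
    return float(np.mean(dists < alpha * ref_scale))

# Toy usage with simulated detections (J = 17 COCO-style keypoints).
rng = np.random.default_rng(0)
gt = rng.uniform(0, 512, size=(17, 2))          # ground-truth joints (pixels)
pred = gt + rng.normal(0, 4, size=(17, 2))      # detector output with noise
print(f"[email protected]: {pck(pred, gt, ref_scale=60.0):.3f}")
```

On MPII the usual reference scale is the annotated head-segment length (PCKh); per-person bounding-box size is a common alternative when head annotations are unavailable.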


Introduction

Inhabitance in a 4D world of moving 3D objects of various shapes and colors increases the need to capture and extensively study, analyze and exploit the 4D data around us, especially with the massive development of low-cost sensing devices [1]. Volumetric video of humans, captured with the aid of multiple cameras, and scanned 3D characters, animated with the use of motion capture (MoCap) technologies, comprise the core elements for human-centric 4D media production, a domain essential in several technological and industrial sectors. These technologies constitute key elements in immersive experiences that provide remote virtual presence and co-presence (e.g., XR conferencing [2], XR museums [3], etc.). The experiences are further enhanced by augmenting the virtual and immersive worlds with photorealistic representations that enable highly natural and realistic audiovisual communication between multiple users.

