PhysFormer++: Facial Video-Based Physiological Measurement with SlowFast Temporal Difference Transformer

Zitong Yu, Philip Torr, Yawen Cui, Guoying Zhao, Jiehua Zhang, Yuming Shen, Jingang Shi, Hengshuang Zhao

doi:10.1007/s11263-023-01758-1

Abstract

Remote photoplethysmography (rPPG), which aims at measuring heart activities and physiological signals from facial video without any contact, has great potential in many applications (e.g., remote healthcare and affective computing). Recent deep learning approaches focus on mining subtle rPPG clues using convolutional neural networks with limited spatio-temporal receptive fields, and thus neglect long-range spatio-temporal perception and interaction for rPPG modeling. In this paper, we propose two end-to-end video-transformer-based architectures, namely PhysFormer and PhysFormer++, to adaptively aggregate both local and global spatio-temporal features for rPPG representation enhancement. As key modules in PhysFormer, the temporal difference transformers first enhance the quasi-periodic rPPG features with temporal-difference-guided global attention, and then refine the local spatio-temporal representation against interference. To better exploit the temporal contextual and periodic rPPG clues, we also extend PhysFormer to the two-pathway, SlowFast-based PhysFormer++ with temporal difference periodic and cross-attention transformers. Furthermore, we propose label distribution learning and a curriculum-learning-inspired dynamic constraint in the frequency domain, which provide elaborate supervision for PhysFormer and PhysFormer++ and alleviate overfitting. Comprehensive experiments on four benchmark datasets demonstrate superior performance in both intra- and cross-dataset testing. Unlike most transformer networks, which require pretraining on large-scale datasets, the proposed PhysFormer family can be easily trained from scratch on rPPG datasets, which makes it promising as a novel transformer baseline for the rPPG community.
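
To make the idea of label distribution learning in the abstract more concrete, the following is a minimal sketch of one common way to build such a supervision signal: the single ground-truth heart rate is replaced by a soft Gaussian distribution over heart-rate bins, and a prediction is scored against it with a KL divergence. This is an illustration only, not the authors' implementation. The bin range (40 to 180 bpm), the 1-bpm bin width, the value of `sigma`, and the function names `gaussian_label_distribution` and `kl_divergence` are all assumptions that do not come from the abstract.

```python
import math

def gaussian_label_distribution(hr_bpm, bins, sigma=1.0):
    """Soft target over heart-rate bins, centred on the ground-truth HR.

    sigma is a hypothetical spread; the abstract does not specify one.
    """
    weights = [math.exp(-((b - hr_bpm) ** 2) / (2 * sigma ** 2)) for b in bins]
    total = sum(weights)
    return [w / total for w in weights]

def kl_divergence(target, predicted, eps=1e-12):
    """KL(target || predicted) for two discrete distributions.

    Bins with zero target mass contribute nothing and are skipped.
    """
    return sum(
        t * math.log((t + eps) / (p + eps))
        for t, p in zip(target, predicted)
        if t > 0
    )

# Hypothetical 1-bpm bins covering 40..180 bpm (an assumption for this sketch).
bins = list(range(40, 181))

# Soft label for a ground-truth heart rate of 72 bpm.
target = gaussian_label_distribution(72.0, bins, sigma=1.0)

# An uninformative (uniform) prediction, scored against the soft label.
uniform = [1.0 / len(bins)] * len(bins)
loss = kl_divergence(target, uniform)
```

Compared with a one-hot label, the soft target gives partial credit to predictions in neighbouring bins, which is one plausible reading of why such supervision could help against overfitting on small rPPG datasets.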

Journal: International Journal of Computer Vision
Publication Date: Feb 15, 2023
Citations: 36
License type: open-access

Similar Papers

PhysFormer: Facial Video-based Physiological Measurement with Temporal Difference Transformer
Zitong Yu ... Yuming Shen
01 Jun 2022

AutoHR: A Strong End-to-End Baseline for Remote Heart Rate Measurement With Neural Searching
Zitong Yu ... Guoying Zhao
IEEE Signal Processing Letters | VOL. 27
01 Jan 2020

Heart rate prediction from facial video with masks using eye location and corrected by convolutional neural networks
Kun Zheng ... Jinling Cui
Biomedical Signal Processing and Control | VOL. 75
09 Mar 2022

Remote Heart Rate Measurement From Highly Compressed Facial Videos: An End-to-End Deep Learning Solution With Video Enhancement
Zitong Yu ... Guoying Zhao
01 Oct 2019
