Abstract

Heart rate is an essential vital sign for evaluating human health. Remote heart rate monitoring with inexpensive, widely available devices has become increasingly important for catching health problems early, particularly given the hectic pace of modern life. In this paper, we propose a new method based on the transformer architecture with a multi-skip connection biLSTM decoder to estimate heart rate remotely from videos. Our method relies on the subtle variation in skin color caused by changes in blood volume beneath the skin surface. The presented heart rate estimation framework consists of three main steps: (1) segmentation of the facial region of interest (ROI) based on the landmarks obtained by 3DDFA; (2) extraction of the spatial and global features; and (3) estimation of the heart rate value from the obtained features using the proposed method. This paper investigates which feature extractor better captures the change in skin color related to the heart rate, as well as the number of frames needed to achieve the best accuracy. Experiments were conducted using two publicly available datasets (LGI-PPGI and Vision for Vitals) and our own in-the-wild dataset (12 videos collected from four drivers). The experiments showed that our approach achieved better results than previously published methods, making it the new state of the art on these datasets.
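To make the underlying principle concrete, the following is a minimal sketch of how a periodic skin color signal can be turned into a heart rate value. It is not the proposed transformer and biLSTM model: it swaps in a classical frequency-domain baseline that picks the dominant spectral peak of a per-frame mean ROI intensity trace. The function names `estimate_bpm` and `synthetic_trace`, the 0.7 to 4.0 Hz search band, and the synthetic input are all assumptions made for illustration; in practice the trace would come from the segmented facial ROI.

```python
import numpy as np


def estimate_bpm(roi_means, fps, lo_hz=0.7, hi_hz=4.0):
    """Estimate heart rate (beats per minute) from a per-frame ROI intensity trace.

    Classical spectral-peak baseline, NOT the transformer/biLSTM method of the paper.
    roi_means: 1-D sequence, one mean skin intensity value per video frame.
    fps: video frame rate in frames per second.
    lo_hz, hi_hz: assumed plausible heart rate band (42 to 240 bpm).
    """
    x = np.asarray(roi_means, dtype=float)
    x = x - x.mean()                  # remove the constant skin tone offset
    x = x * np.hanning(len(x))        # taper the ends to reduce spectral leakage
    power = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fps)

    # Only search frequencies that could correspond to a heartbeat.
    band = (freqs >= lo_hz) & (freqs <= hi_hz)
    if not band.any():
        raise ValueError("trace is too short to resolve the heart rate band")

    peak_hz = freqs[band][np.argmax(power[band])]
    return 60.0 * peak_hz


def synthetic_trace(bpm, fps, seconds, seed=0):
    """Hypothetical test input: a faint sinusoidal pulse plus small noise."""
    rng = np.random.default_rng(seed)
    t = np.arange(int(fps * seconds)) / fps
    pulse = 0.01 * np.sin(2.0 * np.pi * (bpm / 60.0) * t)
    noise = 0.0005 * rng.standard_normal(t.size)
    return 0.5 + pulse + noise


if __name__ == "__main__":
    trace = synthetic_trace(bpm=72, fps=30, seconds=10)
    print(f"estimated heart rate: {estimate_bpm(trace, fps=30):.1f} bpm")
```

With a 10 second window at 30 frames per second, the frequency resolution of this baseline is 0.1 Hz, or 6 bpm, which illustrates why the number of input frames matters for accuracy. A learned model such as the one proposed here is aimed at settings, such as in-the-wild driving videos, where a simple spectral peak is unreliable.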
