Deep face segmentation for improved heart and respiratory rate estimation from videos

Marc-André Fiedler, Michał Rapczyński, Philipp Werner, Ayoub Al-Hamadi

doi:10.1007/s12652-023-04607-8

Open Access

https://doi.org/10.1007/s12652-023-04607-8

Abstract

The selection of a suitable region of interest (ROI) is of great importance in camera-based vital signs estimation, as it represents the first step in the processing pipeline. Since all further processing relies on the quality of the signal extracted from the ROI, the tracking of this area is decisive for the performance of the overall algorithm. To overcome the limitations of classical ROI selection approaches, which struggle with partial occlusions or illumination variations, a custom neural network for pixel-precise face segmentation called FaSeNet was developed. It achieves better segmentation results than state-of-the-art architectures on two datasets while maintaining high execution efficiency. Furthermore, the Matthews Correlation Coefficient was proposed as a loss function, which fits the network weights better than the losses commonly applied in multi-class segmentation. In an extensive evaluation with a variety of algorithms for vital signs estimation, FaSeNet achieved better results in both heart and respiratory rate estimation. Thus, an ROI for vital signs estimation could be created that is superior to other approaches.
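
The abstract does not state how the Matthews Correlation Coefficient (MCC) is turned into a loss, so the following is only a minimal sketch of one common way to do it, not the authors' formulation. It assumes the binary case (face vs. background) rather than the multi-class setting of the paper, replaces the hard confusion-matrix counts with soft counts computed from predicted probabilities, and uses 1 - MCC as the quantity to minimize. The function name `soft_mcc_loss` and the `eps` stabilizer are hypothetical choices made for illustration.

```python
import math


def soft_mcc_loss(probs, targets, eps=1e-8):
    """Return 1 - MCC using soft confusion counts (binary case, illustrative).

    probs:   predicted foreground probabilities in [0, 1], one per pixel
    targets: ground-truth labels (0 or 1), one per pixel
    eps:     small constant that avoids division by zero (assumed value)
    """
    pairs = list(zip(probs, targets))

    # Soft counts: each pixel contributes fractionally to the confusion matrix.
    tp = sum(p * t for p, t in pairs)
    fp = sum(p * (1 - t) for p, t in pairs)
    fn = sum((1 - p) * t for p, t in pairs)
    tn = sum((1 - p) * (1 - t) for p, t in pairs)

    # MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))

    # MCC lies in [-1, 1], so the loss lies in [0, 2]; lower is better.
    return 1.0 - numerator / (denominator + eps)


labels = [1, 0, 1, 0]
perfect = soft_mcc_loss([1.0, 0.0, 1.0, 0.0], labels)
inverted = soft_mcc_loss([0.0, 1.0, 0.0, 1.0], labels)
uninformative = soft_mcc_loss([0.5, 0.5, 0.5, 0.5], labels)
```

Under these assumptions, a perfect prediction gives a loss close to 0, a fully inverted prediction a loss close to 2, and a constant prediction of 0.5 a loss of 1. Because MCC takes all four confusion-matrix entries into account, a loss of this kind stays informative when the classes are imbalanced, which is a plausible motivation for its use in segmentation.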

Full Text