Remote photoplethysmography (rPPG) is a non-contact, camera-based method and one of the most promising technologies for remote cardiac health assessment. Conventional rPPG methods have played a significant role in the growth of low-cost, camera-based cardiac health monitoring systems. However, they rely on certain assumptions, and their performance may degrade under real-world dynamic interferences. Recently, deep neural network (DNN) based methods have been introduced into the rPPG domain and have proven to perform well in noisy environments. In this work, we propose a novel 2D convolutional neural network (CNN) based time-frequency learning (TFL) framework that estimates the rPPG signal from a scalogram-based feature map of facial video. The TFL network recovers the periodic rPPG signal from a feature map derived from the raw RGB traces. Feature map generation enhances the pulsatile profile of the raw RGB traces by means of wavelet transformation, and this enhanced pulsatile profile helps the proposed network characterize the wide range of frequencies associated with cardiac activity. The rPPG signal estimated by the network is then used for heart rate (HR) and heart rate variability (HRV) measurement. The proposed network is tested and validated on three publicly available datasets, UBFC_rPPG, COHFACE, and ECG-fitness, and outperforms state-of-the-art methods. For HR estimation on the UBFC_rPPG dataset, the proposed TFL framework achieves a mean absolute error (MAE) of 0.569 bpm and a Pearson correlation coefficient (ρ) of 0.997. On the COHFACE and ECG-fitness datasets, the (MAE, ρ) pairs are (0.987 bpm, 0.958) and (4.267 bpm, 0.937), respectively. These results demonstrate the effectiveness of the designed feature map, together with the TFL network, in estimating the rPPG waveform for non-contact vital-sign measurement.
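The abstract's feature map rests on a wavelet (scalogram) transform of the raw colour traces. The paper's exact wavelet and parameters are not given here, so the following is only a minimal sketch of the general idea: a Morlet-style continuous wavelet transform of a single colour-channel trace, computed with plain NumPy. The function name `morlet_scalogram`, the centre-frequency parameter `w`, and the chosen frequency band are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def morlet_scalogram(trace, fs, freqs, w=6.0):
    """Morlet-wavelet scalogram of a 1-D colour trace (illustrative sketch).

    trace : raw channel trace, e.g. mean green value per video frame
    fs    : sampling rate in Hz (camera frame rate)
    freqs : analysis frequencies in Hz (cardiac band, e.g. 0.7-4 Hz)
    w     : Morlet centre-frequency parameter (assumed value)
    Returns an array of shape (len(freqs), len(trace)): the feature map rows.
    """
    n = len(trace)
    x = trace - np.mean(trace)            # remove the DC (non-pulsatile) level
    tau = (np.arange(n) - n // 2) / fs    # time axis centred on zero, seconds
    out = np.empty((len(freqs), n))
    for i, f in enumerate(freqs):
        s = w * fs / (2 * np.pi * f)      # wavelet scale in samples
        # complex Morlet: carrier at f Hz under a Gaussian envelope of width s
        wavelet = np.exp(1j * 2 * np.pi * f * tau) \
                  * np.exp(-(tau * fs) ** 2 / (2 * s ** 2)) / np.sqrt(s)
        out[i] = np.abs(np.convolve(x, wavelet, mode="same"))
    return out

# Toy usage: a synthetic 72-bpm (1.2 Hz) pulse sampled at 30 fps.
fs = 30.0
t = np.arange(0, 10, 1 / fs)
rng = np.random.default_rng(0)
trace = 0.5 * np.sin(2 * np.pi * 1.2 * t) + 0.05 * rng.standard_normal(t.size)
freqs = np.linspace(0.7, 4.0, 34)
scalogram = morlet_scalogram(trace, fs, freqs)
peak_f = freqs[np.argmax(scalogram.mean(axis=1))]  # strongest band ~1.2 Hz
```

Restricting `freqs` to the cardiac band is what gives the feature map its "enhanced pulsatile profile": energy outside the band is simply not represented, and within the band the dominant row tracks the heart rate over time.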