Peripheral capillary oxygen saturation (SpO2) has received increasing attention during the COVID-19 pandemic. Clinical investigations have shown that patients with COVID-19 exhibit markedly reduced SpO2 before their condition deteriorates. To let individuals monitor their SpO2 cost-effectively, this paper proposes a novel neural network model named “ITSCAN”, based on the Temporal Shift Module. Taking advantage of the widespread use of smartphones, the model estimates an individual’s SpO2 in real time from standard facial video, at a temporal granularity of seconds. The model interweaves two branches: a motion branch, which extracts spatiotemporal features, and an appearance branch, which uses coordinate attention to capture the correlation between feature channels and the positional information of the feature map. The SpO2 estimator then generates the corresponding SpO2 value. This paper also summarizes, for the first time, five loss functions commonly used in SpO2 estimation models, and contributes a novel loss function obtained by examining various combinations and carefully selecting hyperparameters. Comprehensive ablation experiments analyze the independent contribution of each module to overall model performance. Finally, experimental results on the public VIPL-HR dataset show that our model achieves an MAE of 1.10% and an RMSE of 1.19%, outperforming related work, which indicates that the proposed method is more accurate and can contribute to public health.
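
The temporal shift operation that the Temporal Shift Module is built on can be illustrated with a minimal sketch. This is not the ITSCAN implementation: the function name `temporal_shift`, the `fold_div` parameter and its default, and the representation of each frame as a flat list of channel values are assumptions made purely for illustration. The idea shown is that one group of channels takes its values from the next frame, a second group from the previous frame, and the rest stay in place, so that later layers see information from neighbouring frames at no extra parameter cost.

```python
def temporal_shift(frames, fold_div=4):
    """Shift a fraction of channels along the time axis, zero-padding the ends.

    frames:   list of T per-frame feature vectors, each a list of C numbers
              (a simplified, hypothetical stand-in for real feature maps).
    fold_div: 1/fold_div of the channels are shifted in each direction
              (assumed default, chosen only for this sketch).
    """
    num_frames = len(frames)
    num_channels = len(frames[0])
    fold = num_channels // fold_div

    # Start from zeros so that positions with no source frame stay zero-padded.
    shifted = [[0.0] * num_channels for _ in range(num_frames)]

    for t in range(num_frames):
        for c in range(num_channels):
            if c < fold:
                src = t + 1   # first group: value comes from the next frame
            elif c < 2 * fold:
                src = t - 1   # second group: value comes from the previous frame
            else:
                src = t       # remaining channels are left unshifted
            if 0 <= src < num_frames:
                shifted[t][c] = frames[src][c]
    return shifted


# Toy input: 3 frames with 4 channels each.
toy_frames = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
toy_shifted = temporal_shift(toy_frames, fold_div=4)
```

With four channels and `fold_div=4`, channel 0 is pulled from the following frame, channel 1 from the preceding frame, and channels 2 and 3 are untouched. A real model would apply the same idea to batched tensors rather than nested lists.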