Sleep stage classification is important for human health monitoring and disease diagnosis. In clinical practice, classifying sleep into stages by visual inspection is time-consuming and relies heavily on the expertise of sleep specialists. Many automated models for sleep stage classification have been proposed in previous studies, but their performance still falls short of what real clinical application requires. In this work, we propose a novel multi-view fusion network, MVF-SleepNet, based on multi-modal physiological signals: electroencephalography (EEG), electrocardiography (ECG), electrooculography (EOG), and electromyography (EMG). To capture the relationships among these multi-modal physiological signals, we construct two views: time-frequency images (TF images) and graph-learned graphs (GL graphs). To learn the spectral-temporal representation from sequences of TF images, we combine VGG-16 with a GRU network. To learn the spatial-temporal representation from sequences of GL graphs, we combine Chebyshev graph convolution with temporal convolution networks. Fusing the spectral-temporal and spatial-temporal representations can further improve sleep stage classification performance. Extensive experiments on the publicly available ISRUC-S1 and ISRUC-S3 datasets show that MVF-SleepNet achieves an overall accuracy of 0.821, an F1 score of 0.802, and a Kappa of 0.768 on ISRUC-S1, and an accuracy of 0.841, an F1 score of 0.828, and a Kappa of 0.795 on ISRUC-S3. These results are competitive with state-of-the-art baselines on both datasets. The source code of MVF-SleepNet is available on GitHub (https://github.com/YJPai65/MVF-SleepNet).
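
For reference, the three reported metrics (accuracy, F1 score, and Kappa) can all be derived from a confusion matrix, as in the minimal sketch below. It is not taken from the MVF-SleepNet repository: the function name `summarize` and the toy matrix are hypothetical, and macro-averaging of F1 is an assumption, since the abstract does not state how F1 is averaged. Kappa is computed here as Cohen's kappa.

```python
def summarize(confusion):
    """Return (accuracy, macro_f1, kappa) for a square confusion matrix.

    confusion[i][j] is the number of epochs whose true stage is i and
    whose predicted stage is j.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    # Per-class counts of true labels (rows) and predicted labels (columns).
    row_sums = [sum(row) for row in confusion]
    col_sums = [sum(confusion[i][j] for i in range(n_classes))
                for j in range(n_classes)]
    correct = sum(confusion[k][k] for k in range(n_classes))

    accuracy = correct / total

    # Per-class F1 = 2*TP / (2*TP + FP + FN) = 2*TP / (row_sum + col_sum).
    # Assumption: the per-class scores are macro-averaged (unweighted mean).
    f1_scores = []
    for k in range(n_classes):
        denom = row_sums[k] + col_sums[k]
        f1_scores.append(2 * confusion[k][k] / denom if denom else 0.0)
    macro_f1 = sum(f1_scores) / n_classes

    # Cohen's kappa: observed agreement corrected for chance agreement.
    expected = sum(r * c for r, c in zip(row_sums, col_sums)) / (total * total)
    kappa = (accuracy - expected) / (1 - expected) if expected < 1 else 1.0

    return accuracy, macro_f1, kappa


# Toy two-class matrix, made up for illustration (not data from the paper).
toy = [[40, 10],
       [5, 45]]
accuracy, macro_f1, kappa = summarize(toy)
```

Kappa is reported alongside accuracy because it discounts the agreement a classifier would reach by chance given the class frequencies, which matters when some sleep stages are much rarer than others.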