Understanding sleep stages is crucial for diagnosing sleep disorders, developing treatments, and studying sleep's impact on overall health. With the growing availability of affordable brain monitoring devices, the volume of collected brain data has increased significantly. However, analyzing these data, particularly when using the gold-standard multi-lead electroencephalogram (EEG), remains resource-intensive and time-consuming. Automated brain monitoring has therefore emerged as a cost-effective and efficient approach to EEG data analysis. A critical component of sleep analysis is detecting transitions between wakefulness and sleep states. These transitions offer insight into sleep quality and quantity, which is essential for diagnosing sleep disorders, designing effective interventions, and studying sleep's effects on cognitive function, mood, and physical performance. This study presents a novel EEG feature extraction pipeline for the accurate classification of wake and sleep stages. We propose a noise-robust, model-based Kalman filtering (KF) approach to track changes in a time-varying autoregressive (TVAR) model fitted to EEG data during different wake and sleep stages. Our approach extracts features, including instantaneous frequency and instantaneous power, from the EEG and feeds them to a two-step classifier for sleep staging. The first step classifies data into wake, REM, and non-REM categories, while the second step further classifies non-REM data into N1, N2, and N3 stages. The method was evaluated on the extended Sleep-EDF dataset (Sleep-EDFx), with 153 EEG recordings from 78 subjects, using classifiers including Logistic Regression, Support Vector Machines, Extreme Gradient Boosting (XGBoost), and Light Gradient Boosting Machine (LGBM). The best performance was achieved with the LGBM and XGBoost classifiers, yielding an overall accuracy of over 77%, a macro-averaged F1 score of 0.69, and a Cohen's kappa of 0.68, highlighting the efficacy of the proposed method with a remarkably compact and interpretable feature set.
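As a rough illustration of the KF-based TVAR tracking described above, the following minimal sketch follows the coefficients of a second-order autoregressive model with a Kalman filter and derives a frequency and a power value from them. It is not the authors' implementation: the model order (2), the random-walk state model, the noise settings `q` and `r`, the pole-angle definition of frequency, the AR-spectrum definition of power, and all function names (`simulate_ar2`, `track_tvar2`, `pole_frequency`, `ar2_power`) are assumptions made for this example.

```python
import cmath
import math
import random


def simulate_ar2(n, f0, fs, radius=0.95, seed=0):
    """Hypothetical test signal: AR(2) process with a pole pair at f0 Hz."""
    rng = random.Random(seed)
    a1 = 2.0 * radius * math.cos(2.0 * math.pi * f0 / fs)
    a2 = -radius ** 2
    y = [0.0, 0.0]
    for _ in range(n - 2):
        y.append(a1 * y[-1] + a2 * y[-2] + rng.gauss(0.0, 1.0))
    return y


def track_tvar2(signal, q=1e-4, r=1.0):
    """Kalman filter for time-varying AR(2) coefficients (assumed model).

    State:        a_t = a_{t-1} + w_t,                w_t ~ N(0, q*I)
    Observation:  y_t = a1*y_{t-1} + a2*y_{t-2} + v_t, v_t ~ N(0, r)
    Returns one (a1, a2) estimate per sample from index 2 onward.
    """
    a1, a2 = 0.0, 0.0
    p11, p12, p22 = 1.0, 0.0, 1.0  # symmetric 2x2 state covariance
    history = []
    for t in range(2, len(signal)):
        h1, h2 = signal[t - 1], signal[t - 2]  # regressors
        # Predict: the random-walk model only inflates the covariance.
        p11 += q
        p22 += q
        # Innovation and its variance.
        err = signal[t] - (a1 * h1 + a2 * h2)
        ph1 = p11 * h1 + p12 * h2
        ph2 = p12 * h1 + p22 * h2
        s = h1 * ph1 + h2 * ph2 + r
        # Kalman gain and state update.
        k1, k2 = ph1 / s, ph2 / s
        a1 += k1 * err
        a2 += k2 * err
        # Covariance update: P <- P - K (P h)^T.
        p11 -= k1 * ph1
        p12 -= k1 * ph2
        p22 -= k2 * ph2
        history.append((a1, a2))
    return history


def pole_frequency(a1, a2, fs):
    """Frequency in Hz of the AR(2) complex pole pair; None if poles are real."""
    if a1 * a1 + 4.0 * a2 >= 0.0:
        return None
    cos_theta = a1 / (2.0 * math.sqrt(-a2))
    theta = math.acos(max(-1.0, min(1.0, cos_theta)))
    return theta * fs / (2.0 * math.pi)


def ar2_power(a1, a2, freq, fs, noise_var=1.0):
    """AR(2) model spectrum evaluated at freq (Hz), one possible power feature."""
    z = cmath.exp(-2j * math.pi * freq / fs)
    return noise_var / abs(1.0 - a1 * z - a2 * z * z) ** 2
```

Under these assumptions, calling `track_tvar2` on an EEG epoch and then `pole_frequency` and `ar2_power` on each coefficient pair would give per-sample features that could be summarized per epoch and passed to a staged classifier (wake/REM/non-REM first, then N1/N2/N3). A smaller `q` yields smoother but slower-adapting estimates.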