Abstract
Introduction
Over the last 40 years, a variety of algorithms have been proposed to classify sleep and wake from wrist acceleration data. These algorithms take either activity counts or raw acceleration as input features, and they range from a single heuristic rule to logistic regression, machine learning, and deep learning. The purpose of this work is to evaluate and compare the accuracy of these algorithms against polysomnography (PSG) annotations on a common dataset from a sleep laboratory.

Methods
The Newcastle PSG dataset was used to compare the sleep-wake prediction algorithms against concurrent PSG annotations in 30-second epochs. The dataset contains 28 patients, 27 of whom had wrist acceleration data for both the right and left wrist and one of whom had data for the left wrist only. Twenty of the participants had at least one sleep disorder. Sleep disorders included idiopathic hypersomnia, restless legs syndrome, sleep apnea, narcolepsy, sleep paralysis, nocturia, obstructive sleep apnea, REM sleep behavior disorder (RBD), parasomnia, and insomnia.

Results
The domain adversarial convolutional neural network (DACNN) model showed the best overall results (sens=83.9, spec=57.6, f1=81.7, WASO-RMSE=80.9). The remaining models are ranked by WASO-RMSE. The next best-performing model was Cole-Kripke (sens=81.4, spec=50.3, f1=78.0, WASO-RMSE=90.4). This was followed by the van Hees heuristic model (sens=83.6, spec=47.5, f1=79.1, WASO-RMSE=91.1). The fourth best-performing model was the Random Forest model (spec=77.5, sens=55.5, f1=76.4, WASO-RMSE=93.0). The fifth was Sadeh (sens=82.6, spec=49.7, f1=78.5, WASO-RMSE=93.6). The sixth was the LSTM model (sens=78.6, spec=58.9, f1=77.8, WASO-RMSE=93.9). The worst-performing model was the CNN model (sens=79.8, spec=54.2, f1=77.8, WASO-RMSE=99.8).
* sens=sensitivity, spec=specificity, WASO=wake after sleep onset, RMSE=root mean squared error

Conclusion
The DACNN algorithm outperformed all other algorithms on sleep-wake classification from wrist acceleration data. Despite differing widely in input feature types and algorithmic complexity, all other algorithms performed similarly on the dataset, with f1 scores ranging from 76.4 to 79.1 and WASO-RMSE ranging from 90.4 to 99.8.
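The four reported metrics can be illustrated with a minimal sketch. It is not the authors' evaluation code, and it rests on several assumptions not stated in the abstract: sleep is treated as the positive class (label 1) and wake as the negative class (label 0), WASO is taken as the wake time between the first and last sleep epoch of a night, WASO-RMSE is computed across nights in minutes, and f1 is the plain (unweighted) F1 of the sleep class. All function names are hypothetical, and the metrics are returned as fractions rather than percentages.

```python
import math

EPOCH_MINUTES = 0.5  # 30-second epochs, as stated in the Methods


def epoch_metrics(true, pred):
    """Sensitivity, specificity and F1 with sleep (1) as the positive class (assumption)."""
    tp = sum(1 for t, p in zip(true, pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(true, pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(true, pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(true, pred) if t == 1 and p == 0)
    nan = float("nan")
    return {
        "sens": tp / (tp + fn) if tp + fn else nan,
        "spec": tn / (tn + fp) if tn + fp else nan,
        "f1": 2 * tp / (2 * tp + fp + fn) if 2 * tp + fp + fn else nan,
    }


def waso_minutes(labels):
    """Minutes of wake between the first and last sleep epoch (assumed WASO definition)."""
    labels = list(labels)
    if 1 not in labels:
        return 0.0  # no sleep onset, so no wake after sleep onset
    onset = labels.index(1)
    last_sleep = len(labels) - 1 - labels[::-1].index(1)
    wake_epochs = sum(1 for x in labels[onset:last_sleep + 1] if x == 0)
    return wake_epochs * EPOCH_MINUTES


def waso_rmse(true_nights, pred_nights):
    """Root mean squared error of predicted vs. PSG WASO, one value per night."""
    errors = [waso_minutes(p) - waso_minutes(t)
              for t, p in zip(true_nights, pred_nights)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))


# Toy night of eight epochs: PSG annotation vs. a hypothetical prediction.
psg = [0, 1, 1, 0, 1, 1, 0, 0]
predicted = [0, 1, 1, 1, 1, 0, 0, 0]
metrics = epoch_metrics(psg, predicted)
rmse = waso_rmse([psg], [predicted])
```

In this toy example the prediction misses the single wake epoch inside the sleep period, so its WASO is 0 minutes against a PSG WASO of 0.5 minutes. A model can therefore score reasonably on epoch-level metrics while still accumulating WASO error, which is one reason the two kinds of metric are reported side by side.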