Abstract

Background and aims: Machine Learning is transforming data processing in medical research and clinical practice. Missing data labels are a common limitation to training Machine Learning models. To overcome missing labels in a large dataset of microneurography recordings, a novel autoencoder-based, semi-supervised, iterative group-labelling methodology was developed.

Methods: Autoencoders were systematically optimised to extract features from a dataset of 478621 signal excerpts from human microneurography recordings. Selected features were clustered with k-means, and randomly selected representations of the corresponding original signals were labelled by an expert rater as valid or non-valid muscle sympathetic nerve activity (MSNA) bursts in an iterative, purifying procedure. A deep neural network was trained on the fully labelled dataset.

Results: Three autoencoders, two based on fully connected neural networks and one based on a convolutional neural network, were chosen for feature learning. Iterative clustering followed by labelling of complete clusters resulted in all 478621 signal peak excerpts being labelled as valid or non-valid within 13 iterations. In a cross-validation step using a testing dataset not included in training, neural networks trained on the labelled dataset achieved on average 93.13% accuracy and 91% area under the receiver operating characteristic curve (AUC ROC).

Discussion: The described labelling procedure enabled efficient labelling of a large dataset of physiological signals based on expert ratings. The autoencoder-based procedure may be broadly applicable to unlabelled datasets that require expert input, and may be utilised for Machine Learning applications where weak labels are available.
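The iterative group-labelling loop described in the Methods can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that features have already been extracted (toy one-dimensional values stand in for autoencoder features), that a cluster counts as "pure" when a small random sample of its members receives a unanimous rating, and that mixed clusters are re-clustered in the next iteration. The function names, the `rate` callback standing in for the expert rater, the deterministic k-means initialisation, and all parameter defaults are hypothetical.

```python
import random


def kmeans(points, k, n_iter=20):
    """Plain k-means on feature tuples; returns one cluster index per point."""
    ordered = sorted(points)
    # Assumption: deterministic start from evenly spaced sorted points (needs k >= 2).
    centres = [ordered[(len(ordered) - 1) * j // (k - 1)] for j in range(k)]
    assign = [0] * len(points)
    for _ in range(n_iter):
        # Assign every point to its nearest centre (squared Euclidean distance).
        assign = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centres[c])))
            for p in points
        ]
        # Move each centre to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centres[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return assign


def iterative_group_labelling(features, rate, k=2, n_samples=5, max_iter=20, seed=0):
    """Label whole clusters from a few ratings; re-cluster whatever stays mixed.

    `rate(i)` stands in for the expert rater and returns the label of item i.
    Items still unlabelled after `max_iter` rounds keep the label None.
    """
    rng = random.Random(seed)
    labels = [None] * len(features)
    unlabelled = list(range(len(features)))
    for _ in range(max_iter):
        if not unlabelled:
            break
        if len(unlabelled) <= k:
            # Too few items left to cluster: rate them one by one.
            for i in unlabelled:
                labels[i] = rate(i)
            break
        assign = kmeans([features[i] for i in unlabelled], k)
        still_mixed = []
        for c in range(k):
            members = [i for i, a in zip(unlabelled, assign) if a == c]
            if not members:
                continue
            sample = rng.sample(members, min(n_samples, len(members)))
            votes = {rate(i) for i in sample}
            if len(votes) == 1:
                # Unanimous sample: propagate the label to the whole cluster.
                label = votes.pop()
                for i in members:
                    labels[i] = label
            else:
                still_mixed.extend(members)
        unlabelled = still_mixed
    return labels


# Toy data: two well-separated groups standing in for non-valid (0) and valid (1) bursts.
truth = [0] * 10 + [1] * 10
features = [(i / 10,) for i in range(10)] + [(10 + i / 10,) for i in range(10)]
labels = iterative_group_labelling(features, rate=lambda i: truth[i])
print(labels == truth)
```

In this sketch the rater inspects at most `n_samples` items per cluster, so the number of expert ratings grows with the number of clusters and iterations rather than with the dataset size. A unanimous sample does not guarantee a pure cluster, which is why the sample size and the purity criterion would need to be chosen with care in practice.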
