Wearable Sensor-Based Human Activity Recognition with Transformer Model

Iveta Dirgová Luptáková, Jiří Pospíchal, Martin Kubovčík

doi:10.3390/s22051911

Abstract

Computing devices that can recognize various human activities or movements can be used to assist people in healthcare, sports, or human–robot interaction. Readily available data for this purpose can be obtained from the accelerometer and the gyroscope built into everyday smartphones. Effective classification of real-time activity data is, therefore, actively pursued using various machine learning methods. In this study, the transformer model, a deep learning neural network model developed primarily for natural language processing and vision tasks, was adapted for time-series analysis of motion signals. The self-attention mechanism inherent in the transformer, which expresses individual dependencies between signal values within a time series, can match the performance of state-of-the-art convolutional neural networks with long short-term memory. The proposed adapted transformer method was tested on the largest available public dataset of smartphone motion sensor data covering a wide range of activities, and achieved an average identification accuracy of 99.2%, compared with 89.67% achieved on the same data by a conventional machine learning method. The results suggest the expected future relevance of the transformer model for human activity recognition.
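
To make the self-attention idea from the abstract concrete, the following is a minimal sketch of single-head scaled dot-product self-attention applied to one window of motion-sensor samples. It is not the authors' model: the window length (50 samples), the six channels (three accelerometer and three gyroscope axes), the projection size (16), the random projection matrices, and the function names are all assumptions chosen only for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def self_attention(window, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention.

    window: array of shape (time_steps, channels), one sensor window.
    w_q, w_k, w_v: projection matrices of shape (channels, dim).
    Returns the attended sequence (time_steps, dim) and the
    attention weights (time_steps, time_steps).
    """
    q = window @ w_q  # queries, one per time step
    k = window @ w_k  # keys, one per time step
    v = window @ w_v  # values, one per time step
    # Pairwise similarity between every two time steps, scaled by sqrt(dim).
    scores = q @ k.T / np.sqrt(k.shape[-1])
    # Each row becomes a distribution over all time steps in the window.
    weights = softmax(scores, axis=-1)
    return weights @ v, weights


# Hypothetical sizes: 50 samples, 6 channels (3 accelerometer + 3 gyroscope).
rng = np.random.default_rng(0)
time_steps, channels, dim = 50, 6, 16
window = rng.standard_normal((time_steps, channels))  # stand-in for real data
w_q, w_k, w_v = (rng.standard_normal((channels, dim)) * 0.1 for _ in range(3))

output, weights = self_attention(window, w_q, w_k, w_v)
```

Each row of `weights` indicates how strongly one time step attends to every other time step in the window, which is the sense in which self-attention expresses dependencies between signal values. In a trained model the projection matrices would be learned rather than drawn at random.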

Highlights

Human activity recognition is an important and popular research area in time series classification
Several types of deep neural networks are typically used for time series classification of sensor signals, such as convolutional neural networks [14–16], fully convolutional neural networks [17], multiscale convolutional neural networks [18], time-LeNet [19], stacked denoising autoencoder [20], deep belief neural networks [21], Long Short-Term
The remainder of this paper is organized as follows: In Section 2, we present the details of the general transformer model, the vision transformer model, the used KU-human activity recognition (HAR)

Summary

Introduction

Human activity recognition is an important and popular research area in time series classification. It aims to identify human behavior from sensor data collected by personal devices such as smartphones, tablets, or smartwatches, which can gather data from a wide sample of users; the signals are then classified using machine learning methods [1]. Detecting human activities with mobile devices has great potential in medicine, where it can be used to monitor patients with various diagnoses [2–5]. In addition to health monitoring and rehabilitation, this technology can be used in gaming [10], human–robot interaction and robotics [11,12], and sports [13]. Considerable effort has been devoted to human activity recognition with deep neural networks. Several types of deep neural networks are typically used for time series classification of sensor signals, such as convolutional neural networks [14–16], fully convolutional neural networks [17], multiscale convolutional neural networks [18], time-LeNet [19], stacked denoising autoencoder [20], deep belief neural networks [21], Long Short-Term

Methods

Results

Discussion

Conclusion