Abstract
The sliding window technique is widely used to segment inertial sensor signals, i.e., accelerometer and gyroscope streams, for activity recognition. In this technique, the sensor signals are partitioned into fixed-size time windows of two types: (1) non-overlapping windows, in which consecutive time windows do not intersect, and (2) overlapping windows, in which they do. A widely held assumption in Human Activity Recognition is that overlapping sliding windows improve the performance of recognition systems. In this paper, we analyze the impact of overlapping sliding windows on the performance of Human Activity Recognition systems under different evaluation techniques, namely, subject-dependent cross validation and subject-independent cross validation. Our results show that the performance improvements attributed to overlapping windowing in the literature appear to stem from the underlying limitations of subject-dependent cross validation. Furthermore, we do not observe any performance gain from the use of this technique in conjunction with subject-independent cross validation. We conclude that under subject-independent cross validation, non-overlapping sliding windows reach the same performance as overlapping sliding windows. This result has significant implications for the resource usage of training human activity recognition systems.
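The two windowing schemes discussed above differ only in the step between consecutive windows. A minimal sketch (function and parameter names are our own, for illustration):

```python
def sliding_windows(signal, width, step):
    """Segment a 1-D sample sequence into fixed-size windows.

    step == width -> non-overlapping windows (no shared samples)
    step <  width -> overlapping windows (width - step shared samples)
    """
    return [signal[i:i + width]
            for i in range(0, len(signal) - width + 1, step)]

samples = list(range(10))  # toy stand-in for an inertial sensor stream
non_overlapping = sliding_windows(samples, width=4, step=4)
overlapping = sliding_windows(samples, width=4, step=2)  # 50% overlap
```

Note that with 50% overlap the same stream yields twice as many windows, which is one source of the extra training cost mentioned above.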
Highlights
Wearable sensors and mobile devices are transforming society at a rapid pace, creating a wide range of opportunities for knowledge extraction from new data sources
We report distributions of average F1 score values obtained across all validation folds
We conclude that the suggested use of overlapping sliding windows in Human Activity Recognition (HAR) systems is associated with underlying limitations of subject-dependent CV
Summary
Wearable sensors and mobile devices are transforming society at a rapid pace, creating a wide range of opportunities for knowledge extraction from new data sources. Most HAR systems use an Activity Recognition Process (ARP) to detect activities. These systems usually consist of one or more inertial sensors attached to different parts of a person’s body that provide diverse streams of sensor data. These data streams are then segmented into time windows of a specific length, from which feature vectors are extracted and fed to a classifier. The ARP is composed of a sequence of signal processing, pattern recognition, and machine learning techniques [15]. It consists of five main steps, shown in Figure 1 and explained hereafter. Sensors discretize signals at a given frequency, typically 50 Hz for daily activities or 200 Hz for fast sports, and transmit the resulting data points to the receiver.
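The distinction between the two evaluation techniques compared in the paper comes down to how windows are assigned to folds. A minimal leave-one-subject-out sketch of subject-independent cross validation (names and the toy data are our own; libraries such as scikit-learn provide equivalent group-aware splitters):

```python
def subject_independent_folds(records):
    """Leave-one-subject-out splitting: every fold tests on windows
    from one subject that contributed nothing to the training set.

    records: iterable of (subject_id, window) pairs.
    """
    subjects = sorted({subject for subject, _ in records})
    for held_out in subjects:
        train = [w for subject, w in records if subject != held_out]
        test = [w for subject, w in records if subject == held_out]
        yield held_out, train, test

# toy (subject, window) pairs
records = [("s1", "w1"), ("s1", "w2"), ("s2", "w3"), ("s3", "w4")]
folds = list(subject_independent_folds(records))
```

Subject-dependent cross validation, by contrast, shuffles windows across folds regardless of subject, so with overlapping windows near-duplicate samples can land in both training and test sets, which is the leakage the paper associates with the reported performance gains.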