Abstract
Activity recognition represents the task of classifying data derived from different sensor types into one of predefined activity classes. The most popular and beneficial sensors in the area of action recognition are inertial sensors such as accelerometer and gyroscope. Convolutional neural network (CNN) as one of the best deep learning methods has recently attracted much attention to the problem of activity recognition, where 1D kernels capture local dependency over time in a series of observations measured at inertial sensors (3-axis accelerometers and gyroscopes) while in 2D kernels apart from time dependency, dependency between signals from different axes of same sensor and also over different sensors will be considered. Most convolutional neural networks used for recognition task are built using convolution and pooling layers followed by a few number of fully connected layers but large and deep neural networks have high computational costs. In this paper, we propose a new architecture that consists solely of convolutional layers and find that with removing the pooling layers and instead adding strides to convolution layers, the computational time will decrease notably while the model performance will not change or in some cases will even improve. Also both 1D and 2D convolutional neural networks with and without pooling layer will be investigated and their performance will be compared with each other and also with some other hand-crafted feature based methods. The third point that will be discussed in this paper is the impact of applying fast fourier transform (FFT) to inputs before training learning algorithm. It will be shown that this preprocessing will enhance the model performance. Experiments on benchmark datasets demonstrate the high performance of proposed 2D CNN model with no pooling layers.
Published Version
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have