Abstract

In recent years, many researchers have applied convolutional neural networks (CNNs) to human activity recognition, as evidenced by the emergence of architectures such as LeNet-5, AlexNet, and VGG16 and of more modern architectures such as ResNet, Inception V3, Inception-ResNet, MobileNet V2, NASNet, and PNASNet. The main characteristic of a CNN is its ability to extract features automatically from input images, which facilitates activity recognition and classification. Convolutional networks derive more relevant and complex features with each additional layer. In addition, CNNs have achieved perfect classification on highly similar activities that were previously extremely difficult to classify. In this paper, we evaluate modern CNNs in terms of their human activity recognition accuracy and compare the results with state-of-the-art methods. We used two public data sets: HMDB (shooting gun, kicking, falling to the floor, punching) and Weizmann (walking, running, jumping, bending, one-hand waving, two-hand waving, jumping in place, jumping jack, skipping). Our experimental results indicate that the CNN with the NASNet architecture achieves the best performance of the six CNN architectures on both human activity data sets (HMDB and Weizmann).
