Abstract

Recurrent Neural Networks (RNNs) and their variants have been demonstrated tremendous successes in modeling sequential data such as audio processing, video processing, time series analysis, and text mining. Inspired by these facts, we propose human activity recognition technique to proceed visual data via utilizing convolution neural network (CNN) and Bidirectional-gated recurrent unit (Bi-GRU). Firstly, we extract deep features from frames sequence of human activities videos using CNN and then select most important features from the deep appearances to improve performance and decrease computational complexity of the model. Secondly, to learn temporal motions of frames sequence, we design Bi-GRU and feed those deep-important features extracted from frames sequence of human activities to Bi-GRU which learn temporal dynamics in forward and backward direction at each time step. We conduct extensive experiments on realistic videos of human activity recognition datasets YouTube11, HMDB51 and UCF101. Lastly, we compare the obtained results with existing methods to show the competence of our proposed technique.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call