Abstract

The two-stream convolutional network (ConvNet) has played a vital role in the development of deep learning networks for activity recognition, and many recent studies use it as a powerful feature extractor. Combinations of a two-stream ConvNet with a fully connected long short-term memory (FC-LSTM) network, and of a two-stream ConvNet with a temporal segment LSTM, have achieved the best performance for activity recognition. In this paper, we explore the performance limits of networks that combine a two-stream ConvNet with a recurrent neural network. We highlight the necessity of maintaining spatial structure throughout the network when the sequential data show correlations in space, stress the importance of an appropriate fusion method when integrating feature maps, and demonstrate experimentally that both measures work well. Our work makes three main contributions. First, we propose combining convolutional LSTM (ConvLSTM) networks with a two-stream ConvNet based on RGB and optical-flow streams. The spatiotemporal features are extracted by a two-stream ConvNet pre-trained on the ImageNet dataset, and the fused sequential three-dimensional feature maps are then classified by the ConvLSTM. Second, we explore the effect of fusing the feature maps of the two-stream network at different layers with different fusion strategies, and conclude that an appropriate fusion location and fusion method raise our model to state-of-the-art performance. Third, we demonstrate that better overall performance can be achieved when the ConvLSTM is configured with proper care. Our analysis shows that, among methods that combine ConvNets with a recurrent neural network and are not pre-trained on the Kinetics dataset, the proposed network structure achieves state-of-the-art accuracy of 69.4% on HMDB51 and 93.9% on UCF101.
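The pipeline described above (fuse the per-frame feature maps of the two streams, then pass the fused sequence through a ConvLSTM that keeps the spatial layout) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, tensor shapes, kernel size, the three fusion options shown, and the random weights are all assumptions made for illustration, and the cell omits the peephole terms that some ConvLSTM formulations include.

```python
import numpy as np


def fuse(rgb, flow, method="concat"):
    """Fuse two feature maps of shape (C, H, W). Fusion options are illustrative."""
    if rgb.shape != flow.shape:
        raise ValueError("feature maps must share a shape")
    if method == "sum":
        return rgb + flow
    if method == "max":
        return np.maximum(rgb, flow)
    if method == "concat":
        # Stack along the channel axis: the result has 2 * C channels.
        return np.concatenate([rgb, flow], axis=0)
    raise ValueError(f"unknown fusion method: {method}")


def conv2d_same(x, w):
    """Zero-padded 'same' cross-correlation.

    x: (C_in, H, W); w: (C_out, C_in, k, k) with odd k. Returns (C_out, H, W).
    """
    c_out, _, k, _ = w.shape
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    _, h, wd = x.shape
    out = np.zeros((c_out, h, wd))
    for i in range(k):
        for j in range(k):
            patch = xp[:, i:i + h, j:j + wd]  # shifted view, (C_in, H, W)
            out += np.einsum("oc,chw->ohw", w[:, :, i, j], patch)
    return out


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def convlstm_step(x, h, c, w_x, w_h, b):
    """One ConvLSTM step without peepholes.

    Gates come from convolutions rather than matrix products, so the hidden
    state h and cell state c keep their (hidden, H, W) spatial layout.
    """
    gates = conv2d_same(x, w_x) + conv2d_same(h, w_h) + b[:, None, None]
    i, f, o, g = np.split(gates, 4, axis=0)
    c_next = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
    h_next = sigmoid(o) * np.tanh(c_next)
    return h_next, c_next


# Hypothetical sizes: T frames, C channels per stream, H x W feature maps.
rng = np.random.default_rng(0)
T, C, H, W, hidden = 4, 8, 7, 7, 16
rgb_seq = rng.standard_normal((T, C, H, W))   # stand-in for RGB-stream features
flow_seq = rng.standard_normal((T, C, H, W))  # stand-in for flow-stream features

w_x = 0.1 * rng.standard_normal((4 * hidden, 2 * C, 3, 3))
w_h = 0.1 * rng.standard_normal((4 * hidden, hidden, 3, 3))
b = np.zeros(4 * hidden)

h_state = np.zeros((hidden, H, W))
c_state = np.zeros((hidden, H, W))
for t in range(T):
    fused = fuse(rgb_seq[t], flow_seq[t], "concat")
    h_state, c_state = convlstm_step(fused, h_state, c_state, w_x, w_h, b)
```

After the loop, `h_state` is still a `(hidden, H, W)` map rather than a flat vector, which is the property the abstract refers to as maintaining spatial structure. In a full model a classifier head would consume this final state. Switching `"concat"` to `"sum"` or `"max"` halves the input channels, so `w_x` would need shape `(4 * hidden, C, 3, 3)` in that case.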
