Abstract

The purpose of this study is the learning and classification of video activities using video color and motion information. Video activity labeling is important for many applications, such as video content modeling, indexing, and quick access to content. In this study, video activity recognition is performed by deep learning. To learn the visual features of a video, Convolutional Neural Network (CNN) layers and layers of a special type of recurrent network, Long Short-Term Memory (LSTM), are stacked. Video sequence learning is performed by end-to-end training. Recent work on deep learning employs color and motion information together to improve learning and classification accuracy. In this study, unlike existing models, video motion content is learned using SIFT flow vectors, and motion and color features are fused for activity recognition. Performance tests on a commonly used benchmark data set, UCF-101, which includes activity-labeled videos from 101 action categories such as Biking and Playing Guitar, demonstrate that SIFT flow vectors allow motion information to be modeled more accurately than optical flow vectors and increase video activity classification performance.
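The fusion of motion and color features mentioned above can be sketched minimally. The weighted late-fusion rule below (averaging per-class scores from the two streams) is an assumption for illustration; the abstract does not specify the exact fusion mechanism, and the function and parameter names are hypothetical:

```python
import numpy as np

def fuse_streams(color_scores, motion_scores, alpha=0.5):
    """Late fusion of per-class scores from a color (RGB) stream and a
    motion (SIFT flow) stream. alpha weights the color stream; this
    weighted average is an illustrative choice, not the paper's rule."""
    color_scores = np.asarray(color_scores, dtype=float)
    motion_scores = np.asarray(motion_scores, dtype=float)
    fused = alpha * color_scores + (1.0 - alpha) * motion_scores
    # Predicted activity is the class with the highest fused score.
    return fused, int(np.argmax(fused))

# Toy example with 3 activity classes (e.g. Biking, Playing Guitar, Typing)
color = [0.6, 0.3, 0.1]   # color stream favors class 0
motion = [0.2, 0.7, 0.1]  # motion stream favors class 1
fused, label = fuse_streams(color, motion, alpha=0.5)
```

With equal weights the motion evidence tips the decision to class 1, illustrating how a stream that models motion well (here, the SIFT-flow stream) can correct the color stream's prediction.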
