Abstract
One of the driving forces behind behavior recognition in video is the analysis of surveillance footage, in which humans are monitored and their actions are classified as normal or as deviations from the norm. Local spatio-temporal features have gained attention as effective descriptors for action recognition in video, but the use of texture as a local descriptor remains relatively unexplored. This paper presents work on human action recognition in video that proposes a fusion of appearance, motion, and texture as the local descriptor for the bag-of-features model. Rigorous experiments were conducted on the recorded UTP dataset using the proposed descriptor. The fused descriptor achieved an average accuracy of 85.92%, compared with 75.06% for the combination of shape and motion descriptors. The results show that the proposed descriptor outperforms the combination of appearance and motion as the local descriptor of an interest point.
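As a rough illustration of the idea, the sketch below fuses three per-interest-point descriptors and builds a bag-of-features histogram for one video. The abstract does not say how the cues are fused or which descriptors are used, so everything here is an assumption: fusion is taken to be simple concatenation after L2 normalization, the descriptor dimensions (72, 90, 59) and the random data are placeholders, and the codebook is sampled from the descriptors rather than learned by clustering. The function names are hypothetical.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row to unit L2 norm so no single cue dominates the fused vector."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)

def fuse_descriptors(appearance, motion, texture):
    """Assumed early fusion: normalize each cue per interest point, then concatenate."""
    return np.hstack([l2_normalize(appearance),
                      l2_normalize(motion),
                      l2_normalize(texture)])

def bof_histogram(descriptors, codebook):
    """Assign each descriptor to its nearest codeword; return a normalized histogram."""
    # Squared Euclidean distance between every descriptor and every codeword.
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    words = d2.argmin(axis=1)
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / hist.sum()

# Placeholder data: 50 interest points from one video, with made-up dimensions.
rng = np.random.default_rng(0)
n_points = 50
appearance = rng.random((n_points, 72))
motion = rng.random((n_points, 90))
texture = rng.random((n_points, 59))

fused = fuse_descriptors(appearance, motion, texture)

# Placeholder codebook: 8 fused descriptors picked at random.
# In practice a codebook is usually learned by clustering training descriptors.
codebook = fused[rng.choice(n_points, size=8, replace=False)]

video_hist = bof_histogram(fused, codebook)
```

The resulting `video_hist` is the fixed-length video representation that a classifier would consume; dropping `texture` from `fuse_descriptors` would give the two-cue baseline that the abstract compares against.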