Abstract

In this paper, a novel, generalized framework for activity representation and recognition based on a ‘string of feature graphs (SFG)’ model is introduced. The proposed framework represents a visual activity as a string of feature graphs, where the string elements are first matched using a graph-based spectral technique, followed by a dynamic programming scheme for matching the complete strings. The framework is motivated by the success of time-sequence analysis approaches in speech recognition, but is modified to capture the spatio-temporal properties of individual actions, the interactions between objects, and the speed of activity execution. The framework can be adapted to various spatio-temporal motion features, and we detail its use with STIP features and track features. Furthermore, we show how the SFG model can be embedded within a switched dynamical system (SDS) that automatically chooses the most efficient features for a particular video segment, allowing us to analyze a variety of activities in natural videos in a computationally efficient manner. Experimental results for the basic SFG model, as well as its integration with the SDS, are reported on some of the most challenging multi-object datasets available to the activity analysis community.
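To make the two-stage matching described above concrete, the following is a minimal sketch of the general idea: each frame-level element is a small feature graph, pairs of graphs are scored with a standard principal-eigenvector (spectral) relaxation of graph matching, and the two strings of graphs are then aligned with a DTW-style dynamic program. The graph representation (node descriptors plus 2-D positions), the Gaussian affinities, and the gap penalty are illustrative assumptions, not the paper's implementation.

```python
# Illustrative sketch only: spectral matching of feature graphs + DP string alignment.
# The graph format and all parameters below are assumptions, not the authors' code.
import numpy as np

def spectral_match_score(G1, G2):
    """Similarity of two feature graphs, each given as (descriptors, positions)."""
    d1, p1 = G1
    d2, p2 = G2
    n1, n2 = len(d1), len(d2)
    # Affinity matrix over candidate node correspondences (i, a):
    # diagonal = unary descriptor similarity, off-diagonal = pairwise geometric consistency.
    M = np.zeros((n1 * n2, n1 * n2))
    for i in range(n1):
        for a in range(n2):
            ia = i * n2 + a
            M[ia, ia] = np.exp(-np.linalg.norm(d1[i] - d2[a]))
            for j in range(n1):
                for b in range(n2):
                    if i == j or a == b:
                        continue
                    rel = abs(np.linalg.norm(p1[i] - p1[j]) -
                              np.linalg.norm(p2[a] - p2[b]))
                    M[ia, j * n2 + b] = np.exp(-rel)
    # Principal eigenvector of M gives soft correspondence confidences;
    # its Rayleigh quotient serves as the graph-to-graph match score.
    _, vecs = np.linalg.eigh(M)
    v = np.abs(vecs[:, -1])
    return float(v @ M @ v) / (float(v @ v) + 1e-9)

def align_strings(S1, S2, gap=0.1):
    """DTW-style dynamic program aligning two strings of feature graphs."""
    n, m = len(S1), len(S2)
    D = np.full((n + 1, m + 1), -np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = spectral_match_score(S1[i - 1], S2[j - 1])
            # Match the two elements, or skip one side at a penalty.
            D[i, j] = s + max(D[i - 1, j - 1], D[i - 1, j] - gap, D[i, j - 1] - gap)
    return D[n, m]
```

In this reading, recognition amounts to comparing a query string against labeled model strings and assigning the label of the highest-scoring alignment; the choice of STIP versus track features only changes how the node descriptors and positions of each feature graph are built.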
