Abstract

In this paper, we employ ideas grounded in physics to examine activities in video. We build the Multi-Resolution Phase Space (MRPS) descriptor, a set of feature descriptors that represents complex activities in multiple domains directly from tracks, without needing different heuristics for each domain. We use MRPS for single- and multi-object activity modelling in phase space, which consists of all possible values of the coordinates. The MRPS comprises three descriptors: the Sethi Metric (S-Metric), the Hamiltonian Energy Signature (HES), and the Multiple Objects, Pairwise Analysis (MOPA). The S-Metric is a distance metric that characterizes the global motion of an object, or of the entire scene, with a single scalar value. The HES is a scalar or multi-dimensional time series that represents the motion of an object over the course of an activity using either the Hamiltonian or the S-Metric. The MOPA contains phase space features for paired activities, for which we develop physical models of complex interactions in phase space; specifically, we model paired motion as a damped oscillator. Finally, we show that the S-Metric is a proper distance measure over a metric space and prove its additivity, which supports its use both as a standalone distance measure and within the HES. We validate the theory experimentally on the standard VIVID and UCR Videoweb datasets, covering a variety of problem settings: single-agent actions, multi-agent actions, and aerial sequences, including video search.
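
To make the idea of an energy time series computed directly from a track more concrete, the following is a minimal sketch and not the paper's definition of the HES. It assumes the textbook Hamiltonian H = T + U, a constant (unit) mass, velocities estimated by finite differences, and an optional user-supplied potential; the function name `energy_signature` and all of its parameters are hypothetical and chosen only for illustration.

```python
import numpy as np


def energy_signature(track, dt=1.0, mass=1.0, potential=None):
    """Per-frame energy H = T + U for a track of shape (n_frames, n_dims).

    Illustrative assumptions (not taken from the paper): constant mass,
    velocity from finite differences, and an optional potential U(q).
    """
    track = np.asarray(track, dtype=float)
    # Finite-difference estimate of velocity along the time axis.
    velocity = np.gradient(track, dt, axis=0)
    momentum = mass * velocity
    # Kinetic term T = |p|^2 / (2m), evaluated at every frame.
    kinetic = np.sum(momentum ** 2, axis=1) / (2.0 * mass)
    if potential is None:
        return kinetic
    # Potential term U(q), evaluated at every tracked position.
    return kinetic + np.array([potential(q) for q in track])


# Usage: an object moving at constant velocity along the x-axis.
track = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [3.0, 0.0]]
signature = energy_signature(track)
```

Under these assumptions the result is one scalar per frame, so two activities can be compared as ordinary time series. How the paper actually defines mass, potential, and the multi-resolution aspect is not stated in the abstract and is left open here.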
