Abstract

People, cars and other moving objects in videos generate time series data that can be labeled in many ways. For example, classifiers can label motion tracks according to the object type, the action being performed, or the trajectory of the motion. These labels can be generated for every frame as long as the object stays in view, so object tracks can be modeled as Markov processes with multiple noisy observation streams. A challenge in video recognition is to recover the true state of the track (i.e., its class, action and trajectory) using Markov models without (a) counterfactually assuming that the streams are independent or (b) creating a fully coupled Hidden Markov Model (FCHMM) with an infeasibly large state space. This paper introduces a new method for labeling sequences of hidden states. The method exploits external consistency constraints among streams without modeling complex joint distributions between them. For example, common-sense semantics suggest that trees cannot walk. This is an example of an external constraint between an object label (“tree”) and an action label (“walk”). The key to exploiting external constraints is a new variation of the Viterbi algorithm, which we call the Viterbi–Segre (VS) algorithm. VS restricts the solution spaces of factorized HMMs to marginal distributions that are compatible with joint distributions satisfying sets of external constraints. Experiments on synthetic data show that VS estimates the true states from the given observations more accurately than the traditional Viterbi algorithm applied to (a) factorized HMMs, (b) FCHMMs, or (c) partially coupled HMMs that model pairwise dependencies. We then show that VS outperforms factorized and pairwise HMMs on real video data sets for which FCHMMs cannot feasibly be trained.
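
The problem setting described above can be illustrated with a small sketch. This is not the VS algorithm, whose details the abstract does not give. It only shows, under assumed toy numbers, how decoding two factorized streams independently with standard Viterbi can yield an inconsistent pair such as ("tree", "walk"), and how a brute-force decode over the product state space with a compatibility mask avoids it. The product-space decode is exactly the kind of approach that becomes infeasible as streams and labels grow. All names (`viterbi`, `decode_constrained`, `allowed`) and all probabilities are hypothetical.

```python
import numpy as np

def viterbi(log_prior, log_trans, log_obs):
    """Standard Viterbi. log_obs has shape (T, S); returns the best state path."""
    T, S = log_obs.shape
    score = log_prior + log_obs[0]
    back = np.zeros((T, S), dtype=int)
    for t in range(1, T):
        cand = score[:, None] + log_trans      # cand[i, j]: best path ending i -> j
        back[t] = cand.argmax(axis=0)
        score = cand.max(axis=0) + log_obs[t]
    path = [int(score.argmax())]
    for t in range(T - 1, 0, -1):              # follow back-pointers
        path.append(int(back[t, path[-1]]))
    return path[::-1]

def decode_constrained(obj, act, allowed):
    """Brute-force joint decode over (object, action) pairs.

    obj and act are (log_prior, log_trans, log_obs) triples for each stream.
    allowed[o, a] is False for pairs ruled out by an external constraint.
    Illustrative only: the joint state space has O * A states.
    """
    (po, to, oo), (pa, ta, oa) = obj, act
    O, A = len(po), len(pa)
    prior = (po[:, None] + pa[None, :]).ravel()
    trans = (to[:, None, :, None] + ta[None, :, None, :]).reshape(O * A, O * A)
    obs = (oo[:, :, None] + oa[:, None, :]).reshape(-1, O * A)
    obs = obs + np.where(allowed, 0.0, -np.inf).ravel()  # forbid bad pairs
    return [divmod(s, A) for s in viterbi(prior, trans, obs)]

# Toy data (assumed): 3 frames, sticky transitions, uniform priors.
objects, actions = ["person", "tree"], ["walk", "stand"]
sticky = np.log(np.array([[0.9, 0.1], [0.1, 0.9]]))
uniform = np.log(np.array([0.5, 0.5]))
obj_obs = np.log(np.tile([0.4, 0.6], (3, 1)))   # object classifier leans "tree"
act_obs = np.log(np.tile([0.8, 0.2], (3, 1)))   # action classifier leans "walk"
allowed = np.array([[True, True],               # person: walk, stand
                    [False, True]])             # tree: cannot walk

independent = list(zip(
    [objects[s] for s in viterbi(uniform, sticky, obj_obs)],
    [actions[s] for s in viterbi(uniform, sticky, act_obs)],
))
constrained = [
    (objects[o], actions[a])
    for o, a in decode_constrained(
        (uniform, sticky, obj_obs), (uniform, sticky, act_obs), allowed
    )
]
print("independent:", independent)
print("constrained:", constrained)
```

In this toy setup the independent decode labels every frame ("tree", "walk"), while the constrained decode labels every frame ("person", "walk"), because the strong action evidence outweighs the weak object evidence once the forbidden pair is excluded.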
