Abstract

In computer vision, tracking humans across camera views remains challenging, especially in complex scenarios with frequent occlusions, significant lighting changes, and other difficulties. Under such conditions, most existing appearance and geometric cues are not reliable enough to distinguish humans across camera views. To address these challenges, this paper presents a stochastic attribute grammar model that leverages complementary and discriminative human attributes to enhance cross-view tracking. The key idea of our method is to introduce a hierarchical representation, a parse graph, to describe a subject and its movement trajectory in both the space and time domains. This yields a hierarchical compositional representation comprising trajectory entities at varying levels, including human boxes, 3D human boxes, tracklets, and trajectories. We use a set of grammar rules to decompose a graph node (e.g., a tracklet) into a set of child nodes (e.g., 3D human boxes), and augment each node with a set of attributes, including geometry (e.g., moving speed and direction), accessories (e.g., bags), and/or activities (e.g., walking and running). These attributes serve as valuable cues, in addition to appearance features (e.g., colors), for determining the associations of human detection boxes across cameras. In particular, the attributes of a parent node are inherited by its child nodes, resulting in consistency constraints over feasible parse graphs. We thus cast cross-view human tracking as finding the most discriminative parse graph for each subject in the videos. We develop a learning method to train this attribute grammar model from weakly supervised training data. To infer the optimal parse graph and its attributes, we develop an alternating parsing method that employs both top-down and bottom-up computations to search for the optimal solution. We also explicitly reason about the occlusion status of each entity in order to handle significant changes in camera viewpoint. We evaluate the proposed method on public video benchmarks and demonstrate through extensive experiments that it clearly outperforms state-of-the-art tracking methods.
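To make the hierarchical representation concrete, the sketch below illustrates one possible way to model a parse-graph node whose attributes are inherited by its children, yielding the consistency constraints described above. It is a minimal illustration only; the class and attribute names (Node, LEVELS, is_consistent, etc.) are hypothetical and not taken from the paper's implementation.

```python
# Minimal, illustrative sketch of a parse-graph node with attribute
# inheritance. All names here are assumptions for exposition, not the
# authors' actual data structures.
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Levels of the hierarchy mentioned in the abstract:
# trajectory -> tracklet -> 3D human box -> 2D human box.
LEVELS = ["trajectory", "tracklet", "box3d", "box2d"]

@dataclass
class Node:
    level: str                                   # e.g., "tracklet"
    attributes: Dict[str, object] = field(default_factory=dict)  # e.g., {"activity": "walking", "bag": True}
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = None

    def add_child(self, child: "Node") -> None:
        """Attach a child node; it inherits any parent attribute it does not override."""
        child.parent = self
        for key, value in self.attributes.items():
            child.attributes.setdefault(key, value)
        self.children.append(child)

def is_consistent(node: Node) -> bool:
    """Check the parent-child consistency constraint: shared attributes must agree."""
    for child in node.children:
        for key, value in node.attributes.items():
            if key in child.attributes and child.attributes[key] != value:
                return False
        if not is_consistent(child):
            return False
    return True

# Toy usage: attributes of a trajectory propagate to its tracklets.
trajectory = Node("trajectory", {"activity": "walking", "bag": True})
tracklet = Node("tracklet", {"speed": 1.4})
trajectory.add_child(tracklet)
assert is_consistent(trajectory)
```

In this reading, inference amounts to searching over candidate parse graphs (associations of boxes, 3D boxes, and tracklets) and preferring those that satisfy such consistency checks while scoring well on appearance and attribute cues.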
