Abstract

In this paper, we focus on improving the appearance representation for multi-human tracking. Many previous methods extract low-level appearance features, such as color histograms and texture, for each frame, sometimes combined with spatial information. These methods ignore the temporal distribution of features: per-frame features may be unstable due to illumination changes, human pose variation, and image noise. To address this, we propose a novel appearance representation, the spatial-temporal appearance model, based on the statistical distribution of a Gaussian mixture model (GMM). It represents the appearance of a tracklet as a whole, with dynamic spatial and temporal information: the spatial information consists of dynamic subregions, and the temporal information is the dynamic duration of each subregion. Each subregion is modeled as a weighted Gaussian component of the GMM, whose parameters are estimated with the online expectation-maximization (online EM) algorithm. We then propose a tracklet association method that uses Bayesian prediction to estimate target locations and the Jensen-Shannon divergence to measure the distance between the spatial-temporal appearance distributions of two tracklets. Finally, we evaluate our approach on four challenging datasets (TRECVID, CAVIAR, ETH, and EPFL Terrace) and achieve good results.
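The abstract's tracklet distance is the Jensen-Shannon divergence between two appearance distributions. The JS divergence between two GMMs has no closed form, so the sketch below illustrates the measure on discretized (histogram-like) distributions only; the distributions `p` and `q` are hypothetical stand-ins for two tracklets' appearance models, not data from the paper.

```python
import math

def kl_divergence(p, q):
    # Kullback-Leibler divergence for discrete distributions (0*log 0 := 0)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    # Jensen-Shannon divergence: symmetric and bounded by log 2,
    # computed against the mixture m = (p + q) / 2
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

# Hypothetical discretized appearance distributions of two tracklets
p = [0.7, 0.2, 0.1]
q = [0.1, 0.2, 0.7]
print(js_divergence(p, q))  # small value => similar appearance
```

A small divergence indicates similar appearance distributions, so tracklet pairs with low JS divergence are stronger association candidates.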

