Abstract

Person group detection refers to grouping people with similar spatio-temporal trajectories. In this work, we address the problem of automatically detecting groups of people across disjoint camera views, a task with essential applications in public safety that has not yet been studied in depth. The main challenge is that the sparse distribution of cameras over a large surveillance area makes it difficult to infer and match people’s trajectories across cameras. To address this challenge, we propose a Cyclic Conditional Random Field (CCRF) based model for cross-camera trajectory extraction, which takes both visual appearance and heterogeneous spatio-temporal information (including camera locations, video capture times, and the map of the surveillance area) as input and infers multiple candidate cross-camera trajectories for each person. Then, for each pair of people, we propose a dynamic trajectory warping (DTW) method to measure the similarity of their trajectories. DTW uses visual features to optimize the selection of candidate trajectories and addresses the problem of matching trajectories of unequal length. Since no existing dataset directly supports this research, we enrich our previously built Person Trajectory Dataset with person group annotations and verify the effectiveness of the proposed method on it. The dataset and code are released at https://github.com/zhangxin1995/PTD_GROUP.
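The paper's dynamic trajectory warping goes beyond classic dynamic time warping by also using visual features to select among the candidate trajectories produced by the CCRF stage. As a minimal point of reference for the alignment idea, the sketch below implements the standard dynamic-programming warping that compares trajectories of unequal length; the function name `dtw_similarity`, the Euclidean local cost, and the toy trajectories are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def dtw_similarity(traj_a, traj_b):
    """Dynamic-time-warping distance between two trajectories.

    traj_a, traj_b: arrays of shape (len, d) holding one feature vector
    per observation (e.g. an appearance embedding or camera coordinates).
    Returns the accumulated alignment cost; lower means more similar.
    """
    n, m = len(traj_a), len(traj_b)
    # cost[i, j] = best cumulative cost of aligning traj_a[:i] with traj_b[:j]
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(traj_a[i - 1] - traj_b[j - 1])
            # Each step may match, stretch, or skip an observation, which is
            # what lets trajectories of unequal length be compared at all.
            cost[i, j] = d + min(cost[i - 1, j - 1],  # match
                                 cost[i - 1, j],      # skip in traj_a
                                 cost[i, j - 1])      # skip in traj_b
    return cost[n, m]

# Two people following a similar route, observed a different number of
# times, should still receive a small warping distance.
a = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0]])
b = np.array([[0.1, 0.0], [2.1, 2.0], [3.0, 2.9]])
print(dtw_similarity(a, b))
```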
