Abstract

We propose a distributed, real-time computing platform for tracking multiple interacting persons in motion. To combat the negative effects of occlusion and articulated motion we use a multiview implementation, where each view is first independently processed on a dedicated processor. This monocular processing uses a predictor-corrector filter to weigh reprojections of three-dimensional (3-D) position estimates, obtained by the central processor, against observations of measurable image motion. The corrected state vectors from each view provide input observations to a Bayesian belief network, in the central processor, with a dynamic, multidimensional topology that varies as a function of scene content and feature confidence. The Bayesian net fuses independent observations from multiple cameras by iteratively resolving independence relationships and confidence levels within the graph, thereby producing the most likely vector of 3-D state estimates given the available data. To maintain temporal continuity, we follow the network with a layer of Kalman filtering that updates the 3-D state estimates. We demonstrate the efficacy of the proposed system using a multiview sequence of several people in motion. Our experiments suggest that, when compared with data fusion based on averaging, the proposed technique yields a noticeable improvement in tracking accuracy.
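The predict-correct cycle described above can be sketched as a standard Kalman filter over a 3-D state. This is a minimal illustrative sketch, not the paper's implementation: the constant-velocity motion model and all matrix values (`F`, `H`, `Q`, `R`) are assumptions chosen for clarity.

```python
import numpy as np

# Hypothetical sketch of one Kalman predict-correct cycle over a 3-D
# position estimate. All model matrices are illustrative assumptions,
# not parameters from the paper.

def kalman_step(x, P, z, F, H, Q, R):
    """One predict-correct cycle: x is the state, P its covariance,
    z a fused 3-D position observation (e.g. from the belief network)."""
    # Predict: propagate the state through the motion model.
    x_pred = F @ x
    P_pred = F @ P @ F.T + Q
    # Correct: weigh the prediction against the new observation.
    S = H @ P_pred @ H.T + R             # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)  # Kalman gain
    x_new = x_pred + K @ (z - H @ x_pred)
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new

# 6-D state: 3-D position plus velocity; only position is observed.
dt = 1.0
F = np.eye(6); F[:3, 3:] = dt * np.eye(3)   # constant-velocity model
H = np.hstack([np.eye(3), np.zeros((3, 3))])  # observe position only
Q = 0.01 * np.eye(6)   # process noise (assumed)
R = 0.1 * np.eye(3)    # observation noise (assumed)

x = np.zeros(6)        # initial state estimate
P = np.eye(6)          # initial state covariance
z = np.array([1.0, 2.0, 0.5])  # one fused 3-D position observation
x, P = kalman_step(x, P, z, F, H, Q, R)
```

After the correction step, the position estimate moves from its prediction toward the observation, with the balance set by the relative sizes of the prediction and observation covariances.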
