Techniques for binocular markerless visual tracking of 3D articulated bodies

Andrew W B Smith

doi:10.14264/uql.2016.369

Abstract

This thesis advances methods for markerless visual tracking of articulated bodies using one or two cameras. The research aims to improve upon existing Bayesian-inspired tracking methods by examining the ‘building blocks’ of these tracking algorithms, in particular the measurement function design, the state space selection, and local optimization methods. The results show that improvements can be made in all of these areas, and that these improvements are applicable to a variety of Bayesian tracking algorithms. The thesis begins by examining literature relevant to the visual tracking problem. This includes the measurement functions used by other authors, focussing on the edge detection methods used in both tracking and segmentation problems. A general overview of the global search problem is given next, as a global search is a fundamental part of a Bayesian tracking algorithm. The combination of Newton-like local optimization methods and the measurement functions used in visual tracking problems is then examined, and it is shown that Newton optimizers are not ideally suited to these measurement functions. The Bayesian tracking framework is then detailed, along with a review of several existing Bayesian tracking algorithms. Finally, some non-Bayesian tracking algorithms are discussed. Following the literature review, details of the models used in the experiments presented in this thesis are given. These include the articulated human body model, the camera model, image gradient metrics, self-occlusion treatment, and a generic colour-based region measurement method. The use of graph-based approaches for edge measurements is then investigated. Graph-based methods are commonly used in image segmentation problems, but have not been applied to visual tracking problems. A novel method for performing edge measurements using the ‘shortest path’ around the object’s occluding contour is presented.
Unlike in the segmentation problem, self-occlusion models mean the weights or costs of some graph vertices cannot be determined. Different treatments for occluded graph vertices are given and evaluated. It is shown that the graph-based approach produces observational likelihoods that are more accurate and have significantly fewer local maxima than the edge measurement schemes previously used in tracking problems. While this approach is computationally more expensive than other methods, it is argued that this is offset by the reduced computational expense of the global search procedure used in tracking algorithms. The choice of state space used in tracking problems is examined next. While most authors have used a state space based on the joint angles of the human body, a Cartesian state space based on the world coordinates of limbs is proposed. While Cartesian-based state spaces have been used by other authors for representations of kinematic models, to the author’s knowledge they have not been used for full kinematic models. It is shown that the more linear relationship between state variables in the Cartesian space and the 3D locations of sampled points on the object improves dynamic model predictions and principal component analysis. It is also shown that the Cartesian formulation increases the linearity between state variables and the image coordinates of sampled points on the object. This in turn improves the performance of local optimization methods which make localized quadratic approximations to the measurement function. While the Cartesian-based space has a higher dimensionality than the rotation-based space, the geometrically plausible region of the Cartesian space has the same content (area) as the rotation space, which negates the well-known ‘curse of dimensionality’.
A simple method is given to project an implausible Cartesian state to a geometrically plausible state, as well as a method to dampen the measurement function curvature in these implausible directions. Following this, a novel local optimization method is proposed. This optimization method is specific to visual tracking problems, and uses the camera geometry to infer interesting search directions. Treatments for choosing these search directions are given for both the monocular and two-camera cases. A problem decomposition is also used to reduce the computational cost of the optimizer. This method is shown to outperform Newton-based optimization in a rotation-based state space, and gives at worst equivalent results to a Newton-based approach in a Cartesian state space, but at a significantly reduced computational cost. Finally, tracking results are presented for a difficult image sequence using the combined ideas presented in this thesis. The sequence shows a golfer performing a golf swing, which is a highly dynamic motion with large object velocities and accelerations.
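
The abstract does not say how an implausible Cartesian state is projected back to a plausible one. As a rough illustration only, the sketch below assumes that "geometrically plausible" means every limb segment keeps a fixed length, and it restores those lengths by walking a single joint chain from the root and rescaling each segment along its current direction. The function names, the chain layout, and the rescaling rule are all hypothetical and are not taken from the thesis; this heuristic is also not guaranteed to return the nearest plausible state.

```python
import math


def project_to_plausible(joints, lengths):
    """Rescale every segment of a joint chain to its fixed limb length.

    joints  -- (x, y, z) world coordinates ordered from root to tip
               (a hypothetical state layout, not the thesis's)
    lengths -- required length of each segment, one per consecutive joint pair
    """
    if len(lengths) != len(joints) - 1:
        raise ValueError("need exactly one length per segment")

    projected = [tuple(float(c) for c in joints[0])]  # the root stays fixed
    for i, length in enumerate(lengths):
        # Direction of the segment in the original (possibly implausible) state.
        direction = [b - a for a, b in zip(joints[i], joints[i + 1])]
        norm = math.sqrt(sum(d * d for d in direction))
        if norm == 0.0:
            raise ValueError("degenerate segment: direction is undefined")
        # Re-attach the child to the already projected parent at the fixed length.
        parent = projected[-1]
        projected.append(
            tuple(p + length * d / norm for p, d in zip(parent, direction))
        )
    return projected


def segment_lengths(joints):
    """Euclidean length of each consecutive joint pair."""
    return [math.dist(a, b) for a, b in zip(joints, joints[1:])]


# Usage: a three-joint "arm" whose segments are both too long (assumed data).
arm = [(0.0, 0.0, 0.0), (0.0, 0.5, 0.0), (0.3, 0.5, 0.4)]
fixed = project_to_plausible(arm, [0.3, 0.25])
print(segment_lengths(fixed))
```

Keeping each segment's direction and changing only its length preserves limb orientations, which is one simple design choice among several; a least-squares projection onto the constraint set would be an alternative.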
