Detection and tracking of humans from an airborne platform

Judith Dijk, Adam W M Van Eekeren, Gertjan Burghouts

doi:10.1117/12.2067568

Abstract

Airborne platforms record large amounts of video data. Extracting the events of interest is a time-consuming task for analysts, because the sensors record hours of video in which only a fraction of the footage contains such events, and retrieving them by hand is difficult. One way to extract information from the data more automatically is to detect all humans within the scene. This can be done in a real-time scenario (both on board and at the ground station) for strategic and tactical purposes, and in an offline scenario where the information is analyzed after recording to acquire intelligence (e.g. a daily life pattern). In this paper, we evaluate three different methods for object detection from a moving airborne platform. The first is a static person detection algorithm. Its main advantage is that it works on single frames and therefore does not depend on the stabilization of the platform. Its main disadvantage is that it needs a relatively large number of pixels on the person for detection. The second method is based on detection of motion-in-motion: the background is stabilized, and clusters of pixels that move with respect to this stabilized background are detected as moving objects. Its main advantage is that all moving objects are detected; its main disadvantage is that it depends heavily on the quality of the stabilization. The third method combines the two previous detection methods. The detections are tracked using a histogram-based tracker, so that missed detections can be filled in and a trajectory can be determined for every object. We demonstrate the tracking performance using the three detection methods on the publicly available UCF-ARG aerial dataset. The performance is evaluated for two human actions (running and digging) and varying object sizes.
It is shown that a combined detection approach (static person detection and motion-in-motion detection) gives better tracking results for both human actions than either detector alone. Furthermore, it can be concluded that humans must be at least 20 pixels high to guarantee good tracking performance.
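
As an illustration of the combined approach, the sketch below merges the outputs of two detectors and compares appearance histograms. The abstract does not specify how the detections are combined or which histogram similarity the tracker uses, so everything here is an assumption: the function names, the (x, y, w, h) box format, the overlap threshold of 0.5, and the use of histogram intersection are hypothetical choices made only for this example.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x, y, w, h)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    ix = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    iy = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = ix * iy
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def combine_detections(static_boxes, motion_boxes, iou_threshold=0.5):
    """Keep all static detections and add motion detections that do not
    overlap any static detection (assumed merge rule, not from the paper)."""
    combined = list(static_boxes)
    for m in motion_boxes:
        if all(iou(m, s) < iou_threshold for s in static_boxes):
            combined.append(m)
    return combined


def histogram_intersection(h1, h2):
    """Similarity of two normalized histograms; 1.0 means identical."""
    return sum(min(p, q) for p, q in zip(h1, h2))


# Hypothetical detections for one frame.
static = [(10, 10, 10, 20)]
motion = [(11, 10, 10, 20), (50, 50, 8, 20)]
combined = combine_detections(static, motion)
```

In this sketch the first motion box overlaps the static detection and is treated as a duplicate, while the second is kept as an additional moving object. A tracker could then use `histogram_intersection` to associate each detection with the existing track whose appearance histogram is most similar.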

Full Text