Collaborative Track Analysis, Data Cleansing, and Labeling

George Kamberov, Lazaros Karydas, Matt Burlick, Gerda Kamberova, Bart Luczynski

doi:10.1007/978-3-642-24028-7_66

Abstract

Tracking output is an attractive source of labeled data sets that, in turn, can be used to train other systems for tracking, detection, recognition, and categorization. In this context, long tracking sequences are of particular importance because they provide richer information, multiple views, and a wider range of appearances. This paper addresses two obstacles to the use of tracking data for training: noise in the tracking data and the unreliability and slow pace of hand labeling. The paper introduces a criterion for detecting inconsistencies (noise) in large data collections and a method for selecting typical representatives of consistent collections. These are used to build a pipeline that cleanses the tracking data and employs instantaneous (shotgun) labeling of vast numbers of images. The shotgun-labeled data is shown to compare favorably with hand-labeled data when used in classification tasks. The framework is collaborative: it involves a human in the loop but is designed to minimize the burden on the human.

Full Text