Abstract

An essential issue in multiple object tracking (MOT) is how to make an online tracking model adapt effectively, from few examples, to newly appearing objects, to object disappearance, and to appearance variations of target objects. Learning target appearances from few examples is a few-shot classification problem, while identifying newly appearing objects and object disappearance has the character of open-set classification. In this work, we regard online MOT as open-set few-shot classification to address both learning from few examples (few-shot classification) and unknown classes such as new objects (open-set classification). Specifically, we develop an embedding neural network, called VOFNet, consisting of convolutional and recurrent parts, to perform open-set few-shot classification. The convolutional part constructs a feature from an example of a target object, and the recurrent part determines a representative feature of a target object from few examples. VOFNet is then trained to provide effective features for open-set few-shot classification. Finally, we develop an online multiple object tracker based on the combination of VOFNet and bipartite matching. The proposed tracker achieves 49.2 multiple object tracking accuracy (MOTA) at 28.9 frames per second on the MOT17 dataset, a significantly better trade-off between accuracy and speed than existing algorithms. For example, the proposed algorithm runs about 3.17 times faster than a recent MOT algorithm [1] while reaching about 0.99 times its accuracy.

Highlights

  • Nowadays, many applications including self-driving vehicles [2], surveillance systems [3], and crowd analysis [4] require various video processing technologies such as person re-identification [5], video segmentation [6], [7] and efficient feature processing [8]

  • We introduce the concept of open-set few-shot classification

  • We introduce the notion of open-set few-shot classification to formulate online multiple object tracking (MOT), which has the properties of both open-set classification and few-shot classification

Summary

INTRODUCTION

Many applications, including self-driving vehicles [2], surveillance systems [3], and crowd analysis [4], require various video processing technologies such as person re-identification [5], video segmentation [6], [7], and efficient feature processing [8]. To train the proposed network, we perform open-set classification based on feature distances between representative vectors and detection results. Just as queries are classified using only a few examples in few-shot classification, detection results are assigned object labels by a data association scheme in online MOT. From this perspective, an effective few-shot learning technique can be used for data association, thereby improving tracking performance.
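
The distance-based association described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `associate`, the `threshold` parameter, the use of Euclidean distance, and the toy 2-D feature vectors are all assumptions made for illustration, and a brute-force search stands in for a proper bipartite matching solver (such as the Hungarian algorithm), so it is only practical for a handful of objects. Pairs whose distance exceeds the threshold are rejected, which is one simple way to express the open-set aspect: unmatched detections are treated as new objects and unmatched tracks as disappeared ones.

```python
from itertools import permutations
from math import dist


def associate(track_feats, det_feats, threshold):
    """Toy open-set association (illustrative assumption, not the paper's code).

    track_feats: representative feature vector per tracked object
    det_feats:   feature vector per detection in the current frame
    threshold:   hypothetical maximum distance for accepting a match
    Returns (matches, lost_tracks, new_detections) as index lists.
    """
    n_t, n_d = len(track_feats), len(det_feats)
    # cost[t][d]: Euclidean distance between track t and detection d
    cost = [[dist(t, d) for d in det_feats] for t in track_feats]

    # Brute-force minimum-cost one-to-one assignment (small inputs only).
    best_pairs, best_cost = [], float("inf")
    if n_t <= n_d:
        candidates = (list(enumerate(p)) for p in permutations(range(n_d), n_t))
    else:
        candidates = ([(t, d) for d, t in enumerate(p)]
                      for p in permutations(range(n_t), n_d))
    for pairs in candidates:
        total = sum(cost[t][d] for t, d in pairs)
        if total < best_cost:
            best_pairs, best_cost = pairs, total

    # Open-set step: reject assigned pairs that are too far apart.
    matches = sorted((t, d) for t, d in best_pairs if cost[t][d] <= threshold)
    matched_t = {t for t, _ in matches}
    matched_d = {d for _, d in matches}
    lost_tracks = [t for t in range(n_t) if t not in matched_t]
    new_detections = [d for d in range(n_d) if d not in matched_d]
    return matches, lost_tracks, new_detections


if __name__ == "__main__":
    tracks = [(0.0, 0.0), (1.0, 1.0)]
    detections = [(0.9, 1.1), (0.1, 0.0), (5.0, 5.0)]
    print(associate(tracks, detections, threshold=0.5))
    # -> ([(0, 1), (1, 0)], [], [2])
```

In this toy run, the third detection is far from both representative vectors, so it is left unmatched and would start a new track. In a real tracker the feature vectors would come from the embedding network rather than being hand-written.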

ONLINE DATA ASSOCIATION
EXPERIMENTS
QUALITATIVE MOT RESULTS
Findings
CONCLUSION