Abstract

Multi-point tracking is a challenging task that involves detecting points in the scene and tracking them across time. Here, metrics from multi-object tracking (MOT) methods are shown to perform better than frame-based F-measures. The recently proposed HOTA metric, used for benchmarks such as the KITTI dataset, evaluates tracking performance better than metrics such as MOTA, DetA, and IDF1. While HOTA takes temporal associations into account, it does not provide a tailored means of analysing the spatial associations of a dataset in a multi-camera setup. Moreover, the detection task is evaluated differently for points than for objects (point distances vs. bounding box overlap). Therefore, we propose a multi-view higher-order tracking metric, mvHOTA, to determine the accuracy of multi-point (multi-instance and multi-class) tracking methods while taking both temporal and spatial associations into account. We demonstrate its use in evaluating tracking performance on an endoscopic point detection dataset from a previously organised surgical data science challenge. Furthermore, we compare mvHOTA with other adjusted MOT metrics for this use case, discuss its properties, and show how the proposed multi-view Association and the Occlusion Index (OI) facilitate the analysis of methods with respect to their handling of occlusions. The code is available at https://github.com/Cardio-AI/mvhota.
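
The distinction between point distances and bounding box overlap can be illustrated with a minimal Python sketch. It is not taken from the mvHOTA code base: the function names `match_points` and `box_iou`, the greedy nearest-first pairing, and the pixel threshold `max_dist` are all assumptions chosen for illustration only.

```python
import math


def match_points(pred, gt, max_dist):
    """Greedily pair predicted and ground-truth points by Euclidean distance.

    pred, gt: lists of (x, y) tuples. max_dist is a hypothetical distance
    threshold (e.g. in pixels) below which a pair counts as a match.
    Returns (true_positives, false_positives, false_negatives).
    """
    # Collect every candidate pair within the threshold, closest first.
    pairs = sorted(
        (math.dist(p, g), i, j)
        for i, p in enumerate(pred)
        for j, g in enumerate(gt)
        if math.dist(p, g) <= max_dist
    )
    used_pred, used_gt = set(), set()
    for _, i, j in pairs:
        # Each prediction and each ground-truth point is matched at most once.
        if i not in used_pred and j not in used_gt:
            used_pred.add(i)
            used_gt.add(j)
    tp = len(used_pred)
    return tp, len(pred) - tp, len(gt) - tp


def box_iou(a, b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0
```

For points, similarity is judged by a distance that must fall below a threshold, whereas for objects it is judged by an overlap ratio that must exceed one. A production evaluation would more likely use an optimal assignment rather than the greedy pairing assumed here.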
