Abstract

Multi-object tracking (MOT) is a crucial technology for security surveillance. It is computationally intensive because, in practice, a large number of video streams must be processed with low latency. The input video streams are typically processed in a cloud computing center with abundant computational capability, which places heavy pressure on the network that delivers the streams to the cloud. Recent advances in Internet of Things (IoT) technology provide edge-computing-based solutions for video analytics at scale. However, a significant gap remains between the computational demands of MOT and the resource-constrained nature of IoT devices. In this paper, we propose a resource-efficient multi-object tracking method (REMOT) for real-time surveillance on IoT embedded devices. REMOT comprises an affinity measurement based on an appearance model with angular triplet loss and a motion association that replaces the time-consuming graph-based data association stage. To balance latency against accuracy, we design an optimization strategy that processes the layers of the deep learning models in parallel, accelerating inference with little accuracy loss. In addition, we employ a model compression strategy to reduce model size. Experiments on the MOT16 and MOT17 benchmarks demonstrate that REMOT reduces latency by a factor of 2.4 compared with the original implementation and runs at 81 frames per second (fps) on an embedded device with only a marginal accuracy loss (6%), which meets the requirements of real-time processing and low-latency response for surveillance.
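
The abstract does not state the exact form of the angular triplet loss. As an illustration only, the sketch below assumes one common variant: a triplet hinge loss in which the distance between embeddings is the angle between them rather than the Euclidean distance. The function names, the `margin` value, and the plain-list embedding format are hypothetical and are not taken from REMOT.

```python
import math


def angle_between(u, v):
    """Return the angle in radians between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.acos(cosine)


def angular_triplet_loss(anchor, positive, negative, margin=0.5):
    """Hinge loss on angles (assumed form, not the paper's definition).

    The loss is zero once the anchor-negative angle exceeds the
    anchor-positive angle by at least `margin` radians.
    """
    theta_pos = angle_between(anchor, positive)
    theta_neg = angle_between(anchor, negative)
    return max(0.0, theta_pos - theta_neg + margin)


# Same identity is aligned with the anchor, other identity is orthogonal.
well_separated = angular_triplet_loss([1.0, 0.0], [1.0, 0.0], [0.0, 1.0])
# Roles swapped: the loss becomes positive and penalizes the embedding.
badly_separated = angular_triplet_loss([1.0, 0.0], [0.0, 1.0], [1.0, 0.0])
```

Under this assumed form, the loss depends only on the directions of the embeddings, not on their magnitudes, so appearance features can be compared by cosine similarity at tracking time.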
