Abstract

This paper proposes a method that exploits temporal context through an attention mechanism to detect objects in real time in live-streaming video. Video object detection is challenging yet essential in practical applications such as robotics, smartphones, and surveillance cameras. Although existing methods exploit temporal information to improve either accuracy or run-time speed, the trade-off between the two tends to be ignored. We therefore focus on this trade-off and propose a method that improves accuracy by using an attention mechanism to aggregate past information from a lightweight feature extractor. Evaluations on the UA-DETRAC and ImageNet VID datasets show that our model outperforms state-of-the-art methods for real-time object detection on live-streaming video.
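
To make the idea of attention-based aggregation of past information concrete, the following is a minimal sketch, not the architecture from the paper. It assumes each frame has already been reduced to a single feature vector by some lightweight extractor, uses plain scaled dot-product attention with the current frame as the query, and adds the attended context back as a residual. The names `attention_weights`, `aggregate`, and `memory`, and the memory size of 2, are hypothetical choices made for illustration.

```python
import math
from collections import deque


def attention_weights(query, keys):
    """Softmax over scaled dot products between the query and each key."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate(current, past):
    """Fuse the current feature vector with an attention-weighted sum of past ones.

    Assumption: a residual sum is used for fusion; the paper may fuse differently.
    """
    if not past:
        return list(current)  # first frame: nothing to attend to yet
    weights = attention_weights(current, past)
    context = [
        sum(w * feat[i] for w, feat in zip(weights, past))
        for i in range(len(current))
    ]
    return [x + c for x, c in zip(current, context)]


# Streaming usage: only past frames are visible, and memory is bounded,
# so the per-frame cost stays constant as the video runs.
frames = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]  # toy per-frame features
memory = deque(maxlen=2)  # hypothetical memory size
for feat in frames:
    fused = aggregate(feat, list(memory))
    memory.append(feat)
```

Bounding the memory is one simple way to keep the cost per frame fixed; how much history to keep is part of the accuracy and speed trade-off the abstract refers to.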
