Abstract

As image object detection techniques advance, video object detection is attracting more attention than ever. However, detection performance on video degrades considerably because object occlusion often deteriorates the object's appearance. To address this problem, the temporal correlation of video sequences is often used to reduce the effects of occlusion. This paper proposes a video object detection method based on optical-flow feature fusion that takes the temporal coherence among video frames into account. To further reduce computational complexity, it also proposes a packet video processing method. Specifically, video frames are first grouped, and all frames in the current group share the same optical flow feature map, obtained by feature fusion. The Target Image, in which object information is enhanced, is then formed by fusing the shared feature map with the current frame. The proposed method effectively masks background information, so that the object detection network can focus more on the foreground object. This improves both object detection performance and scene migration performance. Experimental results show that the proposed method raises detection accuracy to 78.6% on ImageNet VID. In addition, VGG and ResNet are compared to further verify the effectiveness of the proposed method; the results show the highest detection accuracy with acceptable time consumption.
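
The grouping and fusion steps described above can be illustrated with a minimal sketch. It is not the paper's implementation: absolute frame differencing stands in for the optical flow feature map, a simple per-pixel weighting stands in for the feature fusion, and the names `group_frames`, `shared_motion_map`, `target_image`, and the `floor` parameter are all hypothetical.

```python
from typing import List

Frame = List[List[float]]  # grayscale image stored as rows of pixel intensities


def group_frames(frames: list, group_size: int) -> List[list]:
    """Split the frame sequence into consecutive groups ("packets")."""
    return [frames[i:i + group_size] for i in range(0, len(frames), group_size)]


def motion_map(prev: Frame, curr: Frame) -> Frame:
    """Assumed stand-in for optical flow: absolute per-pixel frame difference."""
    return [[abs(c - p) for p, c in zip(prow, crow)]
            for prow, crow in zip(prev, curr)]


def shared_motion_map(group: List[Frame]) -> Frame:
    """Fuse the motion maps of one group into a single map scaled to [0, 1]."""
    height, width = len(group[0]), len(group[0][0])
    fused = [[0.0] * width for _ in range(height)]
    for prev, curr in zip(group, group[1:]):
        for y, row in enumerate(motion_map(prev, curr)):
            for x, value in enumerate(row):
                fused[y][x] += value
    peak = max(max(row) for row in fused)
    if peak == 0:  # static group or single frame: no motion evidence
        return fused
    return [[value / peak for value in row] for row in fused]


def target_image(frame: Frame, shared: Frame, floor: float = 0.2) -> Frame:
    """Attenuate low-motion (background) pixels and keep high-motion ones."""
    return [[pixel * (floor + (1.0 - floor) * weight)
             for pixel, weight in zip(frow, srow)]
            for frow, srow in zip(frame, shared)]


if __name__ == "__main__":
    # A bright object (100) moving over a dark background (10).
    f0 = [[100.0, 10.0], [10.0, 10.0]]
    f1 = [[10.0, 100.0], [10.0, 10.0]]
    f2 = [[10.0, 10.0], [10.0, 100.0]]
    for group in group_frames([f0, f1, f2], group_size=3):
        shared = shared_motion_map(group)  # computed once per group
        for frame in group:
            print(target_image(frame, shared))
```

The shared map is computed once per group and reused for every frame in it, which is where the computational saving of the grouped processing comes from in this simplified setting.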
