Visual Feature Learning on Video Object and Human Action Detection: A Systematic Review.

Dengshan Li, Peng Chen, Qiong Zhou, Chengjun Xie, Xiufang Jia, Rujing Wang

doi:10.3390/mi13010072

Abstract

Video object and human action detection are applied in many fields, such as video surveillance and face recognition. Video object detection comprises object classification and object localization within the frame. Human action recognition is the detection of human actions. Video detection is usually more challenging than image detection, since video frames are often blurrier than still images. Video detection also faces other difficulties, such as video defocus, motion blur, and partial occlusion. Current video detection technology can achieve real-time detection, or high-accuracy detection on blurry video frames. In this paper, various video object and human action detection approaches are reviewed and discussed, many of which have achieved state-of-the-art results. We mainly review and discuss classic video detection methods based on supervised learning. In addition, frequently used video object detection and human action recognition datasets are reviewed. Finally, a summary of video detection is presented. For example, video object and human action detection methods can be classified into frame-by-frame (frame-based) detection, key-frame-extraction detection, and temporal-information-based detection; the main methods for utilizing the temporal information of adjacent video frames are the optical flow method, Long Short-Term Memory (LSTM), and convolution among adjacent frames.

Highlights

Video object detection and human action recognition are applied to various scenarios, such as the recognition of vehicle plate numbers in traffic monitoring systems, the detection of dangerous vehicle behaviors, the detection of running red lights, the detection of abnormal production behaviors in industrial production, and the identification of abnormal passenger behaviors at stations and airports.
The approach achieves state-of-the-art results in video frame synthesis.
There are three approaches to video detection: (A) the first is to detect each frame; some algorithms, such as You Only Look Once (YOLO), achieve very fast detection speeds; (B) the second is to extract key frames, in which case detection depends on the key-frame extraction algorithm; (C) the third is to use a Long Short-Term Memory (LSTM) structure or the optical flow method to extract the temporal information among adjacent frames.
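
The three approaches can be illustrated with a minimal sketch in plain Python. It is not taken from the reviewed paper. Every name in it (`detect_every_frame`, `select_key_frames`, `smooth_over_time`, `toy_detector`) is hypothetical, the "detector" is a stand-in that returns mean brightness, key frames are chosen by a simple frame-difference threshold (one of many possible criteria), and approach (C) is represented by a moving average over adjacent frames rather than an LSTM or optical flow.

```python
from typing import Callable, List, Sequence

# A grayscale frame as rows of pixel intensities (assumption for this sketch).
Frame = List[List[float]]


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Mean absolute pixel difference between two equally sized frames."""
    total = 0.0
    count = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            total += abs(pa - pb)
            count += 1
    return total / count if count else 0.0


def detect_every_frame(frames: Sequence[Frame],
                       detector: Callable[[Frame], float]) -> List[float]:
    """(A) Frame-by-frame: run the detector on each frame independently."""
    return [detector(frame) for frame in frames]


def select_key_frames(frames: Sequence[Frame], threshold: float) -> List[int]:
    """(B) Key frames: keep a frame when it differs enough from the last key frame."""
    if not frames:
        return []
    keys = [0]  # the first frame is always treated as a key frame
    for i in range(1, len(frames)):
        if mean_abs_diff(frames[keys[-1]], frames[i]) > threshold:
            keys.append(i)
    return keys


def smooth_over_time(scores: Sequence[float], radius: int = 1) -> List[float]:
    """(C) Temporal information: average each score with its adjacent frames."""
    smoothed = []
    for i in range(len(scores)):
        lo = max(0, i - radius)
        hi = min(len(scores), i + radius + 1)
        window = scores[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed


def toy_detector(frame: Frame) -> float:
    """Hypothetical stand-in for a real detector: mean brightness as a score."""
    values = [p for row in frame for p in row]
    return sum(values) / len(values)


# Four tiny 2x2 frames: two dark frames followed by two bright frames.
frames = [
    [[0.0, 0.0], [0.0, 0.0]],
    [[0.0, 0.0], [0.0, 0.0]],
    [[1.0, 1.0], [1.0, 1.0]],
    [[1.0, 1.0], [1.0, 1.0]],
]

scores = detect_every_frame(frames, toy_detector)   # one score per frame
key_ids = select_key_frames(frames, threshold=0.5)  # indices of key frames
smoothed = smooth_over_time(scores, radius=1)       # scores blended over time
```

In this toy input the scene changes once, so only the first frame and the first bright frame are selected as key frames, while the smoothed scores ramp gradually across the change. A real system would replace `toy_detector` with a trained model and the moving average with a learned temporal module.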

Summary

Introduction

Video object detection and human action recognition are applied to various scenarios, such as the recognition of vehicle plate numbers in traffic monitoring systems, the detection of dangerous vehicle behaviors, the detection of running red lights, the detection of abnormal production behaviors in industrial production, and the identification of abnormal passenger behaviors at stations and airports. The difficulties of video detection include video defocus, motion blur, and partial occlusion. Video defocus arises while the camera is focusing, and motion blur arises when the object moves. Occlusion between objects may cause partial occlusion. The apparent shape of objects in the video may also change with their distance from the camera. Compared with image detection, video detection is therefore expected to be more challenging.


Journal: Micromachines
Publication Date: Dec 31, 2021
Citations: 8
License type: CC BY 4.0
