Abstract

This paper proposes a modular deep neural network architecture for real-time video analytics in edge-computing environments. The architecture consists of two serially connected networks, a Front-CNN (Convolutional Neural Network) and a Back-CNN, where we adopt a Shallow 3D CNN (S3D) as the Front-CNN and a pre-trained 2D CNN as the Back-CNN. The S3D (i.e., the Front-CNN) condenses a sequence of video frames into a feature map with three channels: it takes a set of sequential frames from a video shot as input and yields a learned 3-channel feature map (3CFM) as output. Since the 3CFM is compatible with the three-channel RGB color image format, the output of the S3D can serve as the input to the pre-trained 2D CNN of the Back-CNN for transfer learning. This serial Front-CNN/Back-CNN architecture is end-to-end trainable and learns both the spatial and the temporal information of videos. Experimental results on the public UCF-Crime and UR Fall Detection datasets show that the proposed S3D-2DCNN model outperforms existing methods and achieves state-of-the-art performance. Moreover, since the Front-CNN and Back-CNN modules are a shallow S3D and a lightweight 2D CNN, respectively, the model is suitable for real-time video recognition in edge-computing environments. We have implemented our CNN model on an NVIDIA Jetson Nano Developer Kit as an edge-computing device to demonstrate its real-time execution.
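The following is a minimal PyTorch sketch of the Front-CNN/Back-CNN pipeline described above. The exact S3D depth, kernel sizes, the temporal-pooling scheme, and the choice of MobileNetV2 as the lightweight pre-trained Back-CNN are assumptions for illustration, not the paper's reported configuration.

```python
# Sketch of the S3D-2DCNN serial architecture (assumed layer sizes).
import torch
import torch.nn as nn
from torchvision.models import mobilenet_v2

class S3DFront(nn.Module):
    """Shallow 3D CNN: condenses a T-frame clip into a 3-channel feature map (3CFM)."""
    def __init__(self, in_channels=3):
        super().__init__()
        self.conv3d = nn.Sequential(
            nn.Conv3d(in_channels, 16, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv3d(16, 3, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )
        # Collapse the temporal axis so the output matches an RGB image layout.
        self.temporal_pool = nn.AdaptiveAvgPool3d((1, None, None))

    def forward(self, x):            # x: (B, C, T, H, W)
        x = self.conv3d(x)           # (B, 3, T, H, W)
        x = self.temporal_pool(x)    # (B, 3, 1, H, W)
        return x.squeeze(2)          # 3CFM: (B, 3, H, W), RGB-compatible

class S3D2DCNN(nn.Module):
    """Serial Front-CNN -> Back-CNN model; end-to-end trainable."""
    def __init__(self, num_classes=2):
        super().__init__()
        self.front = S3DFront()
        backbone = mobilenet_v2(weights="IMAGENET1K_V1")  # pre-trained Back-CNN
        backbone.classifier[1] = nn.Linear(backbone.last_channel, num_classes)
        self.back = backbone

    def forward(self, clip):         # clip: (B, 3, T, H, W)
        return self.back(self.front(clip))

model = S3D2DCNN(num_classes=2)
logits = model(torch.randn(1, 3, 16, 224, 224))  # one 16-frame RGB clip
print(logits.shape)                              # torch.Size([1, 2])
```

Because the 3CFM has the same shape as an RGB image, gradients flow through the Back-CNN into the Front-CNN, which is what makes the serial connection end-to-end trainable.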

Highlights

  • Surveillance cameras have been increasingly deployed in public places for the purpose of monitoring abnormal events such as criminal activities and medical emergencies [1], [2].

  • Traditional anomaly detection mainly relied on the motion information between two consecutive frames, extracted by optical flow [4] or a dynamic Bayesian network (DBN) [5].

  • The Convolutional 3D (C3D) network learns temporal motion as well as spatial features from video frames. This requires the C3D to execute complex 3D convolutions with kernels in ℝ^{c×d×d×T}, where c is the number of channels, d is the spatial size (i.e., d × d) of the filter, and T is the number of frames in the video clip; a short parameter-count check follows this list.
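The snippet below illustrates the dimensionality point in the last highlight: a C3D-style kernel carries a factor of T more weights than a 2D kernel over the same channels and spatial extent. The values of c, d, and T are illustrative only; PyTorch orders the 3D kernel as (out_channels, c, T, d, d).

```python
# Parameter-count comparison of a 3D (C3D-style) vs. 2D convolution (illustrative sizes).
import torch.nn as nn

c, d, T = 3, 3, 16
conv3d = nn.Conv3d(c, 64, kernel_size=(T, d, d))  # kernel in R^{c x d x d x T}
conv2d = nn.Conv2d(c, 64, kernel_size=(d, d))

p3d = sum(p.numel() for p in conv3d.parameters())
p2d = sum(p.numel() for p in conv2d.parameters())
print(p3d, p2d, p3d / p2d)  # roughly a factor of T more weights per filter
```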


Summary

Introduction

Surveillance cameras have been increasingly deployed in public places for the purpose of monitoring abnormal events such as criminal activities and medical emergencies [1], [2]. The C3D learns temporal motion as well as spatial features from video frames. A pre-trained CNN is fine-tuned on 3 grayscale frames subsampled from a video shot; the SG3Is formed from the training videos are used to fine-tune the pre-trained 2D CNN to learn the motion. Many algorithms [4], [5], [13]–[15] have been developed to handle vast amounts of data automatically; these algorithms can be used for video recognition on a cloud server. Violence detection [16] was performed by transmitting video data obtained from a drone camera to a cloud server. By transmitting road video obtained from a camera to a cloud server, the license plate of a vehicle was extracted [17].
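The following is a minimal sketch of forming an SG3I as described above: three grayscale frames subsampled from a shot are stacked as the R, G, and B channels of a single image, so a pre-trained 2D CNN can be fine-tuned on motion cues. The first/middle/last subsampling rule is an assumption for illustration, not necessarily the cited method's rule.

```python
# SG3I construction sketch: stack 3 subsampled grayscale frames into one 3-channel image.
import numpy as np

def make_sg3i(frames: np.ndarray) -> np.ndarray:
    """frames: (T, H, W) grayscale video shot -> (H, W, 3) stacked image."""
    t = len(frames)
    idx = [0, t // 2, t - 1]  # assumed subsampling: first, middle, last frame
    return np.stack([frames[i] for i in idx], axis=-1)

shot = np.random.randint(0, 256, size=(30, 224, 224), dtype=np.uint8)
sg3i = make_sg3i(shot)
print(sg3i.shape)  # (224, 224, 3), compatible with RGB-input CNNs
```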


