Abstract

Artificial intelligence (AI), combined with the Internet of Things (IoT), plays a beneficial role in various fields, including intelligent surveillance applications. With advances in IoT and 5G, intelligent sensors and devices in surveillance environments collect large amounts of data in the form of videos and images. These data require intelligent information processing solutions that help analyze the recorded videos and images to detect and identify objects in the scene, particularly humans. In this study, an automated human detection system is presented for a complex industrial environment, in which people are monitored and detected from a top-view perspective. A top view is usually preferred because it provides sufficient coverage and visibility of a scene. This study demonstrates the applications, efficiency, and effectiveness of three deep learning architectures, namely Faster Region Convolutional Neural Network (Faster R-CNN), Single Shot MultiBox Detector (SSD), and You Only Look Once (YOLOv3), with transfer learning. Experimental results reveal that with additional training and transfer learning, the performance of all three detection architectures improves significantly. The detection results are also compared on the same data set. The deep learning architectures achieve promising results, with maximum true-positive rates of 93%, 94%, and 94% for Faster R-CNN, SSD, and YOLOv3, respectively. Furthermore, a detailed analysis of the output results highlights challenges and probable future trends.
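
To make the reported metric concrete, the following is a minimal sketch of how a true-positive rate could be computed for one frame of detector output. The abstract does not state the matching rule, so the greedy one-to-one matching, the `(x1, y1, x2, y2)` box format, the 0.5 IoU threshold, and the function names `iou` and `true_positive_rate` are all assumptions made for illustration, not the authors' evaluation procedure.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def true_positive_rate(ground_truth, detections, iou_threshold=0.5):
    """Fraction of ground-truth people matched by some detection.

    Assumption: each detection may match at most one ground-truth box,
    and a match requires IoU >= iou_threshold (0.5 is a common default,
    not a value taken from the paper).
    """
    if not ground_truth:
        return 0.0
    used = set()  # indices of detections already matched
    true_positives = 0
    for gt in ground_truth:
        best_idx, best_score = None, iou_threshold
        for i, det in enumerate(detections):
            if i in used:
                continue
            score = iou(gt, det)
            if score >= best_score:
                best_idx, best_score = i, score
        if best_idx is not None:
            used.add(best_idx)
            true_positives += 1
    return true_positives / len(ground_truth)


# Hypothetical boxes for a single top-view frame: two people, two detections.
people = [(0, 0, 10, 10), (20, 20, 30, 30)]
predicted = [(1, 1, 10, 10), (50, 50, 60, 60)]
rate = true_positive_rate(people, predicted)
```

In this hypothetical frame, one of the two people is covered by a sufficiently overlapping detection and the other is missed. In practice the rate would be aggregated over all frames of the test set rather than computed per frame.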
