Abstract

Multi-object tracking in video surveillance is subject to illumination variation, blurring, motion, and appearance similarity during identification in real-world practice. Previously proposed approaches have difficulty learning object appearances and distinguishing objects among multiple detections. They rely heavily on local features and tend to lose vital global structural features such as contours, which contributes to their inability to accurately detect, classify, or distinguish fooling images. In this paper, we propose a paradigm aimed at eliminating these tracking difficulties by enhancing the detection quality rate through the combination of a convolutional neural network (CNN) and a histogram of oriented gradients (HOG) descriptor. We trained the algorithm on input images of size 120 × 32, cleaned and converted to binary to reduce the number of false positives. In testing, we eliminated the background from each frame and applied morphological operations and a Laplacian of Gaussian (LoG) mixture model after blob extraction. The images then underwent feature extraction and computation with the HOG descriptor to capture the structural information of the objects in the captured video frames. We stored the appearance features in an array and passed them into the CNN for further processing. We applied and evaluated our algorithm for real-time multiple-object tracking on various city streets using the EPFL multi-camera pedestrian datasets. The experimental results illustrate that our proposed technique improves the detection rate and data association. Our algorithm outperformed online state-of-the-art approaches, recording the highest precision and specificity rates.
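The HOG stage above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it computes per-cell unsigned-orientation histograms for a 120 × 32 grayscale window and omits the block normalization used in the full HOG descriptor; the 8 × 8 cell size and 9 orientation bins are illustrative assumptions.

```python
import numpy as np

def hog_features(img, cell=8, bins=9):
    """Simplified HOG: weighted orientation histogram per cell (no block normalization)."""
    img = img.astype(np.float64)
    # Centered finite-difference gradients in x and y
    gx = np.zeros_like(img)
    gy = np.zeros_like(img)
    gx[:, 1:-1] = img[:, 2:] - img[:, :-2]
    gy[1:-1, :] = img[2:, :] - img[:-2, :]
    mag = np.hypot(gx, gy)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0  # unsigned orientation in [0, 180)
    h, w = img.shape
    feats = []
    for r in range(0, h - cell + 1, cell):
        for c in range(0, w - cell + 1, cell):
            a = ang[r:r + cell, c:c + cell].ravel()
            m = mag[r:r + cell, c:c + cell].ravel()
            # Histogram of orientations, weighted by gradient magnitude
            hist, _ = np.histogram(a, bins=bins, range=(0.0, 180.0), weights=m)
            feats.append(hist)
    return np.concatenate(feats)

# A 120 x 32 window, as in the training setup described above
window = np.random.default_rng(0).integers(0, 256, size=(120, 32))
desc = hog_features(window)
print(desc.shape)  # (540,) — 15 x 4 cells, 9 bins each
```

The resulting fixed-length vector is the kind of appearance feature that can be stored in an array and passed on to the CNN stage.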

Highlights

  • The visualization and tracking of multiple objects in surveillance applications are enormously dominating topics in computer vision’s security field

  • We propose to build a new model by combining histogram of oriented gradients (HOG) descriptors with a traditional convolutional neural network (CNN) to form an HCNN algorithm for tracking multiple objects across non-overlapping cameras

  • The algorithm has proven to be effective with high performance in precision and recall, accompanied by the high confidence values on the campus scene dataset


Introduction

The visualization and tracking of multiple objects in surveillance applications are dominant topics in the security field of computer vision. Existing systems continue to struggle to identify shape and boundary characteristics in captured images [7]. This contributes to their inability to maintain detection accuracy under illumination changes, appearance distortion, deformation, and motion blur. Other studies have tried to eliminate this grey area by exploiting the HOG descriptor technique and recorded satisfactory results, but suffered from slow speed and poor classification of large sample sets during the training phase [9]. We therefore combine the HOG descriptor with a CNN so that both contour and global features are effectively incorporated into the neural network, yielding a more human-like system. This paper is arranged into six sections: Section 1 introduces the background, Section 2 details the related work, Section 3 describes our approach, Section 4 presents experimental results, Section 5 discusses the interpretation of the results and comparison with state-of-the-art algorithms, and Section 6 concludes the paper.

Related Works
Proposed HCNN for Real-Time MOT
Background Segmenting Modeling
Foreground Blobs Windowing Modeling
HOG Descriptor’s Features Extraction
Structure of the Convolutional Neural Network
Designing
Designing for Our HCNN
Experimental Setup
Comparison
Results
Training and Validation Losses and Precision per Epoch
Benchmark Evaluation Results
Method
Conclusions