REAL-TIME DEEP NEURAL NETWORKS FOR MULTIPLE OBJECT TRACKING AND SEGMENTATION ON MONOCULAR VIDEO

I. Basharov, D. Yudin

doi:10.5194/isprs-archives-xliv-2-w1-2021-15-2021

Abstract

This paper addresses multiple object tracking and segmentation on monocular video captured by the camera of an unmanned ground vehicle. The authors investigate various deep neural network architectures for this task, with special attention to models that provide real-time inference. The proposed approach combines the SOLOv2 instance segmentation model, a neural network that generates an embedding for each detected object, and a modified Hungarian tracking algorithm. The modification takes into account geometric constraints on the positions of detected objects across the image sequence. The solution develops and improves on the state-of-the-art PointTrack method. Its effectiveness is demonstrated quantitatively and qualitatively on the popular KITTI MOTS dataset, collected with the cameras of a driverless car. The approach was implemented in software, and the formation of a two-dimensional point cloud within each detected image segment was accelerated using NVIDIA CUDA. On an NVIDIA Tesla V100 GPU, the instance segmentation module has a mean processing time of 68 ms per image, and the embedding and tracking module 24 ms. This indicates that the proposed solution is promising for on-board computer vision systems in both unmanned vehicles and various robotic platforms.
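The abstract does not spell out how the Hungarian algorithm is modified, so the following is only a minimal sketch of one plausible reading: pairs of tracks and detections whose centres moved too far between frames are forbidden before a standard minimum-cost assignment is solved. The gate (`max_shift` in pixels), the Euclidean embedding distance, the dictionary layout, and the names `hungarian` and `associate` are all assumptions made for illustration, not the authors' implementation.

```python
import math

BIG = 1e6  # finite stand-in for "forbidden", keeps the potentials finite


def hungarian(cost):
    """Minimum-cost assignment for a square matrix; returns the column per row."""
    n = len(cost)
    u = [0.0] * (n + 1)   # row potentials
    v = [0.0] * (n + 1)   # column potentials
    p = [0] * (n + 1)     # p[j]: row (1-based) currently matched to column j
    way = [0] * (n + 1)   # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [math.inf] * (n + 1)
        used = [False] * (n + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], math.inf, 0
            for j in range(1, n + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(n + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0 != 0:  # flip the matching along the augmenting path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    result = [0] * n
    for j in range(1, n + 1):
        result[p[j] - 1] = j - 1
    return result


def associate(tracks, detections, max_shift=50.0):
    """Match tracks to detections; items are dicts with 'center' and 'emb'."""
    n = max(len(tracks), len(detections))
    # Pad to a square matrix; padding cells stay forbidden.
    cost = [[BIG] * n for _ in range(n)]
    for i, t in enumerate(tracks):
        for j, d in enumerate(detections):
            # Assumed geometric gate: reject implausibly large centre shifts.
            if math.dist(t["center"], d["center"]) <= max_shift:
                cost[i][j] = math.dist(t["emb"], d["emb"])
    return [(i, j) for i, j in enumerate(hungarian(cost))
            if i < len(tracks) and j < len(detections) and cost[i][j] < BIG]


tracks = [{"center": (100, 100), "emb": [0.0, 0.0]},
          {"center": (300, 100), "emb": [1.0, 0.0]}]
detections = [{"center": (305, 102), "emb": [0.1, 0.0]},
              {"center": (104, 98), "emb": [0.2, 0.0]},
              {"center": (600, 300), "emb": [0.5, 0.5]}]
print(associate(tracks, detections))  # -> [(0, 1), (1, 0)]
```

In this toy input, embedding distance alone would pair track 0 with detection 0, but that detection lies about 205 pixels away, so the gate forces the geometrically consistent pairing. Detection 2 is left unmatched and would start a new track.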

Highlights

Multiple object tracking (MOT) is important for a large number of applications.
The approach developed in this paper makes the following contributions:
- improvements to the PointTrack method (Xu et al., 2020): the basic instance segmentation model is replaced with the high-speed SOLOv2 model (Wang et al., 2020), and the model that creates an embedding for each detected object is modified to take the object's category into account;
- a modification of the Hungarian algorithm that takes into account a geometric constraint on the detected objects across an image sequence;
- a software implementation of the approach, including NVIDIA CUDA acceleration of two-dimensional point cloud formation within a detected image segment.
Experiments were performed on a workstation with an Intel Xeon 6154 CPU (32 × 3 GHz) and an NVIDIA Tesla V100 GPU (32 GB).

Summary

Introduction

Multiple object tracking (MOT) is important for a large number of applications. Because 2D image recognition methods are often faster than those operating on 3D point clouds, we chose instance segmentation on monocular video for object tracking. The approach developed in this paper makes the following contributions:
- improvements to the PointTrack method (Xu et al., 2020): the basic instance segmentation model is replaced with the high-speed SOLOv2 model (Wang et al., 2020), and the model that creates an embedding for each detected object is modified to take the object's category into account;
- a modification of the Hungarian algorithm that takes into account a geometric constraint on the detected objects across an image sequence;
- a software implementation of the approach, including NVIDIA CUDA acceleration of two-dimensional point cloud formation within a detected image segment.
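To make "two-dimensional point cloud formation in a found image segment" concrete, here is a small CPU-only sketch in plain Python. It assumes that the cloud is built by sampling foreground pixels of one instance mask and storing each as an offset from the segment centroid together with its colour. The function name `mask_to_point_cloud`, the list-of-lists data layout, and sampling with replacement are illustrative assumptions. The paper's version runs on the GPU with CUDA and may differ in every detail.

```python
import random


def mask_to_point_cloud(mask, image, num_points=8, seed=0):
    """Sample foreground pixels of one instance as (dx, dy, r, g, b) points.

    mask:  2D list of 0/1 values for a single instance (hypothetical layout).
    image: 2D list of (r, g, b) tuples with the same shape as mask.
    """
    # Collect the coordinates of every pixel that belongs to the segment.
    pixels = [(x, y) for y, row in enumerate(mask)
              for x, value in enumerate(row) if value]
    if not pixels:
        return []
    # Offsets are measured from the segment centroid, so the cloud does not
    # depend on where the object sits in the frame.
    cx = sum(x for x, _ in pixels) / len(pixels)
    cy = sum(y for _, y in pixels) / len(pixels)
    rng = random.Random(seed)
    # Sample with replacement so small segments still yield num_points points.
    chosen = [rng.choice(pixels) for _ in range(num_points)]
    return [(x - cx, y - cy, *image[y][x]) for x, y in chosen]


mask = [[0, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 1, 1, 0]]
# Synthetic image whose colour encodes the pixel position.
image = [[(x * 10, y * 10, 0) for x in range(4)] for y in range(3)]
cloud = mask_to_point_cloud(mask, image)
print(len(cloud), cloud[0])
```

Each pixel is handled independently of the others, which is the kind of per-element work that maps naturally onto GPU threads.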



Journal: ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences
Publication Date: Apr 15, 2021
Citations: 3
License type: CC BY 4.0
