Abstract

Strong industrial demand for autonomous driving has spurred intense interest in 3D object detection and produced many excellent detection algorithms. However, the vast majority of these algorithms model only single-frame data, ignoring the temporal cues present in video sequences. In this work, we propose a new transformer, called the Temporal-Channel Transformer (TCTR), to model temporal-channel and spatial relationships for video object detection from LiDAR data. A distinctive design of this transformer is that the encoder and decoder operate on different kinds of information: the encoder encodes temporal-channel information across multiple frames, while the decoder decodes spatial information for the current frame in a voxel-wise manner. Specifically, the temporal-channel encoder exploits the correlations among features from different channels and frames, and the spatial decoder then decodes information for each location of the current frame. Before object detection is performed by the detection head, a gating mechanism re-calibrates the features of the current frame, filtering out object-irrelevant information by repeatedly refining the target-frame representation during up-sampling. Experimental results show that TCTR achieves state-of-the-art performance among grid voxel-based 3D object detection methods on the nuScenes benchmark.
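
The data flow described above can be illustrated with a minimal, hypothetical PyTorch sketch. The tensor shapes, projection sizes, layer counts, gating module, and stand-in detection head below are assumptions for illustration only and are not taken from the paper; the sketch only shows the overall flow: tokens built from (frame, channel) pairs pass through an encoder, per-location queries from the current frame pass through a decoder, and a sigmoid gate re-calibrates the current-frame features before a detection head.

```python
import torch
import torch.nn as nn


class TCTRSketch(nn.Module):
    """Toy sketch of the abstract's idea: temporal-channel encoder,
    spatial decoder, and a gating step (all sizes are assumptions)."""

    def __init__(self, channels=64, height=32, width=32, d_model=128, heads=4):
        super().__init__()
        self.c, self.h, self.w = channels, height, width
        # Each encoder token is one (frame, channel) feature map, flattened over
        # space and projected to d_model, so self-attention can model
        # correlations across channels and frames.
        self.enc_proj = nn.Linear(height * width, d_model)
        enc_layer = nn.TransformerEncoderLayer(d_model, heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=2)
        # Each decoder query is one spatial location (voxel) of the current
        # frame, projected from its channel vector to d_model.
        self.dec_proj = nn.Linear(channels, d_model)
        dec_layer = nn.TransformerDecoderLayer(d_model, heads, batch_first=True)
        self.decoder = nn.TransformerDecoder(dec_layer, num_layers=2)
        self.out_proj = nn.Linear(d_model, channels)
        # Sigmoid gate re-calibrating current-frame features.
        self.gate = nn.Sequential(nn.Conv2d(channels, channels, 1), nn.Sigmoid())
        # Stand-in detection head (the real head is not specified in the abstract).
        self.head = nn.Conv2d(channels, 7, 1)

    def forward(self, frames):
        # frames: (B, T, C, H, W) voxelized BEV features; frames[:, -1] is the current frame.
        b, t, c, h, w = frames.shape
        # Temporal-channel encoder over (frame, channel) tokens.
        tokens = frames.reshape(b, t * c, h * w)                  # (B, T*C, H*W)
        memory = self.encoder(self.enc_proj(tokens))              # (B, T*C, d_model)
        # Spatial decoder with one query per location of the current frame.
        current = frames[:, -1]                                   # (B, C, H, W)
        queries = current.flatten(2).transpose(1, 2)              # (B, H*W, C)
        decoded = self.decoder(self.dec_proj(queries), memory)    # (B, H*W, d_model)
        decoded = self.out_proj(decoded).transpose(1, 2).reshape(b, c, h, w)
        # Gate filters object-irrelevant information from the current frame.
        refined = self.gate(decoded) * current
        return self.head(refined)


if __name__ == "__main__":
    model = TCTRSketch()
    out = model(torch.randn(2, 3, 64, 32, 32))  # 2 samples, 3 frames
    print(out.shape)                            # torch.Size([2, 7, 32, 32])
```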
