Optimized voxel transformer for 3D detection with spatial-semantic feature aggregation

Yingfei Li

doi:10.1016/j.compeleceng.2023.109023

Abstract

We propose a 3D object detection model that combines the Voxel Transformer (VoTr) with the Confident IoU-Aware Single-Stage Object Detector (CIA-SSD) to detect objects in 3D point clouds. The model uses VoTr as its backbone, which enables long-range interactions between voxels through self-attention. This addresses a limitation of conventional voxel-based 3D detectors, whose restricted receptive fields prevent them from capturing sufficient contextual information. The model also integrates the sparse voxel module and the submanifold voxel module, which operate efficiently on empty and non-empty voxel positions respectively and so handle both the natural sparsity of point clouds and the large number of non-empty voxels. Inspired by the CIA-SSD design, the model incorporates the Spatial-Semantic Feature Aggregation (SSFA) module, which adaptively fuses high-level abstract semantic features with low-level spatial features to support accurate prediction of bounding boxes and classification confidence. Building on the IoU-aware confidence rectification module, which refines the alignment between confidence scores and localization accuracy, we devise an Optimized RPN (Region Proposal Network) Detection Head module that serves as a dense head, additionally predicts the IoU loss, and improves accuracy. By combining these two state-of-the-art techniques, the model aims to provide a precise and efficient solution for 3D object detection in point clouds. We evaluate the model on the KITTI dataset and achieve 76.56% AP3D on the Hard difficulty level.
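The idea behind IoU-aware confidence rectification can be illustrated with a minimal sketch. The abstract does not give the formula, so the form used here (classification score multiplied by the predicted IoU raised to a power beta) is an assumption based on the rule commonly attributed to CIA-SSD, and the function names, the default beta, and the sample numbers are all hypothetical.

```python
# Minimal sketch of IoU-aware confidence rectification (assumed form,
# not taken from the paper): rectified = cls_score * pred_iou ** beta.

def rectify_confidence(cls_scores, pred_ious, beta=4.0):
    """Down-weight each classification score by its predicted IoU.

    cls_scores: per-box classification confidences in [0, 1]
    pred_ious:  per-box IoU predictions from the detection head
    beta:       hypothetical exponent controlling how strongly IoU matters
    """
    if len(cls_scores) != len(pred_ious):
        raise ValueError("cls_scores and pred_ious must have equal length")
    rectified = []
    for score, iou in zip(cls_scores, pred_ious):
        iou = min(max(iou, 0.0), 1.0)  # clamp predicted IoU to [0, 1]
        rectified.append(score * iou ** beta)
    return rectified


def rank_boxes(cls_scores, pred_ious, beta=4.0):
    """Return box indices sorted by rectified confidence, best first."""
    rectified = rectify_confidence(cls_scores, pred_ious, beta)
    return sorted(range(len(rectified)), key=lambda k: rectified[k], reverse=True)


if __name__ == "__main__":
    # Box 0 has the higher raw score but a poor predicted IoU,
    # so box 1 is ranked first after rectification.
    print(rank_boxes([0.9, 0.8], [0.5, 1.0], beta=2.0))
```

With this kind of rule, a box that is classified confidently but localized poorly drops in the ranking before non-maximum suppression, which is the alignment between confidence and localization accuracy that the abstract describes.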

Journal: Computers and Electrical Engineering
Publication Date: Nov 15, 2023
Citations: 2

Similar Papers

Attentional PointNet for 3D-Object Detection in Point Clouds
Anshul Paigwar ... Christian Laugier
01 Jun 2019

PSANet: Pyramid Splitting and Aggregation Network for 3D Object Detection in Point Cloud.
Fangyu Li ... Weizheng Jin
Sensors (Basel, Switzerland) | VOL. 21
28 Dec 2020

Refined Voting and Scene Feature Fusion for 3D Object Detection in Point Clouds.
Hang Yu ... Jinhe Su
Computational intelligence and neuroscience | VOL. 2022
29 Dec 2022

VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection
Yin Zhou ... Oncel Tuzel
01 Jun 2018
