Accurate recognition and localization of 3D objects are fundamental problems in 3D computer vision. Benefiting from transformation-free point cloud processing and flexible receptive fields, point-based methods have proven accurate at 3D point cloud modeling, yet they still fall behind voxel-based competitors in 3D detection. We observe that the set abstraction module commonly used by point-based methods to downsample points tends to retain excessive irrelevant background information, hindering the learning of features that are effective for object detection. To address this issue, we propose MSSA, a Multi-representation Semantics-augmented Set Abstraction for 3D object detection. Specifically, we first design a backbone network that encodes different representations of the point cloud: it extracts point-wise features through PointNet to preserve fine-grained geometric structure, and adopts VoxelNet to extract voxel and BEV features that enrich the semantic features of keypoints. Second, to efficiently fuse the different representation features of keypoints, we propose a Point feature-guided Voxel feature and BEV feature fusion (PVB-Fusion) module that adaptively fuses multi-representation features and removes noise. Finally, a novel Multi-representation Semantic-guided Farthest Point Sampling (MS-FPS) algorithm helps the set abstraction module progressively downsample the point cloud while retaining more important foreground points, thereby improving instance recall and detection performance. We evaluate MSSA on the widely used KITTI dataset and the more challenging nuScenes dataset. Experimental results show that, compared to PointRCNN, our method improves the moderate-level AP on the three object classes by 7.02%, 6.76%, and 5.44%, respectively; compared to the strong point-voxel-based method PV-RCNN, it improves the moderate-level AP by 1.23%, 2.84%, and 0.55% on the three classes, respectively.
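The core idea behind semantics-guided farthest point sampling is to bias the standard FPS distance criterion toward points with high foreground (semantic) scores, so that downsampling keeps object points rather than background. The following is a minimal NumPy sketch of one way such a weighting can be implemented; the function name `semantics_guided_fps`, the weighting form `scores**gamma * dist`, and the `gamma` knob are illustrative assumptions, not the paper's exact MS-FPS formulation.

```python
import numpy as np

def semantics_guided_fps(points, scores, n_samples, gamma=1.0):
    """Downsample a point cloud with semantics-weighted farthest point sampling.

    points    : (N, 3) array of xyz coordinates.
    scores    : (N,)   per-point foreground probabilities in [0, 1].
    n_samples : number of points to keep.
    gamma     : exponent controlling how strongly semantics bias the
                sampling (an illustrative knob, not from the paper).
    Returns the indices of the selected points.
    """
    n = points.shape[0]
    selected = np.empty(n_samples, dtype=np.int64)
    # Squared distance from every point to its nearest already-selected point.
    min_dist = np.full(n, np.inf)
    # Seed from the highest-scoring point instead of a random one, so the
    # sampling is biased toward the foreground from the first step.
    selected[0] = int(np.argmax(scores))
    for i in range(1, n_samples):
        # Update nearest-selected distances with the latest pick.
        diff = points - points[selected[i - 1]]
        min_dist = np.minimum(min_dist, np.einsum('ij,ij->i', diff, diff))
        # Weight geometric distance by the semantic score: well-separated
        # foreground points beat distant background points. The small
        # epsilon keeps zero-score points from tying with selected ones.
        weighted = (scores ** gamma + 1e-6) * min_dist
        selected[i] = int(np.argmax(weighted))
    return selected

# Toy usage: 1000 random points with foreground clustered near the origin.
rng = np.random.default_rng(0)
pts = rng.uniform(-10, 10, size=(1000, 3))
fg_scores = np.exp(-np.linalg.norm(pts, axis=1) / 5.0)  # mock semantic scores
idx = semantics_guided_fps(pts, fg_scores, n_samples=128)
print(idx.shape)  # (128,)
```

With `gamma = 0` this reduces to plain distance-based FPS, so the exponent interpolates between purely geometric coverage and purely semantics-driven selection.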