BSPR-Net: Dual-Branch Feature Extraction Network for LiDAR Place Recognition in Unstructured Environments
LiDAR point cloud-based place recognition (LPR) in unstructured natural environments remains an open challenge with limited existing research. To address the difficulties of unstructured environments, such as sparse structural features, uneven point cloud density, and significant viewpoint variations, we present BSPR-Net, a dual-branch feature extraction approach for point cloud place recognition that consists of a BEV-projection rotation-invariant convolution branch and a point cloud sparse convolution branch. This design enhances the representation of geometric structural features while aggregating rotation-invariant characteristics of the point cloud, thereby better addressing the large viewpoint disparities of reverse revisits in unstructured environments. The proposed network was tested on multiple reverse-revisited sequences of the Wild-Places dataset, a benchmark for place recognition in unstructured natural environments. It achieved a maximum F1 score of 85.46%, exceeding other classical methods by more than 4%. Ablation experiments further confirmed the contribution of each module to place recognition performance.
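A toy sketch (not the authors' implementation) of the two complementary representations named in the abstract: a BEV occupancy grid that preserves geometric layout, and a radial range histogram that is invariant to the yaw changes a reverse revisit induces. Grid size, extent, and bin counts are illustrative assumptions.

```python
import numpy as np

def bev_occupancy(points, grid=64, extent=50.0):
    """Project XY coordinates onto a bird's-eye-view occupancy grid."""
    ij = np.floor((points[:, :2] + extent) / (2 * extent) * grid).astype(int)
    ij = ij[((ij >= 0) & (ij < grid)).all(axis=1)]
    bev = np.zeros((grid, grid))
    bev[ij[:, 0], ij[:, 1]] = 1.0
    return bev

def radial_histogram(points, bins=32, extent=50.0):
    """Histogram of planar ranges; unchanged by any rotation about the z axis."""
    r = np.linalg.norm(points[:, :2], axis=1)
    h, _ = np.histogram(r, bins=bins, range=(0.0, extent))
    return h / max(h.sum(), 1)

pts = np.random.randn(4096, 3) * 10.0
theta = np.pi / 3                          # simulate a reverse-revisit yaw change
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])
rotated = pts @ R.T
print("occupied BEV cells:", int(bev_occupancy(pts).sum()))
# The radial branch is numerically unchanged by the rotation, unlike the BEV grid.
print("histogram shift:", np.abs(radial_histogram(pts) - radial_histogram(rotated)).max())
```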
- Conference Article
- 3
- 10.1109/vcip56404.2022.10008830
- Dec 13, 2022
The place recognition task is a crucial part of 3D scene recognition in various applications. Learning-based point cloud place recognition approaches have achieved remarkable success, yet these methods seldom consider the possible rotation of point cloud data in large-scale real-world place recognition tasks. To cope with this problem, we propose ERINet, a novel and effective rotation-invariant network for large-scale place recognition, which builds on recent successful deep network architectures and benefits from the rotation-invariant property of point clouds. In this network, we design a core effective rotation-invariant module that enhances the ability to extract rotation-invariant features of 3D point clouds. Benchmark experiments illustrate that our network boosts the performance of recent works on all evaluation metrics under various rotations, even in challenging rotation cases.
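As a hedged illustration of the underlying principle rather than ERINet's actual module, the sketch below builds per-point features purely from inter-point distances, which any rigid rotation preserves:

```python
import numpy as np

def rotation_invariant_features(points, k=8):
    """Per-point features: distance to the centroid plus k nearest-neighbor distances."""
    centroid = points.mean(axis=0)
    d_centroid = np.linalg.norm(points - centroid, axis=1, keepdims=True)
    pairwise = np.linalg.norm(points[:, None] - points[None, :], axis=-1)
    knn = np.sort(pairwise, axis=1)[:, 1:k + 1]   # skip the zero self-distance
    return np.hstack([d_centroid, knn])

pts = np.random.randn(256, 3)
Q, _ = np.linalg.qr(np.random.randn(3, 3))        # random orthogonal matrix
Q *= np.sign(np.linalg.det(Q))                    # force det = +1: a proper rotation
feats_a = rotation_invariant_features(pts)
feats_b = rotation_invariant_features(pts @ Q.T)
print("max feature change under rotation:", np.abs(feats_a - feats_b).max())
```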
- Conference Article
- 10.1109/cac53003.2021.9728443
- Oct 22, 2021
3D point cloud-based place recognition has received more attention as 3D LiDAR sensors become widely used in robotic applications and autonomous driving. Most existing deep point cloud-based methods take a fixed number of points or image-like formats as input, which prevents them from making full use of the point clouds' geometric information. This paper proposes a novel place recognition approach that is flexible and effective in handling diverse numbers of 3D LiDAR points in large-scale environments. The approach is composed of feature extraction and global descriptor encoding. The feature extraction consumes the 3D LiDAR point cloud with KPConv, which extracts features efficiently and flexibly. Before the global descriptor encoding, a transformer module aggregates contextual information by exploiting the relationships among all features. The NetVLAD layer then encodes the features into a global descriptor for rapidly recognizing a similar place. The proposed approach is evaluated on the KITTI odometry dataset, which demonstrates its validity.
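A minimal numpy sketch of NetVLAD-style soft-assignment aggregation, the final encoding stage described above; the cluster centers and softmax sharpness stand in for parameters that would normally be learned:

```python
import numpy as np

def netvlad_aggregate(features, centers, alpha=10.0):
    """Soft-assign N local features (N, D) to K centers and sum residuals."""
    d2 = ((features[:, None] - centers[None]) ** 2).sum(-1)        # (N, K)
    logits = -alpha * d2
    logits -= logits.max(axis=1, keepdims=True)                    # stable softmax
    a = np.exp(logits)
    a /= a.sum(axis=1, keepdims=True)
    residuals = features[:, None] - centers[None]                  # (N, K, D)
    vlad = (a[..., None] * residuals).sum(axis=0)                  # (K, D)
    vlad /= np.linalg.norm(vlad, axis=1, keepdims=True) + 1e-12    # intra-normalize
    v = vlad.ravel()
    return v / (np.linalg.norm(v) + 1e-12)                         # global L2 norm

local_feats = np.random.randn(1024, 32)   # stand-in for per-point KPConv features
centers = np.random.randn(16, 32)         # stand-in for learned cluster centers
descriptor = netvlad_aggregate(local_feats, centers)
print(descriptor.shape)                   # (512,) global place descriptor
```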
- Research Article
- 3
- 10.1016/j.eswa.2024.123996
- Apr 19, 2024
- Expert Systems with Applications
LGD: A fast place recognition method based on the fusion of local and global descriptors
- Conference Article
- 99
- 10.1109/iros45743.2020.9341060
- Oct 24, 2020
Due to the difficulty of generating effective descriptors that are robust to occlusion and viewpoint changes, place recognition for 3D point clouds remains an open issue. Unlike most existing methods that focus on extracting local, global, and statistical features of raw point clouds, our method works at the semantic level, which can be superior in terms of robustness to environmental changes. Inspired by the perspective of humans, who recognize scenes by identifying semantic objects and capturing their relations, this paper presents a novel semantic graph based approach for place recognition. First, we propose a novel semantic graph representation for point cloud scenes that preserves the semantic and topological information of the raw point cloud; place recognition is thus modeled as a graph matching problem. Then we design a fast and effective graph similarity network to compute the similarity. Exhaustive evaluations on the KITTI dataset show that our approach is robust to occlusion as well as viewpoint changes and outperforms the state-of-the-art methods by a large margin. Our code is available at: https://github.com/kxhit/SG_PR.
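The paper's similarity is computed by a learned graph network; purely for illustration, the sketch below builds a handcrafted, permutation-invariant signature of a semantic graph (nodes are labeled cluster centroids, edges carry distances) and compares two scenes by cosine similarity:

```python
import numpy as np

def graph_signature(labels, centroids, n_classes=4, bins=8, max_d=40.0):
    """Per label pair, a histogram of node-pair distances: order-independent."""
    dists = np.linalg.norm(centroids[:, None] - centroids[None], axis=-1)
    sig = []
    for a in range(n_classes):
        for b in range(a, n_classes):
            mask = (labels[:, None] == a) & (labels[None, :] == b)
            h, _ = np.histogram(dists[mask], bins=bins, range=(0.0, max_d))
            sig.append(h)
    return np.concatenate(sig).astype(float)

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12)

labels = np.random.randint(0, 4, size=12)     # semantic class per cluster node
cents = np.random.rand(12, 3) * 30.0          # cluster centroids in metres
query = graph_signature(labels, cents)
revisit = graph_signature(labels, cents + np.random.randn(12, 3) * 0.2)
print("similarity to a noisy revisit:", cosine(query, revisit))
```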
- Conference Article
- 4
- 10.1145/3448823.3448865
- Dec 9, 2020
Negative obstacles like pits and ditches are commonly distributed in unstructured road environments. For autonomous navigation of an unmanned ground vehicle, negative obstacle avoidance is very important but difficult, as roads are often bumpy during movement and the perception results are usually unstable. We propose a negative obstacle detection method based on 3D LiDAR perception. Negative obstacles are divided into two types: those within the road plane and those on both sides of the road. For the former, point clouds scanned on the rear wall of the negative obstacle are used to extract geometric features. For the latter, the gaps between valid point clouds are an important basis for detection. In addition, virtual point clouds are used to construct the area over negative obstacles. We also devote considerable effort to reducing false detections, since rough roads contain many small potholes that are not dangerous for driving. Experiments show that the proposed detection method is effective and robust in unstructured environments.
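A minimal sketch of the rear-wall/gap cue described above, under assumed flat-ground geometry (not the authors' implementation): along a downward-looking scan line, a negative obstacle appears as a range jump toward the pit's rear wall:

```python
import numpy as np

def gap_candidates(ranges, angles, jump=1.5):
    """Flag consecutive returns whose range gap exceeds `jump` metres."""
    d = np.diff(ranges)
    idx = np.where(d > jump)[0]
    # Each candidate spans from the near edge of the pit to its rear wall.
    return [(ranges[i], ranges[i + 1], angles[i]) for i in idx]

# Synthetic flat-ground scan with a pit whose returns fall 3 m deeper, near 8 m ahead.
angles = np.deg2rad(np.linspace(-15, -5, 50))   # downward-looking beams
ranges = 1.5 / np.abs(np.sin(angles))           # flat ground, sensor 1.5 m up
ranges[np.abs(ranges - 8.0) < 1.0] += 3.0       # returns drop into the pit
for near, far, ang in gap_candidates(ranges, angles):
    print(f"gap from {near:.1f} m to {far:.1f} m at {np.rad2deg(ang):.1f} deg")
```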
- Research Article
- 91
- 10.1109/tits.2019.2905046
- Jun 27, 2019
- IEEE Transactions on Intelligent Transportation Systems
Global localization in 3D point clouds is a challenging task for mobile vehicles in outdoor scenarios, as it requires the vehicle to localize itself correctly in a given map without prior knowledge of its pose. This is a critical component for autonomous vehicles or robots on the road when handling localization failures. In this paper, based on reduced-dimension scan representations learned by neural networks, a solution to global localization is proposed that first achieves place recognition and then metric pose estimation in the global prior map. Specifically, we present a semi-handcrafted feature learning method for 3D Light Detection and Ranging (LiDAR) point clouds using artificial statistics and a siamese network, which transforms the place recognition problem into a similarity modeling problem. Additionally, the dimension-reduced representations of the sensor data require less storage space and make searching easier. With the representations learned by the networks and the global poses, a prior map is built and used in the localization framework. In the localization step, position-only observations obtained by place recognition are used in a particle filter algorithm to achieve precise pose estimation. To demonstrate the effectiveness of our place recognition and localization approach, the KITTI benchmark and our multi-session datasets are employed for comparison with other geometric-based algorithms. The results show that our system can achieve both high accuracy and efficiency for long-term autonomy.
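A hedged sketch of the retrieval idea: each scan is reduced to a small vector of handcrafted statistics (the paper additionally refines such representations with a siamese network, omitted here) and matched to the prior map by nearest-neighbour search; the histogram choices are assumptions:

```python
import numpy as np

def scan_descriptor(points, bins=16, max_range=80.0):
    """Concatenate a planar-range histogram and a height histogram as scan statistics."""
    r = np.linalg.norm(points[:, :2], axis=1)
    hr, _ = np.histogram(r, bins=bins, range=(0.0, max_range), density=True)
    hz, _ = np.histogram(points[:, 2], bins=bins, range=(-2.0, 6.0), density=True)
    return np.concatenate([hr, hz])

map_scans = [np.random.randn(2048, 3) * s for s in (5, 10, 15, 20)]  # toy places
map_descs = np.stack([scan_descriptor(p) for p in map_scans])
query = map_scans[2] + np.random.randn(2048, 3) * 0.05               # noisy revisit
q = scan_descriptor(query)
match = np.argmin(np.linalg.norm(map_descs - q, axis=1))
print("matched map place:", match)                                   # expect 2
```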
- Research Article
- 2
- 10.3390/s22228604
- Nov 8, 2022
- Sensors (Basel, Switzerland)
Place recognition is an essential part of simultaneous localization and mapping (SLAM). LiDAR-based place recognition relies almost exclusively on geometric information. However, geometric information may become unreliable in environments dominated by unstructured objects. In this paper, we explore the role of segmentation in extracting key structured information. We propose STV-SC, a novel segmentation and temporal verification enhanced place recognition method for unstructured environments. It contains a range image-based 3D point segmentation algorithm and a three-stage process to detect a loop, consisting of a two-stage candidate loop search and a one-stage segmentation and temporal verification (STV) process. Our STV process exploits the time-continuous nature of SLAM to determine whether a match is an occasional mismatch. We quantitatively demonstrate that the STV process can reject false detections caused by unstructured objects and effectively extract structured objects to avoid outliers. Comparison with state-of-the-art algorithms on public datasets shows that STV-SC can run online and achieves improved performance in unstructured environments (at the same precision, the recall rate is 1.4∼16% higher than Scan Context). Our algorithm can therefore effectively avoid the mismatches produced by the original algorithm in unstructured environments and improve the environmental adaptability of mobile agents.
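An illustrative sketch of temporal verification in the spirit of the STV stage (assumed, simplified logic): a loop candidate is accepted only if several consecutive queries agree on nearby map indices, so a one-off mismatch is rejected:

```python
def temporally_verified(candidate_ids, window=3, tolerance=2):
    """candidate_ids: best map match for each recent scan, newest last."""
    if len(candidate_ids) < window:
        return False
    recent = candidate_ids[-window:]
    # Consecutive matches must point at nearly the same stretch of the map.
    return max(recent) - min(recent) <= tolerance

history = []
for best_match in [412, 97, 413, 414, 415]:   # 97 is an occasional mismatch
    history.append(best_match)
    print(best_match, "->", temporally_verified(history))
```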
- Conference Article
- 7
- 10.1109/icpair.2011.5976919
- Jun 1, 2011
Vision-based qualitative localization, in other words place recognition, is an important perceptual problem at the center of several fundamental robot procedures. Place recognition approaches are used to solve the "global localization" problem and are typically performed in a supervised mode. In this paper, an appearance-based unsupervised place clustering and recognition algorithm is introduced. The method fuses several image features using Speeded-Up Robust Features (SURF) by agglomerating them into a union of features inside each place cluster. The number of place clusters can be extracted by examining the SURF-based scene-similarity diagram between adjacent images. Experimental results show that the method is robust, accurate, and efficient, and is able to create topological place clusters that solve the "global localization" problem with acceptable clustering error and recognition precision.
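A toy numpy sketch of the clustering cue described above, with a synthetic descriptor-set similarity standing in for SURF matching: place boundaries are declared where the similarity between adjacent images drops:

```python
import numpy as np

def match_score(desc_a, desc_b, thresh=0.3):
    """Fraction of descriptors in A with a close match in B (SURF matching stand-in)."""
    d = np.linalg.norm(desc_a[:, None] - desc_b[None], axis=-1)
    return (d.min(axis=1) < thresh).mean()

def cluster_places(desc_seq, boundary=0.5):
    """Cut the image sequence wherever adjacent similarity falls below `boundary`."""
    clusters, current = [], [0]
    for i in range(1, len(desc_seq)):
        if match_score(desc_seq[i - 1], desc_seq[i]) < boundary:
            clusters.append(current)
            current = []
        current.append(i)
    clusters.append(current)
    return clusters

rng = np.random.default_rng(2)
place_a = rng.normal(size=(40, 16))           # descriptor set of place A
place_b = rng.normal(size=(40, 16)) + 5.0     # a visually distinct place B
seq = [place_a + rng.normal(scale=0.02, size=place_a.shape) for _ in range(3)] \
    + [place_b + rng.normal(scale=0.02, size=place_b.shape) for _ in range(3)]
print(cluster_places(seq))                    # two clusters: [0, 1, 2] and [3, 4, 5]
```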
- Research Article
- 28
- 10.1177/0142331217739953
- Feb 14, 2018
- Transactions of the Institute of Measurement and Control
Remote manipulation of a robot without assistance in an unstructured environment is a challenging task for operators. In this paper, a novel methodology for haptic constraints in a point cloud augmented virtual reality environment is proposed to address this human operation limitation. The proposed method generates haptic constraints in real time for an unstructured environment, including regional constraints and guidance constraints. A modified implicit surface method is applied to generate regional constraints for the entire point cloud, and the isosurface derived from the implicit surface is used for real-time three-dimensional artificial force field estimation. For guidance constraint generation, a new incremental prediction and local artificial force field generation method based on a modified sigmoid model is proposed for the unstructured point cloud virtual reality environment. With the generated haptic constraints, the operator can control the robot to avoid obstacles and easily reach the targets. A system evaluation demonstrates the effectiveness of the proposed method. In addition, a 10-participant study in which users guided the robot to three specific targets shows that the system can enhance human operation efficiency and reduce time costs by at least 59% compared with no-haptic-constraint operations. A questionnaire also indicates that the proposed methodology reduces the workload during human operations.
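A minimal sketch of a sigmoid-shaped repulsive force field of the kind described above; the constants and the nearest-point formulation are assumptions, not the paper's exact model:

```python
import numpy as np

def repulsive_force(tool_pos, cloud, d_influence=0.3, gain=6.0):
    """Artificial force from the nearest cloud point, shaped by a sigmoid in distance."""
    diffs = tool_pos - cloud
    dists = np.linalg.norm(diffs, axis=1)
    i = np.argmin(dists)
    direction = diffs[i] / (dists[i] + 1e-9)
    # The sigmoid rises as the tool approaches the surface, ~0 beyond d_influence.
    magnitude = 1.0 / (1.0 + np.exp(gain * (dists[i] - d_influence) / d_influence))
    return magnitude * direction

cloud = np.random.rand(5000, 3)               # environment point cloud in a unit cube
for d in (0.5, 0.2, 0.05):
    f = repulsive_force(np.array([0.5, 0.5, 1.0 + d]), cloud)
    print(f"{d:.2f} m away -> |f| = {np.linalg.norm(f):.2f}")
```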
- Research Article
- 13
- 10.1109/tase.2021.3128639
- Oct 1, 2022
- IEEE Transactions on Automation Science and Engineering
To realize a robust robotic grasping system for unknown objects in an unstructured environment, large amounts of grasp data and 3D model data for the objects are required; the sizes of these data directly affect the rate of successful grasps. To reduce the time cost of data acquisition and labeling and to increase the rate of successful grasps, we developed a self-supervised learning mechanism to control grasp tasks performed by manipulators. First, a manipulator automatically collects the point cloud of the objects from multiple perspectives to increase the efficiency of data acquisition. The complete point cloud of the objects is obtained using the hand-eye vision of the manipulator and the truncated signed distance function algorithm. Then, the point cloud data for the objects are used to generate a series of six-degrees-of-freedom grasp poses, and a force-closure decision algorithm adds a grasp quality label to each grasp pose, realizing automatic labeling of the grasp data. Finally, the point cloud in the gripper closing area corresponding to each grasp pose is obtained and used to train the grasp-quality classification model for the manipulator. Actual grasping experiments demonstrate that the proposed self-supervised learning method can increase the rate of successful grasps for the manipulator. Note to Practitioners—Most existing grasp planning methods for manipulators train their models on public datasets or simulation data. Owing to the limited types of objects and limited amount of data in public datasets, and the lack of real sensor noise in simulation data, the robustness of the trained models is insufficient, and they are difficult to apply to unstructured production environments. To solve these problems, we propose a 6-DOF grasp planning method based on self-supervised learning and introduce a self-supervised learning mechanism to solve the problem of grasp data acquisition in real scenes. The manipulator automatically collects object data from multiple perspectives, performs desktop-level 3D reconstruction, and finally uses the force-closure decision algorithm to automatically label the data, realizing automatic acquisition and labeling of the grasp data in a real scenario. Preliminary experiments show that this method can obtain high-quality grasp data and can be applied to grasp operations in real multi-target and cluttered environments; however, it has not been tested in actual production environments. This paper focuses on the data acquisition module in the 6-DOF grasp planning framework. In future research, we will design a more efficient grasp planning module to improve the grasp efficiency of the manipulator.
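For illustration, the sketch below shows the standard force-closure criterion for a two-finger antipodal grasp, the kind of test behind the automatic quality labeling described above; the points, normals, and friction coefficient are invented:

```python
import numpy as np

def antipodal_force_closure(p1, n1, p2, n2, mu=0.5):
    """p*: contact points; n*: inward surface normals; mu: friction coefficient.
    Force closure holds when both normals lie inside the friction cone around
    the line joining the two contacts."""
    axis = p2 - p1
    axis /= np.linalg.norm(axis)
    half_angle = np.arctan(mu)                       # friction cone half-angle
    a1 = np.arccos(np.clip(np.dot(n1, axis), -1, 1))
    a2 = np.arccos(np.clip(np.dot(n2, -axis), -1, 1))
    return a1 <= half_angle and a2 <= half_angle

# Opposite faces of a thin box: a good antipodal pair.
good = antipodal_force_closure(np.array([0.0, 0, 0]), np.array([1.0, 0, 0]),
                               np.array([0.1, 0, 0]), np.array([-1.0, 0, 0]))
# A slanted normal outside the friction cone: this grasp would slip.
bad = antipodal_force_closure(np.array([0.0, 0, 0]), np.array([0.6, 0.8, 0]),
                              np.array([0.1, 0, 0]), np.array([-1.0, 0, 0]))
print(good, bad)   # True False
```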
- Conference Article
- 3
- 10.1109/iros51168.2021.9635915
- Sep 27, 2021
Future planetary missions will rely on rovers that can autonomously explore and navigate in unstructured environments. An essential element is the ability to recognize places that were already visited or mapped. In this work, we leverage the ability of stereo cameras to provide both visual and depth information, guiding the search and validation of loop closures from a multi-modal perspective. We propose to augment submaps, created by aggregating stereo point clouds, with visual keyframes. Point cloud matches are found by comparing CSHOT descriptors and validated by clustering, while visual matches are established by comparing keyframes using Bag-of-Words (BoW) and ORB descriptors. The relative transformations resulting from both keyframe and point cloud matches are then fused to provide pose constraints between submaps in our graph-based SLAM framework. Using the LRU rover, we performed several tests both in an indoor laboratory environment and in a challenging planetary analog environment on Mount Etna, Italy, covering areas where either keyframes or point clouds alone failed to provide adequate matches, demonstrating the benefit of the proposed multi-modal approach.
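As a hedged illustration of fusing the two modalities, the sketch below combines two translation estimates by inverse-variance weighting; the full system fuses 6-DoF transformations inside a graph-based SLAM back end, and all numbers here are invented:

```python
import numpy as np

def fuse_translations(t_visual, var_visual, t_cloud, var_cloud):
    """Inverse-variance weighting: the more certain modality dominates."""
    w_v, w_c = 1.0 / var_visual, 1.0 / var_cloud
    t = (w_v * t_visual + w_c * t_cloud) / (w_v + w_c)
    var = 1.0 / (w_v + w_c)
    return t, var

t_vis = np.array([2.10, -0.40, 0.05])   # e.g., from a BoW/ORB keyframe match
t_pcl = np.array([2.02, -0.35, 0.02])   # e.g., from a CSHOT point cloud match
t, var = fuse_translations(t_vis, 0.04, t_pcl, 0.01)
print(t, var)    # the fused estimate sits closer to the lower-variance cloud match
```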
- Research Article
- 14
- 10.1109/tim.2022.3209727
- Jan 1, 2022
- IEEE Transactions on Instrumentation and Measurement
In dynamic environments, sensor occlusions and viewpoint changes occur frequently, posing challenges for point cloud-based place recognition retrieval. Existing deep learning-based methods cannot simultaneously offer high detection accuracy, a small network model, and rapid detection, making them inapplicable to real-life situations. In this paper, we propose an efficient 3D point cloud place recognition approach based on feature point extraction and a transformer (FPET-Net) to improve place recognition and reduce model computation. We first introduce a feature point extraction module, which greatly reduces the size of the point cloud while preserving its features, further reducing the impact of environmental changes on data acquisition. Then, a point transformer module is developed to control the computational effort while extracting discriminative global descriptors. Finally, a feature similarity network module computes global descriptor similarity using a bilinear tensor layer with fewer parameters correlated across dimensions. Experiments show that our algorithm has 2.7 times fewer parameters than the previously lightest EPC-Net and processes one point cloud frame 4.3 times faster. The network also achieves excellent results, with a maximum F1 score of 0.975 in place recognition experiments on the KITTI dataset.
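A sketch of a bilinear tensor similarity layer of the general form the paper names (a neural-tensor-style score s_k = f1^T W_k f2 plus a linear term); the dimensions and random weights are illustrative stand-ins for trained parameters:

```python
import numpy as np

def bilinear_similarity(f1, f2, W, V, b):
    """Score vector: one bilinear form per tensor slice, plus a linear term."""
    bilinear = np.einsum("i,kij,j->k", f1, W, f2)        # (K,)
    linear = V @ np.concatenate([f1, f2])                # (K,)
    return np.tanh(bilinear + linear + b)

D, K = 64, 8                     # descriptor size, number of tensor slices
rng = np.random.default_rng(0)
W = rng.normal(size=(K, D, D)) / D
V = rng.normal(size=(K, 2 * D)) / D
b = rng.normal(size=K)
f_query, f_map = rng.normal(size=D), rng.normal(size=D)
print(bilinear_similarity(f_query, f_map, W, V, b).shape)   # (8,)
```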
- Research Article
- 4
- 10.1002/int.23087
- Sep 22, 2022
- International Journal of Intelligent Systems
The feature analysis of point clouds, a popular representation of three-dimensional (3D) objects, has become a prominent research topic. Point cloud data are sparse and unordered by nature, making many commonly used feature extraction methods, such as Convolutional Neural Networks (CNNs), inapplicable, while previous models suited to the task are usually complex. We aim to reduce model complexity by reducing the number of parameters while achieving better (or at least comparable) performance. We propose an Interpolation Graph Convolutional Network (IGCN) for extracting features of point clouds. IGCN uses the point cloud graph structure and a specially designed Interpolation Convolution Kernel to mimic the operations of CNNs for feature extraction. On the basis of weight post-fusion and multilevel-resolution aggregation, IGCN not only reduces the cost of the interpolation operation but also improves the model's performance. We validate IGCN on both point cloud classification and segmentation tasks and explore the contribution of each module through ablation experiments. Furthermore, we embed the IGCN feature extraction module as a plug-and-play component in other frameworks and perform point cloud registration experiments.
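A minimal sketch of an interpolation-style point convolution consistent with the description above (an assumed form, not IGCN's exact kernel): weights live at fixed kernel points, and each neighbour's contribution is linearly interpolated from nearby kernel points:

```python
import numpy as np

def interp_conv(center, neighbors, feats, kernel_pts, weights, sigma=0.3):
    """feats: (N, C_in); kernel_pts: (K, 3); weights: (K, C_in, C_out)."""
    offsets = neighbors - center                                       # (N, 3)
    d = np.linalg.norm(offsets[:, None] - kernel_pts[None], axis=-1)   # (N, K)
    influence = np.maximum(0.0, 1.0 - d / sigma)                       # linear interp.
    per_point_w = np.einsum("nk,kio->nio", influence, weights)         # (N, C_in, C_out)
    return np.einsum("ni,nio->o", feats, per_point_w)                  # (C_out,)

rng = np.random.default_rng(1)
kernel_pts = rng.uniform(-0.2, 0.2, size=(9, 3))    # fixed kernel point layout
weights = rng.normal(size=(9, 4, 16)) * 0.1         # learned in a real network
center = np.zeros(3)
neighbors = rng.uniform(-0.25, 0.25, size=(20, 3))
feats = rng.normal(size=(20, 4))
print(interp_conv(center, neighbors, feats, kernel_pts, weights).shape)  # (16,)
```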
- Conference Article
- 1
- 10.1109/urai.2017.7992746
- Jun 1, 2017
Place recognition is widely used in the loop closure detection in SLAM. The current approach to place recognition is based on RGB images, but there are relatively few place recognition studies using a point cloud. This study presents the place recognition method based on the surface graph. The proposed method clusters the surfaces in the point cloud and recognizes a place through a surface descriptor and a surface graph. The advantage of this approach is that it uses the surfaces that are not low-level features such as SIFT and SURF. Another advantage is that the proposed place recognition is robust because of the surface graph. We have experimented on the data set obtained by the mobile robot equipped with a Kinect sensor in the indoor environment. The experimental results show that the proposed place recognition based on the surface graph (PRSG) scheme is useful and can be used as a loop closure detector.
- Research Article
- 78
- 10.1109/tim.2012.2216475
- Feb 1, 2013
- IEEE Transactions on Instrumentation and Measurement
Active environment perception and autonomous place recognition play a key role for mobile robots operating within a cluttered indoor environment with dynamic changes. This paper presents a 3-D-laser-based scene measurement technique and a novel place recognition method to deal with the random disturbances caused by unexpected movements of people and other objects. The proposed approach can extract and match Speeded-Up Robust Features (SURFs) from bearing-angle images generated by a self-built rotating 3-D laser scanner. It can cope with the irregular disturbance of moving objects and with changes in the observing location of the laser scanner. Both global metric information and local SURF features are extracted from 3-D laser point clouds and 2-D bearing-angle images, respectively. A large-scale indoor environment with over 1600 m² and 30 offices is selected as a testing site, and a mobile robot, i.e., SmartROB2, is deployed for conducting experiments. Experimental results show that the proposed 3-D-laser-based scene measurement technique and place recognition approach are effective and provide robust performance of place recognition in a dynamic indoor environment.
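For reference, a short sketch of the standard bearing-angle image computation (not necessarily the authors' exact code): each pixel stores the angle between the laser beam at a point and the segment to its neighbour along the scan, turning a 3-D scan into a grayscale image amenable to SURF extraction:

```python
import numpy as np

def bearing_angle_image(points_grid):
    """points_grid: (H, W, 3) ordered scan points in the sensor frame."""
    p = points_grid[:, :-1]                   # point i
    q = points_grid[:, 1:]                    # its neighbour along the scan row
    seg = q - p
    r = np.linalg.norm(p, axis=-1, keepdims=True)
    beam_back = -p / (r + 1e-9)               # unit direction from point to sensor
    cosang = (seg * beam_back).sum(-1) / (np.linalg.norm(seg, axis=-1) + 1e-9)
    return np.degrees(np.arccos(np.clip(cosang, -1.0, 1.0)))   # (H, W-1) image

scan = np.random.rand(64, 360, 3) * 10 + 1.0  # stand-in for an ordered laser scan
img = bearing_angle_image(scan)
print(img.shape, img.min(), img.max())
```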