Robust robotic exploration and mapping using generative occupancy map synthesis
We present a novel approach for enhancing robotic exploration by using generative occupancy mapping. We implement SceneSense, a diffusion model designed and trained for predicting 3D occupancy maps given partial observations. Our proposed approach probabilistically fuses these predictions into a running occupancy map in real time, resulting in significant improvements in map quality and traversability. We deploy SceneSense on a quadruped robot and validate its performance with real-world experiments to demonstrate the effectiveness of the model. In these experiments we show that occupancy maps enhanced with SceneSense predictions better estimate the distribution of our fully observed ground truth data (24.44% FID improvement around the robot and 75.59% improvement at range). We additionally show that integrating SceneSense-enhanced maps into our robotic exploration stack as a “drop-in” map improvement, utilizing an existing off-the-shelf planner, results in improvements in robustness and traversability time. Finally, we show results of full exploration evaluations with our proposed system in two dissimilar environments and find that locally enhanced maps provide more consistent exploration results than maps constructed only from direct sensor measurements.
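The probabilistic fusion step above is described only at a high level; as an illustrative sketch (not the authors' exact update rule), a standard log-odds occupancy update that down-weights generative predictions relative to direct measurements could look like:

```python
import math

def logodds(p):
    """Convert a probability to log-odds."""
    return math.log(p / (1.0 - p))

def fuse_prediction(cell_logodds, predicted_p, weight=0.5):
    """Fuse a generative model's predicted occupancy probability into a
    cell's running log-odds value. The `weight` that down-weights
    predictions relative to direct sensor hits is a hypothetical tuning
    parameter, not a value from the paper."""
    return cell_logodds + weight * logodds(predicted_p)

# A cell weakly observed by the sensor, then predicted occupied by the model:
cell = logodds(0.3)                    # prior from sensor measurements
cell = fuse_prediction(cell, 0.9)      # generative prediction: likely occupied
posterior = 1.0 / (1.0 + math.exp(-cell))  # back to a probability
```

The additive log-odds form keeps the fusion commutative and cheap, which is what makes per-frame real-time updates plausible.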
- Research Article
- 10.1080/00194506.2025.2539239
- Aug 6, 2025
- Indian Chemical Engineer
This study investigates the potential of aqueous ammonia as a solvent for Direct Air Capture (DAC) of CO₂, leveraging its high solubility and reactivity with CO₂ despite its known volatility compared to conventional alkaline solutions such as NaOH and KOH. The research focuses on the evaporation behaviour of aqueous ammonia when air and inert gases are bubbled through a liquid column at varying flow rates. Results indicate that increased airflow enhances both CO₂ capture and ammonia evaporation. Experimental analyses were conducted across different ammonia concentrations, and the amount of ammonia utilised for CO₂ capture was evaluated based on both evaporation data and direct CO₂ sensor measurements. Notably, good agreement between the two methods was observed at fixed airflow rates. To systematically assess and optimise the effects of ammonia concentration and airflow rate, statistical analysis using response surface methodology (RSM) was performed, identifying optimal conditions that balance CO₂ absorption efficiency and ammonia loss. The findings underscore the dual role of ammonia in atmospheric CO₂ capture and highlight the trade-off posed by evaporation losses, offering valuable insights for improving ammonia-based DAC systems and informing broader applications in agriculture and chemical processing.
HIGHLIGHTS
- Explores the use of aqueous ammonia solutions for Direct Air Capture (DAC).
- Investigates ammonia evaporation and atmospheric CO₂ capture across different concentrations; at a fixed airflow rate, the calculated CO₂ capture values closely matched direct sensor measurements.
- Ammonia evaporation and CO₂ absorption are strongly influenced by airflow rate and ammonia concentration, which were statistically optimised using response surface methodology.
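As a toy illustration of the RSM step described above, a second-order response surface in the two factors can be evaluated over a grid to locate an optimal operating point; the coefficients below are invented for illustration, not the paper's fitted values:

```python
# Hypothetical second-order response surface of the kind RSM fits,
# relating a capture-vs-loss score y to ammonia concentration x1 (wt%)
# and airflow rate x2 (L/min). Coefficients are illustrative only.
def response(x1, x2, b=(1.0, 0.8, 0.5, -0.04, -0.02, 0.01)):
    b0, b1, b2, b11, b22, b12 = b
    return b0 + b1 * x1 + b2 * x2 + b11 * x1 * x1 + b22 * x2 * x2 + b12 * x1 * x2

# Locate the optimum over a grid of candidate operating conditions.
best_score, best_x1, best_x2 = max(
    (response(x1, x2), x1, x2)
    for x1 in range(0, 21) for x2 in range(0, 31))
```

The negative quadratic terms give the surface an interior maximum, which is the shape RSM exploits to trade off absorption efficiency against evaporation loss.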
- Research Article
5
- 10.1007/s40295-014-0028-7
- Jun 1, 2013
- The Journal of the Astronautical Sciences
An attitude determination covariance measurement model for unit vector sensors with a wide field-of-view is analyzed and compared to the classic QUEST covariance model. The wide field-of-view model has been previously proposed as a more realistic alternative for sensors where measurement accuracy depends on angular distance from the boresight axis. Both QUEST and the wide field-of-view models are evaluated relative to a measurement model that uses the two-dimensional sensor focal plane measurements directly, rather than first converting them to unit vectors. The Fisher information matrix is derived for attitude determination based on such direct sensor measurements, and the wide field-of-view measurement model is shown to have the same Fisher information matrix. Numerical simulations confirm that an extended Kalman filter based on the wide field-of-view model outperforms a filter based on the QUEST measurement model, and also that the wide field-of-view 3σ bounds are effectively identical to those of a filter based on the direct two-dimensional sensor measurements.
- Conference Article
2
- 10.21437/interspeech.2014-102
- Sep 14, 2014
The selection of effective articulatory features is an important component of tasks such as acoustic-to-articulator inversion and articulatory synthesis. Although it is common to use direct articulatory sensor measurements as feature variables, this approach fails to incorporate important physiological information such as palate height and shape and thus is not as representative of vocal tract cross section as desired. We introduce a set of articulator feature variables that are palate referenced and normalized with respect to the articulatory working space in order to improve the quality of the vocal tract representation. These features include normalized horizontal positions plus the normalized palatal height of two midsagittal and one lateral tongue sensor, as well as normalized lip separation and lip protrusion. The quality of the feature representation is evaluated subjectively by comparing the variances and vowel separation in the working space and quantitatively through measurement of acoustic-to-articulator inversion error. Results indicate that the palate-referenced features have reduced variance and increased separation between vowels spaces and substantially lower inversion error than direct sensor measures.
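One of the features described above, normalised palatal height, can be sketched as a simple rescaling (illustrative only; the paper's full feature set also normalises against the articulatory working space):

```python
def palate_referenced_height(y_sensor, y_palate, y_floor):
    """Normalised palatal height of a tongue sensor: 0 at the mouth
    floor, 1 at the palate above the sensor. A sketch of the
    normalisation idea only, with hypothetical reference coordinates."""
    return (y_sensor - y_floor) / (y_palate - y_floor)
```

Referencing the palate rather than absolute sensor coordinates is what removes speaker-specific palate height and shape from the feature.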
- Conference Article
15
- 10.1109/iros.2003.1250712
- Dec 8, 2003
We develop a new occupancy map that respects the role of the sensor measurement bearing and how it relates to the resolution of the existing occupancy map. We borrow an idea from Konolige for recording and tracking, in an occupancy-like map, the bearing at which sensor readings originate with respect to a given cell. Our specific contribution is in the way we process the sensor pose information, i.e. the bearing of a sensor reading when it indicates the presence of an obstacle in a particular cell. For each cell in the occupancy map, we calculate the greatest separation of incident poses and store that information in a new two-dimensional array called a pose map. A cell in the pose map measures the quality of information contained in the corresponding cell of the occupancy map. We merge the new pose map with the existing map to generate an enhanced occupancy map. Exploration plans derived from the enhanced occupancy map are more efficient and complete: they neither guide the robot around phantom obstacles nor incorrectly classify narrow openings as closed, both problems commonly found in conventional occupancy maps.
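The pose-map score described above, the greatest separation of incident bearings per cell, can be sketched as follows (an illustrative reconstruction, not the authors' code):

```python
def bearing_spread(bearings_deg):
    """Greatest angular separation among the bearings of sensor rays
    that reported an obstacle in a cell: a sketch of the pose-map
    quality score, in degrees."""
    if len(bearings_deg) < 2:
        return 0.0
    spread = 0.0
    for i, a in enumerate(bearings_deg):
        for b in bearings_deg[i + 1:]:
            d = abs(a - b) % 360.0
            spread = max(spread, min(d, 360.0 - d))  # wrap-aware distance
    return spread
```

A cell observed only from one side (e.g. bearings 10°, 20°, 30°) scores low, while one confirmed from opposite sides (0° and 180°) scores the maximum, marking it as well-constrained rather than a phantom obstacle.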
- Research Article
2
- 10.3390/s21217004
- Oct 22, 2021
- Sensors
Occupancy mapping is widely used to generate volumetric 3D environment models from point clouds, informing a robotic platform which parts of the environment are free and which are not. The selection of the parameters that govern the point cloud generation algorithms and mapping algorithms affects the process and the quality of the final map. Although previous studies have reported on optimising major parameter configurations, research on identifying optimal parameter sets for the best occupancy mapping performance remains limited. The current work aims to fill this gap with a two-step principled methodology that first identifies the most significant parameters by conducting Neighbourhood Component Analysis on all parameters and then optimises them using grid search with the area under the Receiver Operating Characteristic curve. This study is conducted on 20 data sets with specially designed targets, providing precise ground truths for evaluation purposes. The methodology is tested on OctoMap with point clouds created by applying StereoSGBM on the images from a stereo camera. A clear indication can be seen that mapping parameters are more important than point cloud generation parameters. Moreover, up to 15% improvement in mapping performance can be achieved over default parameters.
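The second step of the methodology, grid search scored by ROC AUC, can be sketched as below; the `evaluate` callback and parameter names are stand-ins for a real OctoMap pipeline, not the paper's implementation:

```python
from itertools import product

def auc(scores, labels):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity."""
    pos = [s for s, l in zip(scores, labels) if l]
    neg = [s for s, l in zip(scores, labels) if not l]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def grid_search(param_grid, evaluate):
    """Return the parameter combination maximising AUC. `evaluate` is a
    hypothetical callback that runs the mapping pipeline and returns
    (per-voxel occupancy scores, ground-truth labels)."""
    return max(product(*param_grid.values()),
               key=lambda combo: auc(*evaluate(dict(zip(param_grid, combo)))))

# Toy usage with a stand-in evaluator (parameter names are assumptions):
grid = {"resolution": [0.05, 0.10], "prob_hit": [0.6, 0.7]}
def toy_evaluate(params):
    labels = [1, 1, 0, 0]
    scores = [params["prob_hit"], params["prob_hit"], 0.65, 0.65]
    return scores, labels
best = grid_search(grid, toy_evaluate)
```

Scoring by AUC rather than raw accuracy makes the search insensitive to the free/occupied class imbalance that occupancy maps typically have.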
- Conference Article
3
- 10.1109/cvprw56347.2022.00508
- Jun 1, 2022
We exploit the complementary strengths of vision and proprioception to develop a point-goal navigation system for legged robots, called VP-Nav. Legged systems are capable of traversing more complex terrain than wheeled robots, but to fully utilize this capability, we need a high-level path planner in the navigation system to be aware of the walking capabilities of the low-level locomotion policy in varying environments. We achieve this by using proprioceptive feedback to ensure the safety of the planned path by sensing unexpected obstacles like glass walls, terrain properties like slipperiness or softness of the ground and robot properties like extra payload that are likely missed by vision. The navigation system uses onboard cameras to generate an occupancy map and a corresponding cost map to reach the goal. A fast marching planner then generates a target path. A velocity command generator takes this as input to generate the desired velocity for the walking policy. A safety advisor module adds sensed unexpected obstacles to the occupancy map and environment-determined speed limits to the velocity command generator. We show superior performance compared to wheeled robot baselines, and ablation studies which have disjoint high-level planning and low-level control. We also show the real-world deployment of VP-Nav on a quadruped robot with onboard sensors and computation. Videos at https://navigation-locomotion.github.io
- Conference Article
24
- 10.1109/cvpr52688.2022.01676
- Jun 1, 2022
We exploit the complementary strengths of vision and proprioception to develop a point-goal navigation system for legged robots, called VP-Nav. Legged systems are capable of traversing more complex terrain than wheeled robots, but to fully utilize this capability, we need a high-level path planner in the navigation system to be aware of the walking capabilities of the low-level locomotion policy in varying environments. We achieve this by using proprioceptive feedback to ensure the safety of the planned path by sensing unexpected obstacles like glass walls, terrain properties like slipperiness or softness of the ground and robot properties like extra payload that are likely missed by vision. The navigation system uses onboard cameras to generate an occupancy map and a corresponding cost map to reach the goal. A fast marching planner then generates a target path. A velocity command generator takes this as input to generate the desired velocity for the walking policy. A safety advisor module adds sensed unexpected obstacles to the occupancy map and environment-determined speed limits to the velocity command generator. We show superior performance compared to wheeled robot baselines, and ablation studies which have disjoint high-level planning and low-level control. We also show the real-world deployment of VP-Nav on a quadruped robot with onboard sensors and computation. Videos at https://navigation-locomotion.github.io
- Conference Article
- 10.58895/ksp/1000179597-4
- Sep 24, 2025
Semantic mapping for mobile robots is a crucial aspect for autonomous navigation and interaction with their environment. Map quality, accuracy of semantic labels and runtime are important dimensions for real-world applications. Voxel models have been widely used to represent occupancy in the map. In recent years, Bayesian Kernel Inference (BKI) has emerged as the main technique on top of traditional occupancy maps to produce smooth maps while maintaining an underlying probabilistic model. Methods based on BKI vary widely in their parameterization and can be tailored to different environments. In this report, we aim to develop the methodology for semantic mapping based on BKI, before evaluating different methods on a range of datasets to assess their parameterization effects for different use cases. We are targeting unstructured outdoor environments where the semantic mapping frameworks need to robustly handle uncertainties in perception.
- Research Article
- 10.3389/frobt.2025.1655171
- Jan 1, 2025
- Frontiers in Robotics and AI
Robust autonomous navigation in complex, dynamic indoor environments remains a central challenge in robotics, requiring agents to make adaptive decisions in real time under partial observability and uncertain obstacle motion. This paper presents DreamerNav, a robot-agnostic navigation framework that extends DreamerV3, a state-of-the-art world model–based reinforcement learning algorithm, with multimodal spatial perception, hybrid global–local planning, and curriculum-based training. By formulating navigation as a Partially Observable Markov Decision Process (POMDP), the system enables agents to integrate egocentric depth images with a structured local occupancy map encoding dynamic obstacle positions, historical trajectories, points of interest, and a global A* path. A Recurrent State-Space Model (RSSM) learns stochastic and deterministic latent dynamics, supporting long-horizon prediction and collision-free path planning in cluttered, dynamic scenes. Training is carried out in high-fidelity, photorealistic simulation using NVIDIA Isaac Sim, gradually increasing task complexity to improve learning stability, sample efficiency, and generalization. We benchmark against NoMaD, ViNT, and A*, showing superior success rates and adaptability in dynamic environments. Real-world proof-of-concept trials on two quadrupedal robots without retraining further validate the framework’s robustness and quadruped robot platform independence.
- Conference Article
73
- 10.1109/icra40945.2020.9197168
- May 1, 2020
3D point cloud maps are an accumulation of laser scans obtained at different positions and times. Since laser scans represent a snapshot of the surrounding at the time of capture, they often contain moving objects which may not be observed at all times. Dynamic objects in point cloud maps decrease the quality of maps and affect localization accuracy, hence it is important to remove the dynamic objects from 3D point cloud maps. In this paper, we present a robust method to remove dynamic objects from 3D point cloud maps. Given a registered set of 3D point clouds, we build an occupancy map in which the voxels represent the occupancy state of the volume of space over an extended time period. After building the occupancy map, we use it as a filter to remove dynamic points in lidar scans before adding the points to the map. Furthermore, we accelerate the process of building occupancy maps using object detection and a novel voxel traversal method. Once the occupancy map is built, dynamic object removal can run in real-time. Our approach works well on wide urban roads with stopped or moving traffic and the occupancy maps get better with the inclusion of more lidar scans from the same scene.
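The filtering idea above, using long-term voxel occupancy state to reject dynamic points before they enter the map, can be sketched as follows (voxel size, counts, and threshold are illustrative assumptions, not the paper's values):

```python
def voxel_key(point, size=0.2):
    """Quantise a 3D point into a voxel index (voxel size assumed)."""
    return tuple(int(c // size) for c in point)

def filter_dynamic(scan, free_counts, occ_counts, min_ratio=0.8):
    """Keep only points whose voxel was observed occupied in at least
    `min_ratio` of that voxel's observations; voxels frequently seen
    empty over time are treated as dynamic and their points dropped."""
    kept = []
    for p in scan:
        k = voxel_key(p)
        occ = occ_counts.get(k, 0)
        total = occ + free_counts.get(k, 0)
        if total == 0 or occ / total >= min_ratio:
            kept.append(p)
    return kept

# A voxel seen empty 9 times out of 10 is dynamic; one always occupied is static.
free_counts = {(0, 0, 0): 9}
occ_counts = {(0, 0, 0): 1, (5, 5, 5): 10}
scan = [(0.05, 0.05, 0.05), (1.05, 1.05, 1.05)]
static_points = filter_dynamic(scan, free_counts, occ_counts)
```

Because the occupancy statistics are accumulated once over the registered scans, the per-scan filter itself is just a hash lookup per point, which is what allows real-time removal after the map is built.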
- Research Article
93
- 10.1007/s10514-017-9668-3
- Nov 6, 2017
- Autonomous Robots
Most of the existing robotic exploration schemes use occupancy grid representations and geometric targets known as frontiers. The occupancy grid representation relies on the assumption of independence between grid cells and ignores structural correlations present in the environment. We develop a Gaussian process (GP) occupancy mapping technique that is computationally tractable for online map building due to its incremental formulation and provides a continuous model of uncertainty over the map spatial coordinates. The standard way to represent geometric frontiers extracted from occupancy maps is to assign binary values to each grid cell. We extend this notion to novel probabilistic frontier maps computed efficiently using the gradient of the GP occupancy map. We also propose a mutual information-based greedy exploration technique built on that representation that takes into account all possible future observations. A major advantage of high-dimensional map inference is the fact that such techniques require fewer observations, leading to a faster map entropy reduction during exploration for map building scenarios. Evaluations using the publicly available datasets show the effectiveness of the proposed framework for robotic mapping and exploration tasks.
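The probabilistic frontier idea above can be sketched in one dimension: frontier strength is the gradient magnitude of a continuous occupancy field, with finite differences standing in here for the GP gradient:

```python
def frontier_scores(occupancy):
    """Probabilistic frontier score as the magnitude of the spatial
    gradient of a continuous occupancy field, in 1D with central
    finite differences (a sketch of the GP-gradient idea only)."""
    return [abs(occupancy[i + 1] - occupancy[i - 1]) / 2.0
            for i in range(1, len(occupancy) - 1)]
```

The score peaks where known free space (values near 0) transitions toward occupied or uncertain space (values near 1), which is exactly where frontier exploration targets lie; a continuous field yields graded frontier strengths instead of binary frontier cells.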
- Book Chapter
2
- 10.1007/978-3-030-25332-5_7
- Jan 1, 2019
OctoMap is a popular 3D mapping framework which can model the data consistently and keep the 3D models compact with the octree. However, the occupancy map derived by OctoMap can be incorrect when the input point clouds contain noisy measurements. Point cloud filters can reduce the noise, but it is unreasonable to apply filters to a sparse point cloud. In this paper, we present a k-nearest neighbours (k-NN) based inverse sensor model for occupancy mapping. This method represents the occupancy information of one point with the average distance from the point to its k-NN in the point cloud. The average distances derived from all the points and their corresponding k-NN are assumed to be normally distributed. Our inverse sensor model is built on this normal distribution. The proposed approach is able to deal with sparse and noisy point clouds. We implement the model in OctoMap to carry out experiments in a real environment. The experimental results show that the 3D occupancy map generated by our approach is more reliable than that generated by the inverse sensor model in OctoMap.
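A minimal sketch of the k-NN inverse sensor model described above, assuming the average k-NN distances follow a known normal distribution (the calibration values `mu` and `sigma` are illustrative, not the paper's):

```python
import math

def knn_avg_dist(point, cloud, k=3):
    """Average Euclidean distance from `point` to its k nearest
    neighbours in `cloud` (the point itself is excluded)."""
    dists = sorted(math.dist(point, q) for q in cloud if q != point)
    return sum(dists[:k]) / k

def occupancy_prob(avg_dist, mu, sigma):
    """Turn the k-NN average distance into an occupancy probability via
    the normal CDF, following the normality assumption above: a point
    with a tighter-than-typical neighbourhood is more likely a real
    surface than noise. The exact mapping is a sketch only."""
    z = (avg_dist - mu) / sigma
    return 1.0 - 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
```

Because the score depends only on local point density, it degrades gracefully on sparse clouds where a conventional filter would discard too much.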
- Conference Article
6
- 10.15607/rss.2020.xvi.090
- Jul 12, 2020
Creating accurate spatial representations that take into account uncertainty is critical for autonomous robots to safely navigate in unstructured environments. Although recent LIDAR based mapping techniques can produce robust occupancy maps, learning the parameters of such models demands considerable computational time, discouraging them from being used in real-time and large-scale applications such as autonomous driving. Recognizing the fact that real-world structures exhibit similar geometric features across a variety of urban environments, in this paper, we argue that it is redundant to learn all geometry dependent parameters from scratch. Instead, we propose a theoretical framework building upon the theory of optimal transport to adapt model parameters to account for changes in the environment, significantly amortizing the training cost. Further, with the use of high-fidelity driving simulators and real-world datasets, we demonstrate how parameters of 2D and 3D occupancy maps can be automatically adapted to accord with local spatial changes. We validate various domain adaptation paradigms through a series of experiments, ranging from inter-domain feature transfer to simulation-to-real-world feature transfer. Experiments verified the possibility of estimating parameters with a negligible computational and memory cost, enabling large-scale probabilistic mapping in urban environments.
- Research Article
13
- 10.1109/access.2019.2935547
- Jan 1, 2019
- IEEE Access
In this paper, we propose a model-free volumetric Next Best View (NBV) algorithm for accurate 3D reconstruction using a Markov Chain Monte Carlo method for high-mix-low-volume objects in manufacturing. The volumetric-information-gain-based Next Best View algorithm can in real time select the next optimal view that reveals the maximum uncertainty of the scanning environment with respect to a partially reconstructed 3D occupancy map, without any a priori knowledge of the target. Traditional occupancy grid maps make two independence assumptions for computational tractability but suffer from overconfident estimation of the occupancy probability for each voxel, leading to less precise surface reconstructions. This paper proposes a special case of the Markov Chain Monte Carlo (MCMC) method, the Gibbs sampler, to accurately estimate the posterior occupancy probability of a voxel by randomly sampling from its high-dimensional full posterior occupancy probability given the entire volumetric map with respect to a Gaussian forward sensor model. Numerical experiments validate the performance of the MCMC Gibbs sampler algorithm under the ROS-Industrial framework, demonstrating the accuracy of the reconstructed occupancy map and the completeness of the registered point cloud. The proposed MCMC occupancy mapping could be used to optimise the tuning parameters of online NBV algorithms via the inverse sensor model to realise industrial automation.
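A toy 1D version of the Gibbs sampler idea, resampling each voxel from its conditional given its neighbours and its own measurement evidence (the coupling strength and the 1D layout are illustrative assumptions, not the paper's 3D forward model):

```python
import math
import random

def gibbs_occupancy(meas_logodds, coupling=0.8, iters=200, seed=0):
    """Toy 1D Gibbs sampler for voxel occupancy: each voxel is resampled
    from its conditional given its two neighbours (Ising-style spatial
    coupling, relaxing the usual per-voxel independence assumption) and
    its own measurement log-odds; the posterior occupancy probability
    is then estimated by averaging the samples."""
    rng = random.Random(seed)
    n = len(meas_logodds)
    state = [rng.random() < 0.5 for _ in range(n)]
    counts = [0] * n
    for _ in range(iters):
        for i in range(n):
            neighbours = ((state[i - 1] if i > 0 else 0)
                          + (state[i + 1] if i < n - 1 else 0))
            # Both neighbours occupied pushes toward occupied; none, toward free.
            energy = meas_logodds[i] + coupling * (2 * neighbours - 2)
            p_occ = 1.0 / (1.0 + math.exp(-energy))
            state[i] = rng.random() < p_occ
            counts[i] += state[i]
    return [c / iters for c in counts]
```

Averaging over the chain yields a graded posterior per voxel rather than the overconfident point estimate a standard independent-cell update produces.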