Abstract
Due both to the speed and quality of their sensors and to their restrictive on-board computational capabilities, current state-of-the-art (SOA) size, weight, and power (SWaP) constrained autonomous robotic systems are limited in their ability to sample, fuse, and analyze sensory data for state estimation. Aimed at improving SWaP-constrained robotic state estimation, we present Multi-Hypothesis DeepEfference (MHDE), an unsupervised, deep convolutional-deconvolutional sensor fusion network that learns to intelligently combine noisy heterogeneous sensor data to predict several probable hypotheses for the dense, pixel-level correspondence between a source image and an unseen target image. This new multi-hypothesis formulation of our previous architecture, DeepEfference [1], has been augmented to handle dynamic heteroscedastic sensor and motion noise, and it computes hypothesis image mappings and predictions at 150–400 Hz depending on the number of hypotheses generated. MHDE fuses noisy, heterogeneous sensory inputs using two parallel architectural pathways and n (1, 2, 4, or 8 in this work) multi-hypothesis generation subpathways to produce n pixel-level predictions and correspondences between source and target images. We evaluated MHDE on the KITTI Odometry dataset [2] and benchmarked it against DeepEfference [1] and DeepMatching [3] on mean pixel error and runtime. MHDE with 8 hypotheses outperformed DeepEfference in root mean squared pixel error (RMSE) by 103% in the maximum heteroscedastic noise condition and by 18% in the noise-free condition. MHDE with 8 hypotheses was over 5,000% faster than DeepMatching with only a 3% increase in RMSE.
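The architectural pattern the abstract describes (two fused input pathways feeding n parallel deconvolutional subpathways, one per hypothesis) can be made concrete with a minimal sketch. The PyTorch module below is an illustrative assumption, not the authors' MHDE implementation: the layer sizes, the 6-D motion input, and the concatenation-based fusion are hypothetical choices used only to show the multi-hypothesis structure.

```python
# Minimal sketch of a two-pathway, multi-hypothesis network, loosely
# following the abstract's description. All specifics (layer widths,
# 6-D motion input, concatenation fusion) are assumptions for
# illustration and do not reproduce the published MHDE architecture.
import torch
import torch.nn as nn


class MultiHypothesisSketch(nn.Module):
    def __init__(self, n_hypotheses: int = 8):
        super().__init__()
        # Pathway 1: convolutional encoder for the grayscale source image.
        self.image_encoder = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=4, stride=2, padding=1),   # H/2
            nn.ReLU(inplace=True),
            nn.Conv2d(16, 32, kernel_size=4, stride=2, padding=1),  # H/4
            nn.ReLU(inplace=True),
        )
        # Pathway 2: encoder for low-dimensional, noisy heterogeneous
        # data (e.g. motion/odometry); a 6-D input is assumed here.
        self.motion_encoder = nn.Sequential(
            nn.Linear(6, 32),
            nn.ReLU(inplace=True),
        )
        # n parallel deconvolutional subpathways, each emitting one dense
        # 2-channel (dx, dy) pixel-correspondence hypothesis.
        self.hypothesis_heads = nn.ModuleList(
            [
                nn.Sequential(
                    nn.ConvTranspose2d(64, 16, kernel_size=4, stride=2, padding=1),
                    nn.ReLU(inplace=True),
                    nn.ConvTranspose2d(16, 2, kernel_size=4, stride=2, padding=1),
                )
                for _ in range(n_hypotheses)
            ]
        )

    def forward(self, image: torch.Tensor, motion: torch.Tensor):
        feat = self.image_encoder(image)   # (B, 32, H/4, W/4)
        m = self.motion_encoder(motion)    # (B, 32)
        # Broadcast the motion code over the spatial grid and fuse by
        # channel concatenation (one simple fusion choice among many).
        m = m[:, :, None, None].expand(-1, -1, feat.size(2), feat.size(3))
        fused = torch.cat([feat, m], dim=1)  # (B, 64, H/4, W/4)
        # Each subpathway produces one dense correspondence hypothesis.
        return [head(fused) for head in self.hypothesis_heads]


if __name__ == "__main__":
    net = MultiHypothesisSketch(n_hypotheses=8)
    flows = net(torch.randn(1, 1, 64, 64), torch.randn(1, 6))
    print(len(flows), flows[0].shape)  # 8 hypotheses, each (1, 2, 64, 64)
```

Multi-hypothesis networks of this general shape are commonly trained with a winner-take-all style objective that penalizes only the best hypothesis per sample, which encourages the subpathways to diversify; the sketch above leaves the training loss unspecified.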