Visual Robot Relocalization Based on Multi-Task CNN and Image-Similarity Strategy.

Tao Xie, Ke Wang, Xinyue Tang, Ruifeng Li

doi:10.3390/s20236943

Open Access

https://doi.org/10.3390/s20236943

Abstract

Traditional CNNs for 6D robot relocalization output a pose estimate but give no indication of whether the model is making a sensible prediction or guessing at random. We found that convnet representations trained on classification problems generalize well to other tasks. We therefore propose a multi-task CNN for robot relocalization that simultaneously performs pose regression and scene recognition. Scene recognition determines whether the input image belongs to the scene in which the robot is currently located, which both reduces the relocalization error and indicates how much confidence to place in the prediction. We also found that pose precision drops when testing images differ substantially in appearance from training images. To address this, we present the dual-level image-similarity strategy (DLISS), which consists of two levels: an initial level and an iteration level. The initial level performs feature vector clustering on the training set and feature vector acquisition on testing images. The iteration level, namely the PSO-based image-block selection algorithm, builds on the initial level to select the testing images most similar to the training images, which yields higher pose accuracy on the testing set. Our method considers both the accuracy and the robustness of relocalization, and it can operate indoors and outdoors in real time, taking at most 27 ms per frame to compute. We evaluated our method on the Microsoft 7Scenes dataset and the Cambridge Landmarks dataset. It obtains approximately 0.33 m and 7.51° accuracy on the 7Scenes dataset, and approximately 1.44 m and 4.83° accuracy on the Cambridge Landmarks dataset. Compared with PoseNet, our CNN reduced the average positional error by 25% and the average angular error by 27.79% on the 7Scenes dataset, and reduced the average positional error by 40% and the average angular error by 28.55% on the Cambridge Landmarks dataset.
We show that our multi-task CNN can localize from high-level features and is robust to images that are not in the current scene. Furthermore, we show that our multi-task CNN achieves higher relocalization accuracy when it uses testing images obtained by DLISS.
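
As an illustration of the multi-task idea described above, the following is a minimal sketch of a joint objective for pose regression and scene recognition. The PoseNet-style pose term, the cross-entropy scene term, the weights `beta` and `lam`, and all function names are assumptions made for this example; the abstract does not specify the loss the authors used.

```python
import math


def pose_loss(t_pred, t_true, q_pred, q_true, beta=500.0):
    """Position error plus a beta-weighted quaternion error (assumed form)."""
    def unit(q):
        # Normalise so that only the quaternion direction contributes.
        norm = math.sqrt(sum(c * c for c in q))
        return [c / norm for c in q]

    position_error = math.dist(t_pred, t_true)
    orientation_error = math.dist(unit(q_pred), unit(q_true))
    return position_error + beta * orientation_error


def scene_loss(logits, label):
    """Cross-entropy of the scene classifier, using a stable log-sum-exp."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[label]


def scene_confidence(logits):
    """Return (predicted scene index, softmax probability of that scene)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    best = max(range(len(logits)), key=logits.__getitem__)
    return best, exps[best] / sum(exps)


def multi_task_loss(t_pred, t_true, q_pred, q_true, logits, label,
                    beta=500.0, lam=1.0):
    """Weighted sum of the pose term and the scene-recognition term."""
    return (pose_loss(t_pred, t_true, q_pred, q_true, beta)
            + lam * scene_loss(logits, label))
```

In a sketch like this, the probability returned by `scene_confidence` could serve as the signal for whether a pose prediction should be trusted, for example by discarding poses for images whose most likely scene is not the current one.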

Highlights

The problem of robot relocalization [1] refers to inferring the translation and orientation of a robot from the visual scene representation given only a single image.
The results of “Our methods without scene recognition” were obtained by feeding the testing images selected by the dual-level image-similarity strategy (DLISS) to the trained 6D relocalization network without scene recognition.
The results of “Our methods without using DLISS” were obtained by feeding testing images produced by center cropping and zooming the raw testing images to the trained 6D relocalization network with scene recognition.
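
The "center cropping and zooming" baseline mentioned above can be sketched as follows. This is only an illustration on images stored as nested lists; the crop size, the nearest-neighbour interpolation, and the order of the two steps are assumptions, since the highlights do not state them.

```python
def center_crop(image, crop_h, crop_w):
    """Return the central crop_h x crop_w block of a row-major image."""
    h, w = len(image), len(image[0])
    if crop_h > h or crop_w > w:
        raise ValueError("crop is larger than the image")
    top = (h - crop_h) // 2
    left = (w - crop_w) // 2
    return [row[left:left + crop_w] for row in image[top:top + crop_h]]


def resize_nearest(image, out_h, out_w):
    """Zoom an image to out_h x out_w with nearest-neighbour sampling."""
    h, w = len(image), len(image[0])
    return [[image[r * h // out_h][c * w // out_w] for c in range(out_w)]
            for r in range(out_h)]
```

Unlike DLISS, this baseline always takes the same region of the image, regardless of how similar that region is to the training data.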

Summary

Introduction

The problem of robot relocalization [1] refers to inferring the translation and orientation of a robot from the visual scene representation given only a single image. Most learning-based algorithms adopt a similar CNN structure: they extract features with a model pretrained on large-scale image-classification data and then regress the pose. The authors in [23] proposed Bayesian PoseNet. In this paper, we present an end-to-end multi-task CNN for 6-DOF pose estimation and scene recognition that uses only RGB images. Besides the multi-task CNN, our other contribution to relocalization accuracy is a block selection algorithm for a new input image, which uses particle swarm optimization to find the block most similar to training images in the training set.
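
The block selection idea can be illustrated with a minimal particle swarm optimization (PSO) sketch that searches for the top-left corner of the best-scoring block. The callable `score(x, y)` is a hypothetical stand-in for the similarity between a candidate block and the training images, and the swarm size, iteration count, and coefficients `w`, `c1`, `c2` are assumed values, not the settings used in the paper.

```python
import random


def pso_select_block(score, x_max, y_max, n_particles=20, n_iters=40,
                     w=0.7, c1=1.5, c2=1.5, seed=0):
    """Search integer block positions in [0, x_max] x [0, y_max] for the
    position that maximises score(x, y)."""
    rng = random.Random(seed)
    bounds = (float(x_max), float(y_max))

    def evaluate(p):
        # Particles move in continuous space; blocks sit on the pixel grid.
        return score(int(round(p[0])), int(round(p[1])))

    pos = [[rng.uniform(0.0, b) for b in bounds] for _ in range(n_particles)]
    vel = [[0.0, 0.0] for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]            # personal best positions
    best_val = [evaluate(p) for p in pos]     # personal best scores
    g = max(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]  # global best so far

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(2):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                # Keep the block inside the image.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], 0.0), bounds[d])
            val = evaluate(pos[i])
            if val > best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val > g_val:
                    g_val, g_pos = val, pos[i][:]

    return (int(round(g_pos[0])), int(round(g_pos[1]))), g_val
```

A call such as `pso_select_block(my_score, image_w - block_w, image_h - block_h)` would return the selected corner and its score, where `my_score` is whatever similarity function is available. PSO evaluates far fewer positions than an exhaustive sliding-window search, which matters when each evaluation requires a feature comparison.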

The Specific Methods and Measures

Backbone Network

Multi-Task Learning for Pose Regression and Scene Recognition

Dual-Level Image-Similarity Strategy

Experiments

Training Details for 6D Relocalization Network

Initialization of PSO-Based Image-Block Selection

Feature Representation in Pose Regression and Scene Recognition

Feature Vector Clustering

Methods

The Robustness of the PSO-Based Image-Block Selection Algorithm

The Reliability of the Dual-Level Image-Similarity Strategy

Experimental Results and Discussion

Efficiency of Our Network

Conclusions

Journal: Sensors (Basel, Switzerland)
Publication Date: Dec 4, 2020
Citations: 5
License type: CC BY 4.0
