Abstract

This paper presents a visual navigation method that determines the position and orientation of a ground robot using a diffusion map of robot images (obtained from a camera in an elevated position, e.g., on a tower or drone), and investigates the stability of the robot with respect to desired paths under control with time delay. The time delay arises from the image processing required for visual navigation. We consider the diffusion map as a possible alternative to the currently popular deep learning and compare the capabilities of the two methods for visual navigation of ground robots. The diffusion map projects an image (described by a point in a multidimensional space) onto a low-dimensional manifold while preserving the mutual relationships between the data. We find the ground robot’s position and orientation as a function of the coordinates of the robot image on the low-dimensional manifold obtained from the diffusion map. We compare these coordinates with the coordinates obtained from deep learning. The proposed algorithm has higher accuracy and is not sensitive to changes in lighting, the appearance of external moving objects, and other phenomena. However, the diffusion map requires more computation time than deep learning. We consider possible future steps for reducing this computation time.
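The projection described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the function name `first_diffusion_coordinate`, the kernel width `eps`, and the synthetic "images" (points on a helix standing in for flattened robot images taken along a one-parameter path) are all assumptions made for illustration. The sketch builds a Gaussian affinity matrix, normalises it, and uses power iteration to find the first non-trivial eigenvector. In practice a library eigen-solver such as `numpy.linalg.eigh` would replace the power iteration.

```python
import math

def first_diffusion_coordinate(points, eps, iters=500):
    """Return (eigenvalue, coordinates) of the first non-trivial diffusion coordinate."""
    n = len(points)
    # Gaussian affinity between every pair of data points.
    K = [[math.exp(-sum((a - b) ** 2 for a, b in zip(p, q)) / eps)
          for q in points] for p in points]
    d = [sum(row) for row in K]  # row sums (degrees)
    # Symmetric normalisation M = D^-1/2 K D^-1/2; it has the same
    # eigenvalues as the Markov matrix D^-1 K.
    M = [[K[i][j] / math.sqrt(d[i] * d[j]) for j in range(n)] for i in range(n)]
    # The trivial top eigenvector of M (eigenvalue 1) is proportional to sqrt(d).
    total = math.sqrt(sum(d))
    v0 = [math.sqrt(di) / total for di in d]
    # Power iteration with the trivial eigenvector projected out at every
    # step, so the iterate converges to the second eigenvector.
    v = [i - (n - 1) / 2 for i in range(n)]  # deterministic, non-constant start
    for _ in range(iters):
        proj = sum(a * b for a, b in zip(v, v0))
        v = [a - proj * b for a, b in zip(v, v0)]
        w = [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient gives the eigenvalue estimate (v has unit length).
    lam = sum(v[i] * sum(M[i][j] * v[j] for j in range(n)) for i in range(n))
    # Convert to an eigenvector of the Markov matrix and scale by the eigenvalue.
    return lam, [lam * v[i] / math.sqrt(d[i]) for i in range(n)]

# Synthetic stand-in for images: 20 points along a helix in 3-D,
# parameterised by a single hidden variable t (e.g. position along a path).
ts = [3.0 * i / 19 for i in range(20)]
points = [(math.cos(t), math.sin(t), t) for t in ts]
lam, coords = first_diffusion_coordinate(points, eps=0.5)
print(round(lam, 3), [round(c, 3) for c in coords])
```

For this synthetic data the single diffusion coordinate orders the points consistently with the hidden parameter `t`, which is the property that makes it usable as a low-dimensional description of the robot's pose. The overall sign of the coordinate is arbitrary.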

Highlights

  • Deep learning [1,2,3,4,5,6,7] is a very popular and powerful instrument for solving complex problems of classification and function regression. The main advantage of this method is that we need not develop complex features describing the group of investigated objects.

  • We find the ground robot coordinates as a function of coordinates of the robot image on the low-dimensional manifold obtained from the diffusion map

  • We offer an important practical example, demonstrating that the diffusion map can compete with deep learning
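The second highlight treats the robot's pose as a function of the image's coordinates on the manifold. The source does not say how that function is constructed, so the following is only a sketch under an assumed approach: inverse-distance-weighted interpolation over the nearest labelled training images. The name `estimate_pose`, the parameter `k`, and the toy training data are hypothetical.

```python
import math

def estimate_pose(psi, train_psi, train_pose, k=3):
    """Interpolate (x, y, theta) from the k nearest training points on the manifold."""
    # Indices of the k training points closest to the query coordinates.
    nearest = sorted(range(len(train_psi)),
                     key=lambda i: math.dist(psi, train_psi[i]))[:k]
    # Inverse-distance weights; the small constant avoids division by zero.
    weights = [1.0 / (math.dist(psi, train_psi[i]) + 1e-12) for i in nearest]
    total = sum(weights)
    x = sum(w * train_pose[i][0] for w, i in zip(weights, nearest)) / total
    y = sum(w * train_pose[i][1] for w, i in zip(weights, nearest)) / total
    # Average the heading on the unit circle so angles near +/-pi do not cancel.
    s = sum(w * math.sin(train_pose[i][2]) for w, i in zip(weights, nearest))
    c = sum(w * math.cos(train_pose[i][2]) for w, i in zip(weights, nearest))
    return x, y, math.atan2(s, c)

# Toy training set: one manifold coordinate per image, with a known pose each.
train_psi = [(0.0,), (1.0,), (2.0,)]
train_pose = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.2), (2.0, 0.0, 0.4)]
print(estimate_pose((0.5,), train_psi, train_pose, k=2))
```

Averaging the orientation through its sine and cosine, rather than averaging the raw angle, is a design choice that keeps the estimate well behaved when neighbouring headings straddle the ±π wrap-around.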


Summary

Introduction

Deep learning (based on artificial neural networks) [1,2,3,4,5,6,7] is a very popular and powerful instrument for solving complex problems of classification and function regression. The main advantage of this method is that we need not develop complex features describing the group of investigated objects. “PoseNet [1] is based on the GoogLeNet architecture. It processes RGB-images and is modified so that all three softmax and fully connected layers are removed from the original model and replaced by regressors in the training phase. In the testing phase the other two regressors of the lower layers are removed and the prediction is done solely based on the regressor on the top of the whole network.

