Abstract

We consider the problem of path planning in an initially unknown environment, where a robot has no a priori map of its environment but has access to prior information that it accumulated while navigating similar but not identical environments. To address this navigation problem, we propose a novel, machine learning-based algorithm called Semi-Markov Decision Process with Unawareness and Transfer (SMDPU-T). In SMDPU-T, a robot records its actions around obstacles as action sequences called options. It then reuses these options within a framework called Markov Decision Process with Unawareness (MDPU) to learn suitable, collision-free maneuvers around more complex obstacles in the future. We analytically derive bounds on the cost of the option selected by SMDPU-T and the worst-case time complexity of our algorithm. Our experimental results on simulated robots in the Webots simulator show that SMDPU-T takes $$24\%$$ of the planning time and $$39\%$$ of the total time to solve the same navigation tasks, while our hardware results on a Turtlebot robot indicate that SMDPU-T takes, on average, $$53\%$$ of the planning time and $$60\%$$ of the total time, as compared to a recent, sampling-based path planner.
