Abstract

In practical welding scenarios, an effective path planner is needed to find a collision-free path in configuration space for a welding manipulator surrounded by obstacles. However, the sampling-based planner, a state-of-the-art method, is only probabilistically complete, and its computational complexity is sensitive to the state dimension. In this paper, we propose a path planner for welding manipulators based on deep reinforcement learning, which solves path planning problems in high-dimensional continuous state and action spaces. Compared with the sampling-based method, it is more robust and less sensitive to the state dimension. Specifically, to improve learning efficiency, we introduce an inverse kinematics module that provides prior knowledge and design a gain module that avoids locally optimal policies; both are integrated into the training algorithm. To evaluate the proposed planning algorithm along multiple dimensions, we conducted several sets of path planning experiments for welding manipulators. The results show that our method not only improves convergence performance but is also superior in optimality and robustness of planning compared with most other planning algorithms.

Highlights

  • Welding tasks exist in various industrial manufacturing processes

  • We give an overview of the background theories for the proposed Deep Reinforcement Learning (Deep-RL)-based collision-free path planner, including the kinematics modeling of the welding manipulator used in this research, the Sequential Decision-Making model, and the model-free Deep-RL algorithm Deep Deterministic Policy Gradient (DDPG)

  • The objective of the Deep-RL algorithm is to maximize the cumulative reward in one episode with finite time steps, so it is necessary to analyze the trend of the reward curve (the learning curve) against the number of training episodes, which reflects both whether the target deterministic policy model has converged and the learning efficiency


Summary

Introduction

Welding tasks exist in various industrial manufacturing processes. Shipbuilding, known as a labor-intensive industry, requires a considerable number of skilled technicians to weld in enclosed and hazardous surroundings. The sampling-based method is one of the most popular path planning methods owing to its probabilistic completeness. We give an overview of the background theories for the proposed Deep-RL-based collision-free path planner, including the kinematics modeling of the welding manipulator used in this research, the Sequential Decision-Making model, and the model-free Deep-RL algorithm DDPG. Most of the motion specified by the task is defined in Cartesian space, so it is inevitable to map the end-effector’s [...]. Given the end-effector’s Cartesian velocity $\dot{x}_e$ relative to itself, the corresponding velocity of the joint angles $\dot{q}$ is as follows: $\dot{q} = J(q)^{\dagger}\,\dot{x}_e$ (4), where $J(q)^{\dagger}$ denotes the pseudo-inverse of the Jacobian $J(q)$.
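The pseudo-inverse mapping in Eq. (4) can be illustrated with a minimal sketch. It is not the paper's implementation: the planar three-link arm, its link lengths, the helper names `planar_jacobian` and `joint_velocity`, and the use of a base-frame position Jacobian (rather than one expressed in the end-effector frame) are all assumptions chosen to keep the example short.

```python
import numpy as np


def planar_jacobian(q, link_lengths):
    """Position Jacobian (2 x n) of a planar serial arm.

    Hypothetical stand-in for J(q); a real welding manipulator
    would use its own kinematic model.
    """
    q = np.asarray(q, dtype=float)
    lengths = np.asarray(link_lengths, dtype=float)
    absolute = np.cumsum(q)  # absolute orientation of each link
    n = q.size
    J = np.zeros((2, n))
    for i in range(n):
        # Joint i rotates every link from i onward.
        J[0, i] = -np.sum(lengths[i:] * np.sin(absolute[i:]))
        J[1, i] = np.sum(lengths[i:] * np.cos(absolute[i:]))
    return J


def joint_velocity(J, x_dot_e):
    """Eq. (4): q_dot = J(q)^dagger x_dot_e (Moore-Penrose pseudo-inverse)."""
    return np.linalg.pinv(J) @ np.asarray(x_dot_e, dtype=float)


# Assumed configuration and desired end-effector velocity.
q = np.array([0.3, 0.5, -0.2])          # joint angles [rad]
J = planar_jacobian(q, [1.0, 0.8, 0.5])  # link lengths [m], assumed
x_dot_e = np.array([0.1, -0.05])         # Cartesian velocity [m/s]
q_dot = joint_velocity(J, x_dot_e)
```

For a redundant arm such as this one (three joints, two Cartesian coordinates), the pseudo-inverse returns the minimum-norm joint velocity that reproduces the requested end-effector velocity, provided the Jacobian has full row rank at the current configuration.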
