Abstract

The assembly industry is shifting toward customizable products and small-batch assembly. This demands frequent reprogramming, which is expensive because it requires a specialized engineer. It would be an improvement if untrained workers could help a cobot learn an assembly sequence by giving advice. Learning an assembly sequence is a hard task for a cobot, because the solution space grows drastically as the complexity of the task increases. This work introduces a novel method in which human knowledge is used to reduce this solution space and, as a result, increase the learning speed. The proposed method, IRL-PBRS, uses Interactive Reinforcement Learning (IRL) to learn from human advice interactively, and uses Potential Based Reward Shaping (PBRS), in a simulated environment, to focus learning on a smaller part of the solution space. The method was compared in simulation against two other feedback strategies. The results show that IRL-PBRS converges more quickly to a valid assembly sequence policy and does so with the fewest human interactions. Finally, a use case is presented in which participants were asked to program an assembly task. Here, the results show that IRL-PBRS learns quickly enough to keep up with advice given by a user and is able to adapt online to a changing knowledge base.

Highlights

  • In recent years, the prices of industrial cobots have dropped significantly

  • The results show that Interactive Reinforcement Learning (IRL) with Potential Based Reward Shaping (PBRS) converges more quickly to a valid assembly sequence policy and does so with the fewest human interactions

  • The results show that IRL-FB is not an option because it does not converge to a valid policy within a reasonable amount of time


Introduction

The prices of industrial cobots (and robots) have dropped significantly. Replacing the specialized engineer with untrained workers (without programming skills) would reduce the cost of production, but it must become possible for these workers to program the cobots. In such a setting, where a cobot collaborates with a human [3,4], it would be useful if the human could optimize the behavior of their cobot partner. Interactive Reinforcement Learning (IRL) is used, where the reward signal of the RL agent depends on the environment and the state, as well as on the advice from a human. IRL results in optimal productivity because human and cobot work together, while the autonomy of the cobot increases over time and the workload of the human decreases.
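The core mechanism can be sketched in a few lines. The snippet below is a minimal illustration of potential-based reward shaping driven by human advice, assuming a tabular setting with named partial-assembly states; the state names and advice format are hypothetical, not the paper's actual implementation.

```python
GAMMA = 0.9  # discount factor (assumed value)

# Human advice defines a potential over partial-assembly states:
# states the worker approves of receive a higher potential, so the
# shaping term steers exploration toward that part of the solution space.
advice_potential = {
    "empty": 0.0,
    "base": 1.0,
    "base+frame": 2.0,
    "base+frame+cover": 3.0,
}

def phi(state):
    """Potential of a state; states without advice default to 0."""
    return advice_potential.get(state, 0.0)

def shaped_reward(r, state, next_state):
    """PBRS: r' = r + gamma * phi(s') - phi(s).

    Because the shaping term is a difference of potentials, the optimal
    policy of the original task is preserved, and advice can be updated
    online simply by editing `advice_potential`.
    """
    return r + GAMMA * phi(next_state) - phi(state)

# Moving toward an advised state earns a shaping bonus;
# straying to an unadvised state is penalized.
print(shaped_reward(0.0, "base", "base+frame"))  # 0.9 * 2.0 - 1.0 = 0.8
print(shaped_reward(0.0, "base", "unadvised"))   # 0.0 - 1.0 = -1.0
```

Because the environment reward is left untouched, the cobot still converges to a valid assembly sequence even when the advice is incomplete; the advice only concentrates exploration.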

