In this letter, we consider vision-based control tasks in which the desired goals are given as target images. Such problems are often addressed by an autonomous agent that optimizes a trajectory to minimize a manually designed cost function. However, it is challenging to design a suitable cost function for each goal by hand, especially when the current and goal states of the system are described only by visual observations. To tackle this issue, we propose a method called dynamics-aware metric embedding (DAME), which generates cost functions in a self-supervised manner to help the agent plan controls that accomplish the desired goals. The proposed method learns a metric function that reflects how easy it is to find a path connecting two states, considering the dynamics of the system. To learn the metric between states, we utilize a measure named probabilistic reachability, which is computed using the probability of reaching one state from another via a random walk. We evaluate the proposed method on several vision-based control tasks, including various simulation benchmarks and real-world table-top manipulation tasks, and observe that DAME outperforms baseline algorithms by over 30% in terms of success rate.
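
As a rough illustration of the random-walk idea behind probabilistic reachability, the sketch below computes the probability that a uniform random walk on a small directed graph hits a goal state within a fixed number of steps. The abstract does not give the exact definition used by DAME, so the function name `reach_probability`, the finite `horizon`, the uniform choice over successors, and the toy graph are all assumptions made for illustration only.

```python
from typing import Dict, List


def reach_probability(
    graph: Dict[int, List[int]], start: int, goal: int, horizon: int
) -> float:
    """Probability that a uniform random walk from `start` visits `goal`
    within `horizon` steps (hypothetical stand-in for reachability)."""
    if start == goal:
        return 1.0
    # alive[s]: probability of being at s without having visited the goal yet.
    alive: Dict[int, float] = {start: 1.0}
    reached = 0.0
    for _ in range(horizon):
        nxt: Dict[int, float] = {}
        for state, prob in alive.items():
            successors = graph[state]
            if not successors:
                # No outgoing transitions: this mass can never reach the goal.
                continue
            share = prob / len(successors)
            for succ in successors:
                if succ == goal:
                    reached += share  # absorbed at the goal
                else:
                    nxt[succ] = nxt.get(succ, 0.0) + share
        alive = nxt
    return reached


# Toy dynamics (assumed): 0 -> 1, 1 -> {0, 2}, and state 2 has no way back.
toy_graph = {0: [1], 1: [0, 2], 2: []}

forward = reach_probability(toy_graph, start=0, goal=2, horizon=4)
backward = reach_probability(toy_graph, start=2, goal=0, horizon=4)
print(forward, backward)
```

In this toy graph, state 2 is reachable from state 0 but not the other way around, so the two values differ. A quantity of this kind depends on the transition structure rather than on how similar two observations look, which is the sense in which a learned metric could be called dynamics-aware.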