Achieving human-level dexterity in robotics remains a critical open problem. Even simple dexterous manipulation tasks pose significant difficulties due to the high number of degrees of freedom and the need for cooperation among heterogeneous agents (e.g., finger joints). While some researchers have used reinforcement learning (RL) to control a single hand manipulating objects, tasks that require coordinated bimanual cooperation remain under-explored, partly because of the scarcity of suitable environments, which leads to difficult training and sub-optimal performance. To address these challenges, we introduce Bi-DexHands, a simulator with two dexterous hands featuring 20 bimanual manipulation tasks and thousands of target objects, designed to match various levels of human motor skill based on cognitive-science research. We developed Bi-DexHands in Isaac Gym, enabling highly efficient RL training at over 30,000 frames per second on a single NVIDIA RTX 3090. Based on Bi-DexHands, we present a comprehensive evaluation of popular RL algorithms in different settings, including single-agent/multi-agent RL, offline RL, multi-task RL, and meta-RL. Our findings show that on-policy algorithms such as PPO can master simple manipulation tasks comparable to those of 48-month-old babies, such as catching a flying object or opening a bottle. Furthermore, multi-agent RL can improve performance on manipulations that require skilled bimanual cooperation, such as lifting a pot or stacking blocks. Despite succeeding on individual tasks, current RL algorithms struggle to learn multiple manipulation skills in most multi-task and few-shot learning scenarios, highlighting the need for further research and development within the RL community.
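For illustration only, the sketch below shows the vectorized-simulation pattern behind throughput figures like 30,000+ frames per second: thousands of environments are stepped in a single batched call, so the frame count scales with the number of parallel environments. The `DummyVecEnv` class, its observation/action dimensions, and the random policy are placeholder assumptions for this sketch, not the Bi-DexHands or Isaac Gym API.

```python
import time
import numpy as np


class DummyVecEnv:
    """Placeholder stand-in for a batched simulator (not the Bi-DexHands API)."""

    def __init__(self, num_envs: int, obs_dim: int = 64, act_dim: int = 24):
        self.num_envs, self.obs_dim, self.act_dim = num_envs, obs_dim, act_dim

    def reset(self) -> np.ndarray:
        # One observation row per parallel environment.
        return np.zeros((self.num_envs, self.obs_dim), dtype=np.float32)

    def step(self, actions: np.ndarray):
        # A real simulator advances all environments in one batched (GPU) call;
        # here we just return dummy transitions of the right shapes.
        obs = np.random.randn(self.num_envs, self.obs_dim).astype(np.float32)
        rewards = np.zeros(self.num_envs, dtype=np.float32)
        dones = np.zeros(self.num_envs, dtype=bool)
        return obs, rewards, dones, {}


if __name__ == "__main__":
    env = DummyVecEnv(num_envs=2048)  # thousands of environments in parallel
    obs = env.reset()
    steps = 100
    start = time.time()
    for _ in range(steps):
        # Random placeholder policy; an RL agent would map obs to actions here.
        actions = np.random.uniform(-1.0, 1.0, (env.num_envs, env.act_dim))
        obs, rewards, dones, info = env.step(actions)
    fps = steps * env.num_envs / (time.time() - start)
    print(f"~{fps:,.0f} environment frames per second (dummy simulator)")
```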