Human-robot collaborative disassembly (HRCD) has gained much interest for disassembling end-of-life products, as it combines the robot's high efficiency in repetitive work with the human's flexibility and higher cognition. Explicit human-object perception is important for adaptive robot decision-making but remains little reported in the literature, especially for close-proximity co-work with partial occlusion. To bridge this gap, this study proposes a vision-based 3D dense hand-object pose estimation approach for HRCD. First, a mask-guided attentive module is proposed to better attend to the hand and object areas, respectively. Meanwhile, the occluded area in the input image is explicitly considered to mitigate the performance degradation caused by visual occlusion, which is inevitable during HRCD hand-object interactions. In addition, a 3D hand-object pose dataset is collected for a lithium-ion battery disassembly scenario in a lab environment, and comparative experiments are carried out to demonstrate the effectiveness of the proposed method.

Note to Practitioners: This work aims to overcome the challenge of joint hand-object pose estimation in a human-robot collaborative disassembly scenario; the approach can also be applied to many other close-range human-robot/machine collaboration cases of practical value. The ability to accurately perceive the pose of the human hand and workpiece under partial occlusion is crucial for a collaborative robot to successfully carry out co-manipulation with human operators. This paper proposes an approach that jointly estimates the 3D pose of the hand and object in an integrated model. An explicit prediction of the occlusion area is then introduced as a regularization term during model training, which makes the model more robust to partial occlusion between the hand and object.
The comparative experiments suggest that the proposed approach outperforms many existing hand-object pose estimation methods. Nevertheless, its dependency on manually labeled training data can limit its application. In the future, we will consider semi-supervised or unsupervised training to address this issue and achieve faster adaptation to different industrial scenarios.
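
As a rough illustration of how an occlusion prediction can act as a regularization term during training, the sketch below adds a weighted occlusion-mask loss to a joint hand-object pose loss. It is not the authors' formulation: the mean-squared-error pose term, the binary cross-entropy occlusion term, the weight `occ_weight`, and all function names are assumptions chosen for illustration only.

```python
import math


def mean_squared_error(pred, target):
    """Mean squared error between two equal-length flat lists of coordinates."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def binary_cross_entropy(probs, labels, eps=1e-7):
    """Mean per-pixel binary cross-entropy for a flattened occlusion mask."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(probs)


def total_loss(hand_pred, hand_gt, obj_pred, obj_gt,
               occ_pred, occ_gt, occ_weight=0.1):
    """Joint pose loss plus a weighted occlusion-mask term (hypothetical form).

    hand_* / obj_*: flattened 3D coordinates for the hand and the object.
    occ_pred: predicted per-pixel occlusion probabilities in [0, 1].
    occ_gt:   ground-truth occlusion labels (0 or 1).
    occ_weight: assumed regularization weight, not a value from the paper.
    """
    pose_term = (mean_squared_error(hand_pred, hand_gt)
                 + mean_squared_error(obj_pred, obj_gt))
    occ_term = binary_cross_entropy(occ_pred, occ_gt)
    return pose_term + occ_weight * occ_term
```

Under these assumptions, a model that predicts the pose equally well but localizes the occluded region more accurately receives a lower total loss, which is the sense in which the occlusion term regularizes training.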