Abstract

Human-robot collaboration (HRC) depends on detecting objects in the environment so that critical tasks can be performed safely. This is usually done by running object detection on the robot's raw sensor data. However, this approach cannot capture semantic and contextual information from the environment. Scene understanding overcomes this limitation by combining different perception tasks to gather semantic knowledge from a given scene. Robots can also exploit this semantic knowledge during safety analysis, since it provides richer information about the environment, commonly in the form of scene graphs. We present a comparison of two AI-based scene understanding methods for risk management in HRC. The first implementation is a contextual-semantics-based scene graph generator integrated with the state-of-the-art Mask R-CNN object detector. The second implementation adopts an end-to-end multi-level scene description neural network (MSDN) to predict scene graphs and region captions. The object detection methods employed in both solutions were evaluated by measuring mean average precision. Overall scene understanding performance was assessed using the average inference time and the top-50 and top-100 recall on predicted scene graph relationships. In the experiments, MSDN consistently outperformed the contextual-semantics-based method in top-50 and top-100 recall by 20-25% on the scene graph generation task. However, the contextual-semantics-based implementation has a lower scene graph generation inference time and is 2.4 times faster than the end-to-end neural-network-based implementation. All experiments were performed in a simulated warehouse scenario where autonomous mobile robots and humans interact closely.
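To make the top-K recall metric concrete, the following is a minimal sketch of how recall over predicted scene graph relationships could be computed. It is not the evaluation code used in this work. It assumes relationships are (subject, predicate, object) label triplets with a confidence score, and it matches triplets by exact label equality, without the bounding-box overlap check that scene graph benchmarks usually apply as well. The function name `recall_at_k` and the example labels are hypothetical.

```python
from typing import List, Tuple

# A relationship is a (subject, predicate, object) label triplet.
Triplet = Tuple[str, str, str]


def recall_at_k(
    predicted: List[Tuple[Triplet, float]],
    ground_truth: List[Triplet],
    k: int,
) -> float:
    """Fraction of ground-truth triplets found among the k highest-scoring predictions.

    Simplifying assumption: a prediction matches a ground-truth triplet when
    all three labels are equal; box localization is ignored.
    """
    unique_truth = set(ground_truth)
    if not unique_truth:
        return 0.0
    # Rank predictions by confidence, highest first, and keep the top k.
    ranked = sorted(predicted, key=lambda item: item[1], reverse=True)
    top_k = {triplet for triplet, _score in ranked[:k]}
    hits = sum(1 for triplet in unique_truth if triplet in top_k)
    return hits / len(unique_truth)


# Hypothetical warehouse-style example (labels are illustrative only).
predictions = [
    (("human", "near", "robot"), 0.9),
    (("robot", "carrying", "box"), 0.8),
    (("box", "on", "shelf"), 0.3),
    (("human", "holding", "box"), 0.2),
]
ground_truth = [("human", "near", "robot"), ("box", "on", "shelf")]

print(recall_at_k(predictions, ground_truth, k=2))  # 0.5
print(recall_at_k(predictions, ground_truth, k=4))  # 1.0
```

With K set to 50 or 100, the same function gives the top-50 and top-100 recall for a single image; averaging over all test images would give a dataset-level figure.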
