Abstract

Remotely operating a rescue robot places a burden on the operator, who must continuously watch camera images to locate target objects. We believe this burden can be reduced by combining a Head Mounted Display (HMD) with deep-learning-based object recognition. In the first half of this study, we examine how the method of presenting recognition results from You Only Look Once (YOLO), a deep learning detection algorithm, affects an operator wearing an HMD. In the experiment, three presentation methods were compared: no display of recognition results, display of results for only one object class, and display of results for all 80 object classes. Under each presentation method, we measured the time the operator needed to control the robot and complete a given task, and we administered a questionnaire after each experiment. The questionnaire results showed that presenting the recognition results for only one object class was useful. In the second half of this study, we develop a system that presents 3D images augmented with YOLO detections to further ease the burden of object search, and we numerically verify that this system represents depth. In the experiment, two display methods were compared: 2D images with YOLO Bounding Boxes (BBs) and 3D images with YOLO BBs. Under each display method, the operator controlled the robot, and we recorded the number of objects found within a time limit. Questionnaires were also administered at the end of the search in each condition and at the end of all experiments. The questionnaire results suggested points for improvement. Finally, we discuss the image flicker observed during the experiments.
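The abstract does not specify how the single-class and 80-class display conditions were implemented. As a minimal sketch only, assuming the ultralytics YOLO package and a pretrained COCO model (both our assumptions, not the authors' stated setup), the code below filters detections to one target class or keeps all 80 classes before drawing bounding boxes on a camera frame; the model file and target class ID are hypothetical.

```python
# Minimal sketch (not the authors' implementation) of the two display
# conditions: single-class vs. all-class bounding-box overlay.
import cv2
from ultralytics import YOLO

model = YOLO("yolov8n.pt")   # assumed pretrained COCO model (80 classes)
TARGET_CLASS = 39            # hypothetical target class, e.g. "bottle" in COCO

def annotate(frame, single_class=True):
    """Draw bounding boxes on a camera frame.

    single_class=True mimics the 'one object class' condition;
    single_class=False mimics the 'all 80 classes' condition.
    """
    results = model(frame, verbose=False)[0]
    for box in results.boxes:
        cls = int(box.cls[0])
        if single_class and cls != TARGET_CLASS:
            continue  # suppress every class except the target
        x1, y1, x2, y2 = map(int, box.xyxy[0])
        label = f"{model.names[cls]} {float(box.conf[0]):.2f}"
        cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
        cv2.putText(frame, label, (x1, y1 - 5),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 1)
    return frame
```

In a setup like this, the annotated frame would then be streamed to the HMD in place of the raw camera image.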
