Abstract

This paper presents a machine-vision-based approach that lets human operators select individual robots and groups of robots from a swarm of UAVs. The angular distance between the robots and the human is estimated from measurements of the detected human face, which aids in determining human and multi-UAV localization and positioning. In turn, this is exploited to let the human effectively and naturally select the spatially situated robots. The human operator performs spatial gestures for selecting robots using tangible input devices (i.e., colored gloves). To select individual robots and groups of robots, we formulate a vocabulary of two-handed spatial pointing gestures. Using a Support Vector Machine (SVM) trained in a cascaded multi-binary-class configuration, the spatial gestures are effectively learned and recognized by a swarm of UAVs.

I. INTRODUCTION

Without teleoperated and hand-held interaction devices, human operators generally face difficulties in selecting and commanding individual robots and groups of robots from a relatively large group of spatially distributed robots (i.e., a swarm). However, thanks to the widespread availability of cost-effective digital cameras onboard UGVs and UAVs, attention is increasingly turning towards the development of uninstrumented methods (i.e., methods that do not require sophisticated hardware devices on the human side) for human-swarm interaction (HSI). In previous work, we focused on learning efficient features incrementally (online) from multi-viewpoint images of multiple gestures acquired by a swarm of ground robots [1]. In this paper, we present a cascaded supervised machine learning approach to the machine vision problem of selecting 3D spatially-situated robots from a networked swarm based on the recognition of spatial hand gestures. These are a natural, easily recognizable, and device-less way to let human operators interact with external artifacts such as robots.
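To illustrate the cascaded multi-binary-class configuration mentioned above, the following is a minimal sketch in which one binary classifier per gesture class is queried in sequence and the first stage that fires determines the prediction. The class labels, the feature representation, and the stand-in decision functions (simple thresholds in place of trained binary SVMs) are all assumptions made for illustration, not the paper's actual models.

```python
# Sketch of a cascaded multi-binary-class gesture classifier. Each stage
# stands in for a binary SVM trained to separate one gesture class from
# the rest; here, a threshold on a 1-D feature plays that role.

def make_stage(threshold):
    """Hypothetical stand-in for a trained binary SVM's decision function."""
    return lambda feature: feature > threshold

# One binary classifier per gesture class, queried in a fixed order;
# the first stage that fires determines the predicted gesture.
cascade = [
    ("select_one", make_stage(0.8)),    # assumed gesture label
    ("select_group", make_stage(0.4)),  # assumed gesture label
]

def classify(feature):
    for label, stage in cascade:
        if stage(feature):
            return label
    return "none"  # rejected by every binary stage

print(classify(0.9))  # strong response: first stage fires -> "select_one"
print(classify(0.5))  # first stage rejects, second fires -> "select_group"
print(classify(0.1))  # all stages reject -> "none"
```

In practice each stage would be a real binary SVM over multi-dimensional gesture features, but the cascade logic (query stages in order, stop at the first acceptance, fall through to rejection) is the same.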
Inspired by natural human behavior, we propose an approach that combines face engagement and pointing gestures to interact with a swarm of robots: standing in front of a population of robots, by looking at them and pointing at them with spatial gestures, a human operator can designate individual robots or groups of robots of a determined size. The robots combine their independent observations of the human's face and gestures to cooperatively determine which robots were addressed (i.e., selected). While state-of-the-art computer vision techniques provide excellent face detection, human skeleton, and gesture recognition in ideal conditions, there are often occlusions,
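The cooperative step above, where robots merge independent observations into a single selection, could be realized in many ways; one simple possibility is a quorum vote over each robot's locally detected selection set. The voting rule below is an illustrative assumption, not the paper's exact fusion method.

```python
# Hedged sketch: fuse each robot's independent guess about which robots
# the operator addressed, keeping only IDs seen by at least `quorum`
# observers. Robot IDs and the quorum rule are illustrative assumptions.
from collections import Counter

def fuse_selections(observations, quorum):
    """observations: list of sets of robot IDs, one set per observing robot."""
    votes = Counter(rid for obs in observations for rid in obs)
    return {rid for rid, count in votes.items() if count >= quorum}

# Three robots each report which robots they believe were pointed at.
obs = [{1, 2}, {1, 2, 3}, {2}]
print(fuse_selections(obs, quorum=2))  # IDs confirmed by at least two robots
```

A quorum threshold makes the fused decision robust to a single robot's misdetection, which matters precisely because individual viewpoints suffer from the occlusions noted above.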
