Abstract

Hand gestures are a form of non-verbal communication that people use alongside speech. With the increasing use of technology, hand-gesture recognition has become an important aspect of Human-Machine Interaction (HMI), allowing the machine to capture and interpret the user's intent and to respond accordingly. The ability to discriminate between human gestures can help in several applications, such as assisted living, healthcare, neuro-rehabilitation, and sports. Recently, multi-sensor data fusion mechanisms have been investigated to improve discrimination accuracy. In this paper, we present a sensor fusion framework that integrates two complementary systems: the electromyography (EMG) signal from muscles and visual information. This multi-sensor approach improves accuracy and robustness but introduces a high computational cost, which grows exponentially with the number of sensors and the number of measurements. Furthermore, the large amount of data to process can affect the classification latency, which can be crucial in real-world scenarios such as prosthetic control. Neuromorphic technologies can be deployed to overcome these limitations since they allow real-time parallel processing at low power consumption. Specifically, we present a fully neuromorphic sensor fusion approach for hand-gesture recognition comprising an event-based vision sensor and three different neuromorphic processors: we used an event-based camera, the Dynamic Vision Sensor (DVS), and two neuromorphic platforms, Loihi and ODIN + MorphIC. The EMG signals were recorded using traditional electrodes and then converted into spikes to be fed into the chips. We collected a dataset of five sign-language gestures in which the visual and EMG signals are synchronized. We compared the fully neuromorphic approach to a baseline implemented using traditional machine learning approaches on a portable GPU system. Following each chip's constraints, we designed specific spiking neural networks (SNNs) for sensor fusion that showed classification accuracy comparable to the software baseline. These neuromorphic alternatives have a 20 to 40% longer inference time than the GPU system but a significantly smaller energy-delay product (EDP), which makes them between 30× and 600× more efficient. The proposed work represents a new benchmark that moves neuromorphic computing toward a real-world scenario.
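The trade-off stated above (longer inference time, yet a much smaller EDP) follows from the definition of the energy-delay product as energy per inference multiplied by inference time. The sketch below illustrates the arithmetic only. The function names and all numeric values are hypothetical placeholders chosen for illustration, not measurements from the paper.

```python
def energy_delay_product(energy_j: float, delay_s: float) -> float:
    """EDP = energy per inference (joules) x inference time (seconds)."""
    return energy_j * delay_s


def edp_improvement(base_energy_j: float, base_delay_s: float,
                    energy_j: float, delay_s: float) -> float:
    """How many times smaller the candidate EDP is than the baseline EDP."""
    baseline = energy_delay_product(base_energy_j, base_delay_s)
    candidate = energy_delay_product(energy_j, delay_s)
    return baseline / candidate


# Hypothetical values, for illustration only (NOT taken from the paper).
gpu_energy_j, gpu_delay_s = 30e-3, 5.0e-3      # assumed GPU baseline
chip_energy_j, chip_delay_s = 0.5e-3, 6.5e-3   # assumed chip, 30% slower

slowdown = chip_delay_s / gpu_delay_s - 1.0
ratio = edp_improvement(gpu_energy_j, gpu_delay_s, chip_energy_j, chip_delay_s)

print(f"inference time increase: {slowdown:.0%}")
print(f"EDP improvement: {ratio:.1f}x")
```

With these assumed numbers, a 30% slower device still wins on EDP because its energy per inference is far lower, which is the pattern the abstract reports.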

Highlights

  • Hand-gestures are considered a powerful communication channel for information transfer in daily life

  • While the input stays of the same size (16) with respect to the network implemented on Loihi, the input features are different since the baseline Multi-Layer Perceptron (MLP) receives Mean Absolute Value (MAV) and Root Mean Square (RMS) features while the Loihi receives spikes obtained from the raw signal

  • With the spiking MLP implemented on Loihi, we obtained an accuracy of 50.3 ± 1.5, 83.1 ± 3.4, and 83.4 ± 2.1% for the hand-gesture classification task using EMG, Dynamic Vision Sensor (DVS) and fusion, respectively
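The contrast drawn in the highlights between the two EMG input representations can be sketched as follows. The baseline MLP receives MAV and RMS features, while Loihi receives spikes derived from the raw signal. This is a minimal illustration under stated assumptions: the 8-channel layout (so that 8 × 2 = 16 inputs in both cases) and the delta-threshold spike encoder are assumptions, since this page does not specify the channel count or the encoding scheme, and all function names are hypothetical.

```python
import math


def mean_absolute_value(window):
    """MAV: average of the absolute sample values in one window."""
    return sum(abs(x) for x in window) / len(window)


def root_mean_square(window):
    """RMS: square root of the mean squared sample value in one window."""
    return math.sqrt(sum(x * x for x in window) / len(window))


def mav_rms_features(channels):
    """Build the baseline input: one MAV and one RMS value per channel.

    `channels` is a list of per-channel sample windows. With 8 channels
    (an assumption), the result has 16 entries.
    """
    mav = [mean_absolute_value(c) for c in channels]
    rms = [root_mean_square(c) for c in channels]
    return mav + rms


def delta_spike_encode(signal, threshold):
    """Assumed encoder: emit (time_index, +1 or -1) events whenever the
    signal moves by `threshold` away from the last reference level.

    With separate UP and DOWN spike trains per channel, 8 channels would
    again give 16 inputs (an assumption, not stated on this page).
    """
    events = []
    ref = signal[0]
    for t, x in enumerate(signal[1:], start=1):
        while x - ref >= threshold:   # signal rose by at least one step
            ref += threshold
            events.append((t, +1))
        while ref - x >= threshold:   # signal fell by at least one step
            ref -= threshold
            events.append((t, -1))
    return events


# Toy data: 8 identical channels, for illustration only.
toy_channels = [[1.0, -1.0, 1.0, -1.0] for _ in range(8)]
features = mav_rms_features(toy_channels)
events = delta_spike_encode([0.0, 0.5, 1.0, 0.0], threshold=0.5)
```

The feature vector summarizes each window with two numbers per channel, whereas the spike encoder keeps the timing of signal changes, which is why the two networks see different inputs despite the identical input size.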


Summary

INTRODUCTION

Hand-gestures are considered a powerful communication channel for information transfer in daily life. Using EMG or camera systems separately presents some limitations, but their fusion has several advantages: in particular, EMG-based classification can help in case of camera occlusion, whereas the vision classification provides an absolute measurement of hand state. This type of sensor fusion, which combines vision and proprioceptive information, is intensively used in biomedical applications, such as in the transradial prosthetic domain, to improve control performance (Markovic et al., 2014, 2015), or to recognize objects during grasping in order to adjust the movements (Došen et al., 2010). To validate the neuromorphic results, we compare them to a baseline consisting of the network implemented using a standard machine learning approach, where the inputs are fed as continuous EMG signals and video frames. We propose this comparison for a real-case scenario as a benchmark, in order for the neuromorphic research field to advance into mainstream computing (Davies, 2019).

MATERIALS AND METHODS
DVS and EMG Sensors
DVS-EMG Dataset
A Data collection setup
Neuromorphic Processors
Traditional Machine Learning Baselines
B DVS MLP
Loihi Results
EDP and Computational Complexity
DISCUSSIONS
DATA AVAILABILITY STATEMENT