Abstract

In this study, we present a transfer learning method for gesture classification, via inductive and supervised transductive approaches, on an electromyographic (EMG) dataset gathered with the Myo armband. A ternary gesture classification problem, comprising the states ‘thumbs up’, ‘thumbs down’, and ‘relax’, is presented as a means of communicating affirmation or negation to a machine in a non-verbal fashion. Of the nine statistical learning paradigms benchmarked over 10-fold cross-validation (with three methods of feature selection), an ensemble of a Random Forest and a Support Vector Machine combined by voting achieves the best score of 91.74%, using a rule-based feature selection method. When new subjects are considered, this machine learning approach fails to generalise to their data, and thus the processes of Inductive and Supervised Transductive Transfer Learning are introduced alongside a short calibration exercise (15 s). In the generalisation experiment, 5 s of data per class proves strongest for classification (compared with one through seven seconds), yet reaches only 55% accuracy on unseen subjects; when a short 5 s per-class calibration task is introduced via the suggested transfer method, a Random Forest can then classify unseen data from the calibrated subject at around 97% accuracy, outperforming the 83% accuracy of the proprietary Myo system. Finally, a preliminary application is presented through social interaction with a humanoid Pepper robot, where our approach combined with a most-common-class metaclassifier achieves 100% accuracy across all trials of a ‘20 Questions’ game.
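
The benchmarking step described above can be illustrated with a minimal scikit-learn sketch of the Random Forest and Support Vector Machine voting ensemble evaluated over 10-fold cross-validation. The placeholder feature matrix, the number of features, and the choice of soft voting are illustrative assumptions; the paper's actual feature extraction and rule-based feature selection are not reproduced here.

```python
# Minimal sketch of the voting ensemble (Random Forest + SVM) evaluated
# with 10-fold cross-validation. The feature matrix is a random placeholder
# standing in for windowed statistical features from the Myo's EMG channels;
# the rule-based feature selection step is omitted.
import numpy as np
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(600, 40))       # placeholder EMG feature vectors
y = rng.integers(0, 3, size=600)     # 0: relax, 1: thumbs up, 2: thumbs down

ensemble = VotingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
        ("svm", make_pipeline(StandardScaler(), SVC(probability=True, random_state=0))),
    ],
    voting="soft",                   # combine class probabilities (assumption)
)

scores = cross_val_score(ensemble, X, y, cv=10)
print(f"10-fold CV accuracy: {scores.mean():.4f} +/- {scores.std():.4f}")
```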

Highlights

  • Within a social context, the current state of Human-Robot Interaction is arguably most often concerned with the domain of verbal, spoken communication

  • This study showed that a physical task (‘pick up the cube’) could be completed, on average, in less time than with joystick hardware, and that the transfer learning process allowed for 97.81% classification accuracy of the EMG data produced by the movements of 17 individual subjects (a sketch of the calibration step follows this list)

  • The chosen machine learning techniques are benchmarked in order to select the most promising method for the problem presented in this study
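
The highlighted transfer step can be sketched as follows: a short labelled calibration set from the new subject is pooled with the existing multi-subject training data before the Random Forest is refit, and a most-common-class metaclassifier takes the modal prediction over a run of windows. The helper names (calibrate_and_fit, most_common_class), array shapes, and calibration window counts are illustrative assumptions rather than the authors' exact implementation.

```python
# Minimal sketch of the calibration-based transfer step: 5 s per class of
# labelled EMG feature windows from the new subject are pooled with the
# existing multi-subject training data before a Random Forest is refit.
# Shapes and window counts are placeholders, not the paper's exact values.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def calibrate_and_fit(X_base, y_base, X_calib, y_calib):
    """Refit a Random Forest on pooled base + calibration data."""
    X = np.vstack([X_base, X_calib])
    y = np.concatenate([y_base, y_calib])
    model = RandomForestClassifier(n_estimators=100, random_state=0)
    return model.fit(X, y)

def most_common_class(model, X_windows):
    """Metaclassifier: take the modal prediction over a run of windows."""
    preds = model.predict(X_windows)
    values, counts = np.unique(preds, return_counts=True)
    return values[np.argmax(counts)]

# Placeholder data: base set from previously seen subjects, plus a short
# calibration set (a few windows per class) from the new subject.
rng = np.random.default_rng(1)
X_base, y_base = rng.normal(size=(600, 40)), rng.integers(0, 3, size=600)
X_calib, y_calib = rng.normal(size=(30, 40)), np.repeat([0, 1, 2], 10)

model = calibrate_and_fit(X_base, y_base, X_calib, y_calib)
print(most_common_class(model, rng.normal(size=(10, 40))))
```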


Introduction

The current state of Human-Robot Interaction is arguably most often concerned with the domain of verbal, spoken communication: spoken language is transcribed to text and then processed with Natural Language Processing (NLP) to extract meaning, and this framework is often combined multi-modally with other data, such as tone of voice, which also carries useful information. Spoken communication is not available to everyone, however; a recent National GP Survey carried out in the United Kingdom found that 125,000 adults and 20,000 children were able to converse in British Sign Language (BSL) (Ipsos 2016), and of those surveyed, 15,000 reported it as their primary language. A related study, performing classification with a Convolutional Neural Network (CNN), successfully classified nine physical movements from nine subjects at a mean accuracy of 94.18% (Mendez et al. 2017); it must be noted that the model in that work was not tested for generalisation ability. Generalisation proved important in this study, since the strongest method for classifying the dataset was weaker than another model when it came to transferring ability to unseen data.
