Effect of Kinematics and Fluency in Adversarial Synthetic Data Generation for ASL Recognition With RF Sensors

Ali Cafer Gurbuz,Evie A. Malaia,Sevgi Zubeyde Gurbuz,Chris Crawford,Mohammad Mahbubur Rahman,Darrin J. Griffin

doi:10.1109/taes.2021.3139848

Abstract

RF sensors have been recently proposed as a new modality for sign language processing technology. They are non-contact, effective in the dark, and acquire a direct measurement of signing kinematic via exploitation of the micro-Doppler effect. First, this work provides an in depth, comparative examination of the kinematic properties of signing as measured by RF sensors for both fluent ASL users and hearing imitation signers. Second, as ASL recognition techniques utilizing deep learning requires a large amount of training data, this work examines the effect of signing kinematics and subject fluency on adversarial learning techniques for data synthesis. Two different approaches for the synthetic training data generation are proposed: 1) adversarial domain adaptation to minimize the differences between imitation signing and fluent signing data, and 2) kinematically-constrained generative adversarial networks for accurate synthesis of RF signing signatures. The results show that the kinematic discrepancies between imitation signing and fluent signing are so significant that training on data directly synthesized from fluent RF signers offers greater performance (93% top-5 accuracy) than that produced by adaptation of imitation signing (88% top-5 accuracy) when classifying 100 ASL signs.

Full Text