Voice Liveness Detection for Voice Assistants Through Ear Canal Pressure Monitoring

Jiacheng Shang, Jie Wu

doi:10.1109/tnse.2021.3138699

Abstract

Voice assistants are important input devices in future smart homes. Thanks to the strong performance of current voice recognition systems, voice assistants can understand a variety of commands in different languages and connect with other devices on the same local network to perform the corresponding actions. However, voice is inherently insecure as an input channel. Even if a voice assistant is secured with a voiceprint, attackers can still steal the victim's voice and replay it to the voice assistant to mount an attack. In this paper, we propose a new voice liveness detection system designed specifically for voice assistants. The key insight behind our system is that users open their mouths when they pronounce certain phonemes. These mouth-opening movements change the air pressure in the ear canal when the ear canal is an enclosed space. Therefore, we can detect the liveness of a voice on the voice assistant side by characterizing the correlation between each spoken sentence and the ear canal air pressure. Experiments with ten volunteers show that our system accepts voice commands from legitimate users with accuracies of 94.8% and 97%. Moreover, our system defends current voice assistant devices against replay attacks with accuracies of 99.25% and 99.5%.
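
The correlation idea in the abstract can be illustrated with a minimal sketch. This is not the authors' method: it assumes that a per-frame "mouth activity" signal has already been derived from the received sentence and that an ear canal pressure trace has been sampled at the same frames. The names `pearson` and `is_live`, the 0.5 threshold, and the synthetic signals are all hypothetical, and the sign of the pressure change is arbitrary here, so the check uses the absolute correlation.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("sequences must have equal length >= 2")
    mx = sum(x) / n
    my = sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a flat signal gives no evidence of correlation
    return cov / math.sqrt(vx * vy)


def is_live(mouth_activity, ear_pressure, threshold=0.5):
    """Hypothetical rule: accept when |correlation| reaches the threshold.

    The threshold value is an assumption for illustration only.
    """
    return abs(pearson(mouth_activity, ear_pressure)) >= threshold


# Synthetic demo data (assumed, not measured).
rng = random.Random(0)
# Mouth opens and closes periodically while the user speaks.
mouth = [max(0.0, math.sin(2 * math.pi * t / 50)) for t in range(200)]
# Live user: ear canal pressure follows mouth movement, plus sensor noise.
live_pressure = [-0.8 * m + rng.gauss(0, 0.05) for m in mouth]
# Replay attack: the user is silent, so the pressure trace is only noise.
replay_pressure = [rng.gauss(0, 0.05) for _ in mouth]

print("live accepted:", is_live(mouth, live_pressure))
print("replay accepted:", is_live(mouth, replay_pressure))
```

In this toy setting the live trace tracks the mouth signal closely while the replay trace does not, which is the kind of separation a correlation-based check relies on. A real system would need features and thresholds tuned on measured data.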

Full Text