Abstract

Voice-based authentication is prevalent on smart devices to verify the legitimacy of users, but it is vulnerable to replay attacks. In this paper, we propose to leverage the distinctive chest motions during speaking to establish a secure multi-factor authentication system, named ChestLive. Compared with other biometric-based authentication systems, ChestLive does not require users to remember any complicated information (e.g., hand gestures, doodles), and its working distance is much longer (30 cm). We use acoustic sensing to monitor chest motion with the built-in speaker and microphone of a smartphone. To obtain fine-grained chest motion signals during speaking for reliable user authentication, we derive the Channel Energy (CE) of the acoustic signals to capture chest movement, and then remove static and non-static interference from the aggregated CE signals. Representative features are extracted from the correlation between the voice signal and the corresponding chest motion signal. Unlike learning-based image or speech recognition models with millions of available training samples, our system has to deal with a limited number of samples from legitimate users during enrollment. To address this problem, we resort to meta-learning, which initializes a general model with good generalization properties that can be quickly fine-tuned to identify a new user. We implement ChestLive as an application and evaluate its performance in the wild with 61 volunteers using their own smartphones. Experimental results show that ChestLive achieves an authentication accuracy of 98.31% and a false accept rate below 2% against replay and impersonation attacks. We also validate that ChestLive is robust to various factors, including training set size, distance, angle, posture, phone model, and environmental noise.
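
The sketch below is not the authors' implementation; it only illustrates, under assumed parameters (frame length, filter cutoff, lag window), the kind of pipeline the abstract describes: computing a per-frame Channel Energy (CE) signal from the received acoustic samples, suppressing the slowly varying (static) component, and correlating the result with the speech energy envelope to form candidate features. All function names and constants here are hypothetical.

```python
# Minimal sketch, assuming 48 kHz sampling and 10 ms frames; not the paper's exact method.
import numpy as np
from scipy.signal import butter, filtfilt, correlate

def channel_energy(received, frame_len=480):
    """Per-frame energy of the received acoustic signal (a stand-in for the paper's CE)."""
    n_frames = len(received) // frame_len
    frames = received[:n_frames * frame_len].reshape(n_frames, frame_len)
    return np.sum(frames ** 2, axis=1)

def remove_static_interference(ce):
    """Remove the slowly varying component with a high-pass filter (assumed cutoff)."""
    b, a = butter(2, 0.05, btype="highpass")  # normalized cutoff is an assumption
    return filtfilt(b, a, ce)

def voice_envelope(voice, frame_len=480):
    """Frame-level energy envelope of the recorded voice, aligned with the CE frames."""
    n_frames = len(voice) // frame_len
    frames = voice[:n_frames * frame_len].reshape(n_frames, frame_len)
    return np.sum(frames ** 2, axis=1)

def correlation_features(ce, env, max_lag=20):
    """Normalized cross-correlation between chest-motion (CE) and voice envelope
    over a small lag window, used here as a placeholder feature vector."""
    ce = (ce - ce.mean()) / (ce.std() + 1e-8)
    env = (env - env.mean()) / (env.std() + 1e-8)
    full = correlate(ce, env, mode="full") / len(ce)
    center = len(full) // 2
    return full[center - max_lag: center + max_lag + 1]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    received = rng.standard_normal(48000)  # 1 s of received acoustic samples (synthetic)
    voice = rng.standard_normal(48000)     # 1 s of recorded speech (synthetic)
    ce = remove_static_interference(channel_energy(received))
    feats = correlation_features(ce, voice_envelope(voice))
    print(feats.shape)  # (41,)
```

In the actual system, such features would feed a classifier whose initialization comes from meta-learning, so that a few enrollment samples suffice to fine-tune it for a new user; that training step is not shown here.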
