Abstract

Voice authentication is increasingly adopted for sensitive operations on mobile devices. Although voice biometrics can distinguish individuals by their spectral features (such as voiceprints), they are prone to spoofing attacks, in which an attacker replays pre-recorded or synthesized samples of a legitimate user, or impersonates the target user’s speaking style, to deceive the voice authentication system. In this paper, we design and implement a novel software-only anti-spoofing system for smartphones. Our system leverages the pop noise, which is generated by the user’s oral airflow when speaking the passphrase close to the microphone. The pop noise is delicate and varies across users, making it hard for replay attackers to record beyond a certain distance and hard for impersonators to imitate precisely. Specifically, we design a new pop noise detection scheme that pinpoints pop noises at the phonemic level. Based on this scheme, we establish a theoretical model that calculates the sound pressure level from the speech signal to obtain an estimated pressure signal, and then analyze its consistency with the actual pressure signal extracted from the pop noise. Furthermore, to resist spoofing attacks, we calculate a similarity score over sequences that describe the individually unique relationship between pop noises and phonemes. Our evaluation on a dataset of 30 participants and three smartphones shows that our system achieves over 94.79% accuracy. Our system requires no additional hardware and is robust to various factors, including authentication angle, authentication distance, passphrase length, and ambient noise.
