Abstract

Millions of smart home speakers, such as Amazon Echo and Google Home, have been purchased by U.S. consumers. However, the security and privacy of smart home speakers have not been rigorously examined, which raises critical security and privacy concerns. In this paper, we investigate a severe and previously unexamined privacy leakage of smart home speakers. Specifically, we examine a new passive attack, referred to as a voice command fingerprinting attack, on smart home speakers. We demonstrate that a passive attacker, who can only eavesdrop on the encrypted traffic between a smart home speaker and a cloud server, can infer users' voice commands and thereby compromise the privacy of millions of U.S. consumers. We formulate the attack as a classification problem and carry it out with machine learning algorithms. In addition to classification accuracy, we propose a new privacy metric, named semantic distance, which leverages natural language processing to assess privacy leakage. Our experimental results on a real-world dataset suggest that voice command fingerprinting attacks can correctly infer 33.8% of voice commands by eavesdropping on encrypted traffic. Our results also show that existing padding methods can reduce an attacker's accuracy to 14.7%, but incur high communication overhead (548%) and long time delay (330%).
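The abstract only states the attack at a high level; the sketch below illustrates one plausible instantiation of such a traffic-analysis classifier, in the spirit of website fingerprinting. The synthetic trace generator, the feature set, and the random-forest model are all assumptions made here for illustration, not the method actually used in the paper. A real attacker would replace synth_trace with packet-size and direction sequences captured from the encrypted speaker-to-cloud sessions.

    # A minimal sketch of a voice command fingerprinting classifier.
    # ASSUMPTIONS: synthetic traces stand in for real encrypted captures;
    # the features and the random forest are illustrative choices only.
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)
    NUM_COMMANDS = 10    # hypothetical number of distinct voice commands
    TRACES_PER_CMD = 50  # hypothetical captures per command

    def synth_trace(cmd_id):
        """Fabricate a signed packet-length sequence for one encrypted session.
        Positive = speaker-to-cloud, negative = cloud-to-speaker. Each command
        class gets a slightly different size profile, mimicking how different
        commands leave different footprints on the wire."""
        n = 40 + 3 * cmd_id + rng.integers(-5, 6)
        sizes = rng.normal(500 + 40 * cmd_id, 120, n)
        dirs = rng.choice([1, -1], n, p=[0.4, 0.6])
        return sizes * dirs

    def features(trace):
        """Summarize a variable-length trace into a fixed-length vector:
        packet counts, byte totals, and size statistics per direction."""
        up, down = trace[trace > 0], -trace[trace < 0]
        stats = lambda x: [len(x), x.sum(), x.mean(), x.std()] if len(x) else [0, 0, 0, 0]
        return np.array(stats(up) + stats(down) + [len(trace)])

    X = np.array([features(synth_trace(c))
                  for c in range(NUM_COMMANDS) for _ in range(TRACES_PER_CMD)])
    y = np.repeat(np.arange(NUM_COMMANDS), TRACES_PER_CMD)

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,
                                              stratify=y, random_state=0)
    clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
    print(f"closed-world accuracy: {clf.score(X_te, y_te):.3f}")

A padding defense of the kind evaluated in the paper would flatten exactly these per-direction size statistics, which is consistent with the reported drop in attacker accuracy to 14.7% at the cost of 548% communication overhead.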
