Abstract

Smart voice assistants (VAs) are now widely deployed to provide voice-command control services in IoT systems such as smart homes and industrial IoT. However, because of the complexity of deployment environments and the diversity of voice commands, a growing number of attacks against VAs are causing severe security problems. Since voice development platforms allow third-party voice skills, adversaries can obtain users' private information through squatting attacks that register skills under confusable names. Existing work studied the exploitability of semantic misinterpretation in VA systems for phonetic languages such as English. However, owing to the semantic and structural differences between phonetic English and symbol-based Asian languages such as Chinese, the linguistic-model-guided fuzzing tool proposed in previous work is insufficient for semantic analysis of VAs in Asian languages. In this article, we conduct a systematic analysis of the feasibility of voice misinterpretation attacks against typical Asian-language VAs through semantic fuzzing. We develop Harmony-Fuzzer, a semantic fuzzing tool whose fuzzing process is guided by rules abstracted from speech errors, disfluencies, and semantically similar expressions observed in a Chinese corpus. We use Bayesian networks to formulate the fuzzing models statistically, so that the fuzzing space can be controlled through the probability of each fuzzing operation. We use the generated inputs to test VAs and design malicious skills to empirically verify the feasibility of squatting attacks. We find that squatting attacks on Chinese VAs are feasible when attackers carefully exploit certain linguistic phenomena.
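To make the probability-controlled fuzzing idea concrete, the following is a minimal, hypothetical sketch (not the authors' Harmony-Fuzzer implementation): each mutation rule, standing in for a node of the paper's Bayesian-network model, fires with a tunable probability, so raising or lowering these probabilities widens or narrows the space of generated skill-name variants. The rule names, substitution tables, and example phrase are all illustrative assumptions.

```python
import random

# Hypothetical firing probabilities for three fuzzing rules; in the paper
# these would be derived from a Bayesian network fit to a Chinese corpus.
RULES = {
    "homophone": 0.5,   # swap a word for a same-sounding one
    "synonym": 0.4,     # replace a word with a semantically similar one
    "disfluency": 0.3,  # insert a filler word after a token
}

# Toy substitution tables (illustrative only; a real tool would mine a corpus).
HOMOPHONES = {"light": "lite"}
SYNONYMS = {"open": "start"}

def fuzz_phrase(phrase, rng=None):
    """Generate one probabilistically mutated variant of a voice command."""
    rng = rng or random.Random()
    out = []
    for word in phrase.split():
        if word in HOMOPHONES and rng.random() < RULES["homophone"]:
            word = HOMOPHONES[word]
        elif word in SYNONYMS and rng.random() < RULES["synonym"]:
            word = SYNONYMS[word]
        out.append(word)
        if rng.random() < RULES["disfluency"]:
            out.append("uh")  # filler token modelling a disfluency
    return " ".join(out)

# Sampling many variants of one command explores the fuzzing space.
variants = {fuzz_phrase("open the light", random.Random(seed)) for seed in range(20)}
```

Lowering every probability in `RULES` toward zero collapses the variant set back to the original phrase, which is the sense in which the fuzzing space is "controlled by the probability of each fuzzing operation."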
