I Spy You

Shijia Zhang, Mahanth Gowda, Yilin Liu

doi:10.1145/3569486

Abstract

This paper presents iSpyU, a system that demonstrates the feasibility of recognizing natural speech content played on a phone during conference calls (Skype, Zoom, etc.) using a fusion of motion sensors such as the accelerometer and gyroscope. While microphones require user permission before an app can access them, motion sensors are zero-permission sensors and are therefore accessible to a developer without alerting the user. This allows a malicious app to potentially eavesdrop on sensitive speech content played by the user's phone. In designing the attack, iSpyU tackles a number of technical challenges, including (i) the low sampling rate of motion sensors (500 Hz, compared with 44 kHz for a microphone) and (ii) the lack of large-scale training datasets for training Automatic Speech Recognition (ASR) models with motion sensors. iSpyU systematically addresses these challenges through a combination of techniques in synthetic training data generation, ASR modeling, and domain adaptation. Extensive measurement studies on modern smartphones show a word-level accuracy of 53.3% to 59.9% over a dictionary of 2000 to 10000 words, and a character-level accuracy of 70.0% to 74.8%. We believe such levels of accuracy pose a significant threat when viewed from a privacy perspective.
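To make the sampling-rate challenge concrete, the sketch below works through the arithmetic implied by the two rates quoted in the abstract (500 Hz and 44 kHz). It is an illustration only, not part of iSpyU: the function names are hypothetical, and it assumes ideal uniform sampling of a pure tone with no anti-aliasing filter, which the abstract does not specify.

```python
# Illustrative arithmetic only; not the authors' method.
# Assumption: ideal uniform sampling of a pure tone, no anti-aliasing filter.


def nyquist_hz(sample_rate_hz: float) -> float:
    """Highest frequency that can be represented without aliasing."""
    return sample_rate_hz / 2.0


def aliased_frequency_hz(tone_hz: float, sample_rate_hz: float) -> float:
    """Apparent frequency of a pure tone after sampling at sample_rate_hz.

    Tones above the Nyquist limit fold back into [0, sample_rate_hz / 2].
    """
    folded = tone_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)


if __name__ == "__main__":
    motion_rate_hz = 500.0   # motion sensor rate quoted in the abstract
    mic_rate_hz = 44_000.0   # microphone rate quoted in the abstract

    print("Motion sensor Nyquist limit (Hz):", nyquist_hz(motion_rate_hz))
    print("Microphone Nyquist limit (Hz):", nyquist_hz(mic_rate_hz))
    print("Rate ratio (mic / motion):", mic_rate_hz / motion_rate_hz)

    # Hypothetical example tones, chosen only to show folding behaviour.
    for tone_hz in (120.0, 300.0, 1000.0):
        apparent = aliased_frequency_hz(tone_hz, motion_rate_hz)
        print(f"{tone_hz} Hz tone appears at {apparent} Hz when sampled at 500 Hz")
```

Under these assumptions, a 500 Hz sensor can represent content only up to 250 Hz directly, so most of the speech band is either lost or folded onto lower frequencies. This is one way to read why the abstract lists the sampling rate as a central challenge.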

Journal: Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies
Publication Date: Dec 21, 2022
Citations: 4
License type: public-domain

Similar Papers

ETEH: Unified Attention-Based End-to-End ASR and KWS Architecture
Gaofeng Cheng ... Haoran Miao
IEEE/ACM Transactions on Audio, Speech, and Language Processing | VOL. 30
01 Jan 2021

OkwuGbé: End-to-End Speech Recognition for Fon and Igbo
21 Oct 2021

Recognition of target domain Japanese speech using language model replacement
Daiki Mori ... Norihide Kitaoka
Eurasip Journal on Audio, Speech, and Music Processing | VOL. 2024
20 Jul 2024

Development and comparison of ASR models using Kaldi for noisy and enhanced Kannada speech data
G Thimmaraja Yadava ... H S Jayanna
01 Sep 2017
