Abstract
Modern smartphones are ubiquitous and equipped with a wide range of sensors, which poses a potential security risk: malicious actors could use these sensors to detect private information, such as the keystrokes a user enters on a nearby keyboard. Existing studies have examined the ability of phones to predict typing on a nearby keyboard, but they are limited by the realism of the collected typing data and the expressiveness of the prediction models employed, and they are typically conducted in relatively noise-free environments. We investigate the capability of mobile phone sensor arrays (using audio and motion sensor data) to classify keystrokes that occur on a keyboard in proximity to phones around a table, as would be common in a meeting. We develop a system of mixed convolutional and recurrent neural networks and deploy it in a human subjects experiment with 20 users typing naturally while talking. Using leave-one-user-out cross-validation, we find that mobile phone arrays can correctly detect 41.8% of keystrokes and 27% of typed words in such a noisy environment, even without user-specific training. To investigate the potential threat of this attack, we further developed the machine learning models into a real-time system capable of discerning keystrokes from an array of mobile phones and evaluated the system's ability with a single user typing under varying conditions. We conclude that, in order to launch a successful attack, the attacker would need advance knowledge of the table at which a user types and the style of keyboard on which a user types. These constraints greatly limit the feasibility of such an attack to highly capable attackers, and we therefore conclude that the threat level of this attack is low, but non-zero.
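The leave-one-user-out evaluation protocol mentioned above can be sketched as follows. The data layout and the toy dataset here are hypothetical illustrations, not the paper's implementation; the point is that each fold tests on one user whose data never appears in training, which is why the reported accuracy holds without user-specific training.

```python
def leave_one_user_out(samples_by_user):
    """Yield (held_out_user, train, test) splits, holding out one user per fold.

    samples_by_user: dict mapping a user id to that user's list of samples.
    """
    users = sorted(samples_by_user)
    for held_out in users:
        train = [s for u in users if u != held_out for s in samples_by_user[u]]
        test = list(samples_by_user[held_out])
        yield held_out, train, test


# Toy example: three users with a few (hypothetical) keystroke samples each.
data = {"u1": [1, 2], "u2": [3], "u3": [4, 5]}
folds = list(leave_one_user_out(data))
# One fold per user; no test sample ever appears in its own training set.
```

In the paper's setting, each "sample" would be a window of audio and motion sensor data labeled with the keystroke that produced it, and per-fold keystroke accuracy would be averaged across the 20 folds.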