Voice-over-IP (VoIP) software are among the most widely spread and pervasive software, counting millions of monthly users. However, we argue that people ignore the drawbacks of transmitting information along with their voice, such as keystroke sounds—as such sound can reveal what someone is typing on a keyboard. In this article, we present and assess a new keyboard acoustic eavesdropping attack that involves VoIP, called Skype & Type ( S&T ). Unlike previous attacks, S&T assumes a weak adversary model that is very practical in many real-world settings. Indeed, S&T is very feasible, as it does not require (i) the attacker to be physically close to the victim (either in person or with a recording device) and (ii) precise profiling of the victim’s typing style and keyboard; moreover, it can work with a very small amount of leaked keystrokes. We observe that leakage of keystrokes during a VoIP call is likely, as people often “multi-task” during such calls. As expected, VoIP software acquires and faithfully transmits all sounds, including emanations of pressed keystrokes, which can include passwords and other sensitive information. We show that one very popular VoIP software (Skype) conveys enough audio information to reconstruct the victim’s input—keystrokes typed on the remote keyboard. Our results demonstrate that, given some knowledge on the victim’s typing style and keyboard model, the attacker attains top-5 accuracy of 91.7% in guessing a random key pressed by the victim. This work extends previous results on S&T , demonstrating that our attack is effective with many different recording devices (such as laptop microphones, headset microphones, and smartphones located in proximity of the target keyboard), diverse typing styles and speed, and is particularly threatening when the victim is typing in a known language.
Read full abstract