Physical locks are one of the most prevalent mechanisms for securing objects such as doors. Although many of these locks are vulnerable to lock-picking, they remain widely used because lock-picking requires specific training and tailored instruments, and it easily raises suspicion. To overcome the limitations of lock-picking, we propose a novel attack vector that leverages an audio recording of key insertion to infer the shape of the victim’s key, namely its bittings (or cut depths), which form the secret of a key. In particular, we show that computing the timing intervals between the audible clicks that occur during key insertion enables inference of the bitting information, i.e., the shape of the physical key. Such an audio-based attack has several advantages. First, unlike lock-picking, it minimizes the attacker’s physical access to the lock, reducing the risk of being apprehended. Second, because the attack requires only a microphone (e.g., a smartphone microphone), it significantly lowers the expertise required of the attacker. Despite these advantages, extracting the required key-related signal from the audio poses several challenges. In this talk, we discuss how we overcome these challenges and present initial results demonstrating the feasibility of audio-based key inference. This talk is based on two conference papers on the same topic, submitted to ACM HotMobile 2020 and USENIX Security 2021.
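As a rough illustration of the timing-interval idea, the following minimal sketch computes the gaps between click timestamps and converts them to distances along the key. It is not the method from the papers: it assumes the click times have already been detected in the recording, and it assumes a constant insertion speed, which the abstract does not state. The function names and the speed parameter are hypothetical.

```python
from typing import List


def click_intervals(click_times_s: List[float]) -> List[float]:
    """Return the time gaps (seconds) between consecutive click timestamps."""
    times = sorted(click_times_s)
    return [later - earlier for earlier, later in zip(times, times[1:])]


def intervals_to_distances(
    intervals_s: List[float], insertion_speed_mm_per_s: float
) -> List[float]:
    """Convert time gaps to distances along the key (mm).

    ASSUMPTION: the key moves at a constant speed during insertion.
    This simplification is ours, for illustration only.
    """
    return [dt * insertion_speed_mm_per_s for dt in intervals_s]


if __name__ == "__main__":
    # Hypothetical click timestamps in seconds, not measured data.
    clicks = [0.0, 0.5, 1.5]
    gaps = click_intervals(clicks)
    print(gaps)
    print(intervals_to_distances(gaps, insertion_speed_mm_per_s=2.0))
```

Under these assumptions, the resulting distances would be one input to narrowing down candidate bittings; detecting the clicks reliably in real audio is among the challenges the talk addresses.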