Understanding Offline Password-Cracking Methods: A Large-Scale Empirical Study

Ruixin Shi,Yong Li,Yongbin Zhou,Weili Han

doi:10.1155/2021/5563884

Abstract

Researchers proposed several data-driven methods to efficiently guess user-chosen passwords for password strength metering or password recovery in the past decades. However, these methods are usually evaluated under ad hoc scenarios with limited data sets. Thus, this motivates us to conduct a systematic and comparative investigation with a very large-scale data corpus for such state-of-the-art cracking methods. In this paper, we present the large-scale empirical study on password-cracking methods proposed by the academic community since 2005, leveraging about 220 million plaintext passwords leaked from 12 popular websites during the past decade. Specifically, we conduct our empirical evaluation in two cracking scenarios, i.e., cracking under extensive-knowledge and limited-knowledge. The evaluation concludes that no cracking method may outperform others from all aspects in these offline scenarios. The actual cracking performance is determined by multiple factors, including the underlying model principle along with dataset attributes such as length and structure characteristics. Then, we perform further evaluation by analyzing the set of cracked passwords in each targeting dataset. We get some interesting observations that make sense of many cracking behaviors and come up with some suggestions on how to choose a more effective password-cracking method under these two offline cracking scenarios.

Full Text