Abstract

Over the past decades, researchers have proposed several data-driven methods to efficiently guess user-chosen passwords for password strength metering or password recovery. However, these methods are usually evaluated under ad hoc scenarios with limited data sets. This motivates us to conduct a systematic and comparative investigation of such state-of-the-art cracking methods with a very large-scale data corpus. In this paper, we present a large-scale empirical study on password-cracking methods proposed by the academic community since 2005, leveraging about 220 million plaintext passwords leaked from 12 popular websites during the past decade. Specifically, we conduct our empirical evaluation in two cracking scenarios, i.e., cracking under extensive knowledge and cracking under limited knowledge. The evaluation concludes that no cracking method outperforms the others in all aspects in these offline scenarios. The actual cracking performance is determined by multiple factors, including the underlying model principle along with dataset attributes such as length and structure characteristics. We then perform further evaluation by analyzing the set of cracked passwords in each target dataset. We obtain some interesting observations that explain many cracking behaviors, and we offer suggestions on how to choose a more effective password-cracking method under these two offline cracking scenarios.

Highlights

  • Because of some irreplaceable advantages, such as low technical requirements and wide usage, textual passwords are likely to remain the most common authentication method for the near future [1]

  • We conduct a large-scale empirical study on password-cracking methods, including the latest one based on the neural network, leveraging about 220 million plaintext passwords leaked from 12 popular websites during the past decade

  • Limited-knowledge here means the attacker knows only the name, region, or language of the target. The attacker does not know the exact distribution of the target hashed passwords and can only use passwords from a different source. This represents a situation in which the attacker wants to crack hashed passwords that have not previously been recovered as plaintext on this website

Summary

Introduction

Because of some irreplaceable advantages, such as low technical requirements and wide usage, textual passwords are likely to remain the most common authentication method for the near future [1]. Strong passwords are hard to remember, so it is not surprising that users often create easy-to-guess passwords for convenience, which puts password-based authentication systems at high risk [2, 3]. Among the various attacks, offline cracking poses a serious threat that cannot be ignored [4]. Frequent password leakage incidents [5, 6, 7] exacerbate the security risk posed by this attack. It is therefore essential for password-based authentication systems to properly evaluate their resilience to offline cracking.

