Abstract

Textual passwords are widely used for user authentication across a variety of applications. Passwords that are easy to remember are also easy to guess, while long, complex passwords that provide strong security are difficult to remember. Moreover, there has been limited quantitative research into the factors that make passwords strong. In this research, we aim to expand our understanding of passwords through the lens of data-driven analysis by characterizing a large number of passwords under four different hypotheses. In particular, we use tensor decomposition, which is effective for analyzing unlabeled, high-dimensional data. We first obtain 362,805 passwords from four different leaked password datasets. Next, we generate syntactic and semantic features for each password and classify each password into one of three strength groups using a statistical guessing attack model. Finally, we construct a 3rd-order password tensor and decompose it using the PARAFAC2 algorithm to examine the main characteristics that make passwords strong. We also apply an orthogonality constraint to the component matrix to mitigate the uniqueness problem. To select the optimal rank and constraint, we compare three types of constraints in terms of computational time, reconstruction ratio, and core consistency (CORCONDIA) score. Through various statistical and tensor decomposition analyses, we identify the dominant factors that influence password strength. In addition, we extend our tensor decomposition-based model to strength retrieval for cases where a new password needs to be evaluated. This strength retrieval model can quickly estimate the strength of a new password and provide recommendations for strengthening it. We hope that our model, grounded in a data science perspective, can validate widely accepted password composition policies and suggestion methods, and further provide insights for designing better password suggestion systems and password composition policies.
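As a rough illustration of the feature-generation step, the sketch below computes a few simple syntactic features per password and groups the resulting rows by strength label, giving one passwords-by-features matrix per strength group. This is only a minimal sketch under stated assumptions: the five features, the function names `syntactic_features` and `build_slices`, and the sample strength labels are all hypothetical and are not taken from the paper, which derives its labels from a statistical guessing attack model and also uses semantic features.

```python
from collections import defaultdict

# Hypothetical feature set; the paper's actual features are not listed in the abstract.
FEATURE_ORDER = ["length", "lower", "upper", "digit", "symbol"]


def syntactic_features(password):
    """Return simple character-class counts for one password."""
    return {
        "length": len(password),
        "lower": sum(c.islower() for c in password),
        "upper": sum(c.isupper() for c in password),
        "digit": sum(c.isdigit() for c in password),
        # Anything that is neither a letter nor a digit is counted as a symbol.
        "symbol": sum(not c.isalnum() for c in password),
    }


def build_slices(labeled_passwords):
    """Group feature rows by strength label.

    Returns a dict mapping each label to a list of feature rows
    (one row per password, columns in FEATURE_ORDER).
    """
    slices = defaultdict(list)
    for password, strength in labeled_passwords:
        feats = syntactic_features(password)
        slices[strength].append([feats[name] for name in FEATURE_ORDER])
    return dict(slices)


# Assumed labels for illustration only; they do not come from a guessing attack model.
sample = [
    ("password", "weak"),
    ("Summer2019", "medium"),
    ("g7#Qz!p0Lw", "strong"),
]
slices = build_slices(sample)
```

Each group can hold a different number of passwords, so the slices share the feature dimension but not the row count. PARAFAC2 accommodates slices that differ in size along one mode, which is one plausible reason to organize the data this way.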
