Abstract

Honeywords are fictitious passwords inserted into password databases in order to detect password breaches. The major challenge is producing honeywords that are difficult to distinguish from real passwords. Although honeyword generation has been widely investigated, most existing research assumes that attackers have no knowledge of the users. These honeyword generating techniques (HGTs) may fail entirely if attackers exploit users’ personally identifiable information (PII) and the real passwords contain that PII. The literature has shown that password guessing is more effective when it focuses on each of the chunks that compose a password (e.g., “P@ssword123” contains two chunks: “P@ssword” and “123”), and it has been suggested that PII, when available, should be used to generate honeywords. We leverage these findings to build an HGT that accounts for any PII contained within passwords: it generates honeywords with GPT-3 from the semantic chunks of the corresponding real passwords, and is more robust than its counterparts in the literature. Furthermore, we propose a new metric, HWSimilarity, to evaluate the capability of HGTs. HWSimilarity is a similarity metric based on a pre-trained language model that considers the semantic meaning of passwords when measuring the indistinguishability of honeywords from their real counterparts. Comparing our chunk-level GPT-3 HGT with two state-of-the-art HGTs and with GPT-3 used alone, we show that our HGT generates honeywords that are harder to distinguish from real passwords than those of its counterparts.
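To make the chunk-level idea concrete, the following is a minimal sketch, not the chunking algorithm or prompt used in the paper. It assumes a naive rule that splits a password into alternating runs of digits and non-digits, and it only builds the text of a prompt that could be sent to a language model. The names `split_chunks` and `build_prompt`, and the prompt wording, are hypothetical.

```python
import re
from typing import List


def split_chunks(password: str) -> List[str]:
    """Naive chunker (an assumption): alternating runs of digits and non-digits."""
    return re.findall(r"\d+|\D+", password)


def build_prompt(password: str, n: int = 5) -> str:
    """Build a hypothetical prompt asking a language model for n honeywords
    that follow the same chunk structure as the real password."""
    chunks = split_chunks(password)
    chunk_list = ", ".join(f'"{c}"' for c in chunks)
    return (
        f"Generate {n} passwords similar to one made of the chunks "
        f"{chunk_list}, keeping the same number and order of chunks."
    )


print(split_chunks("P@ssword123"))  # ['P@ssword', '123']
print(build_prompt("P@ssword123", n=3))
```

A chunker that recognizes semantic units such as names or dates would split passwords differently; the digit-based rule here only reproduces the two-chunk example given in the abstract.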
