Recently, as the non-face-to-face society persists due to the coronavirus (COVID-19), the Internet usage rate continues to increase, and input devices, such as keyboards and mice, are mainly used to authenticate users in non-face-to-face environments. Due to the nature of the non-face-to-face environment, important personal data are processed, and since these personal data include authentication information, it is very important to protect them. As such, personal information, including authentication information, is entered mainly from the keyboard, and attackers use attack tools, such as keyloggers, to steal keyboard data in order to grab sensitive user information. Therefore, to prevent disclosure of sensitive keyboard input, various image-based user authentication technologies have emerged that allow sensitive information, such as authentication information, to be entered via mouse. To address mouse data stealing vulnerabilities via GetCursorPos() function or WM_INPUT message, which are representative mouse data attack techniques, a mouse data defense technique has emerged that prevents attackers from classifying real mouse data and fake mouse data by the defender generating fake mouse data. In this paper, we propose a mouse data attack technique using machine learning against a mouse data defense technique using the WM_INPUT message. The proposed technique uses machine learning models to classify fake mouse data and real mouse data in a scenario where the mouse data defense technique, utilizing the WM_INPUT message in image-based user authentication, is deployed. This approach is verified through experiments designed to assess its effectiveness in preventing the theft of real mouse data, which constitute the user’s authentication information. For verification purposes, a mouse data attack system was configured, and datasets for machine learning were established by collecting mouse data from the configured attack system. To enhance the performance of machine learning classification, evaluations were conducted based on data organized according to various machine learning models, datasets, features, and generation cycles. The results, highlighting the highest performance in terms of features and datasets were derived. If the mouse data attack technique proposed in this paper is used, attackers can potentially steal the user’s authentication information from various websites or services, including software, systems, and servers that rely on authentication information. It is anticipated that attackers may exploit the stolen authentication information for additional damages, such as voice phishing. In the future, we plan to conduct research on defense techniques aimed at securely protecting mouse data, even if the mouse data attack technique proposed in this paper is attempted.
Read full abstract