Abstract

In recent years, computing devices have become widely distributed, and the data accumulated from these devices are growing rapidly, especially as the devices are increasingly equipped with various sensors and RF communication capabilities. Data science, including machine learning technology, has contributed to better handling of these large amounts of data, and feature selection has been a useful strategy. As data volumes continue to grow, scaling features will become crucial in data science. In this paper, we propose a novel filter-based feature-selection method in the context of keystroke dynamics authentication. In particular, we propose a new feature-scoring method and apply it to keystroke-dynamics-based authentication. We implement keystroke-dynamics-based authentication as an additional factor alongside PIN-based authentication and collect data from experiments with actual users. We then apply our feature-selection method and compare its performance with that obtained using all of the features and with that of existing feature-selection methods. Our experimental results show that the classification performance of the proposed method exceeds that of the other methods by up to 21.8%. Moreover, our method protects other users’ data sets, because it uses only mean values from imposter data. Our feature-selection method thus improves the quality of keystroke dynamics authentication without raising user privacy issues. More generally, our method can also be applied to other data-mining data sets, such as IoT sensor data sets.
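To make the idea of filter-based feature scoring concrete, the following is a minimal sketch and not the scoring formula proposed in the paper, which the abstract does not state. It assumes one plausible score: the distance between the genuine user's mean and the imposter mean for each feature, divided by the user's own standard deviation. The function names `score_features` and `select_top_k` and the timing values are hypothetical. As in the abstract, only per-feature mean values of the imposter data are needed.

```python
from statistics import mean, stdev


def score_features(user_samples, imposter_means):
    """Score each feature by how far the imposter mean lies from the
    genuine user's mean, in units of the user's own standard deviation.

    NOTE: this scoring rule is an illustrative assumption, not the
    paper's method. Only imposter *means* are used, never raw samples.
    """
    scores = []
    for j in range(len(imposter_means)):
        column = [row[j] for row in user_samples]
        spread = stdev(column) or 1e-9  # guard against zero spread
        scores.append(abs(mean(column) - imposter_means[j]) / spread)
    return scores


def select_top_k(scores, k):
    """Return the indices of the k highest-scoring features, best first."""
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return ranked[:k]


# Hypothetical keystroke timings in milliseconds: one row per login
# attempt by the genuine user, one column per timing feature.
user_samples = [
    [100, 250, 80],
    [110, 240, 120],
    [90, 260, 100],
]
# Hypothetical per-feature means computed over imposter attempts.
imposter_means = [150, 252, 105]

scores = score_features(user_samples, imposter_means)
selected = select_top_k(scores, k=2)
print(selected)
```

Because the score is computed per feature and independently of any classifier, this is a filter method in the usual sense: the selected feature indices can be passed to whichever classifier is used afterwards.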

Highlights

  • Data science refers to all the procedures and techniques related to data, from collecting data to producing meaningful information from it

  • Data science for the Internet of Things (IoT) involves large amounts of data, because small computing devices such as smartphones and IoT devices are widely distributed and their embedded sensors rapidly collect large amounts of data

  • As the dimensionality of the collected data grows, learning slows and it becomes difficult to obtain appropriate results. This problem is called the curse of dimensionality [4], and many feature-selection studies have sought to solve it by lowering the dimensionality of the data and extracting features that represent the characteristics of the collected data [5]–[10]


