Abstract

The growing number and complexity of Cyber security attacks in recent years have made text analysis and data mining important techniques for discovering the features of such attacks and detecting future security threats. In this paper, we report the results of a recent case study that analysed a community data set collected from five small and medium companies in Korea. The data set records Cyber security incidents and the response actions taken. The study investigated problems concerned with predicting response actions to future incidents from the features of past incidents. Our analysis is based on text mining methods, such as n-grams and bag-of-words, and on machine learning algorithms for classifying incidents and their response actions. Based on the results of the study, we also suggest an experience-sharing model, which we use to demonstrate how companies in a collaborative environment may share their trained classifiers without sharing their individual data sets.
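
To make the pipeline named in the abstract more concrete, the following is a minimal sketch of an n-gram bag-of-words representation feeding a text classifier that predicts a response action from an incident description. It is an illustration under stated assumptions, not the study's implementation: the abstract does not name the algorithms used, so multinomial Naive Bayes is swapped in here as a simple stand-in, and the incident texts, the response labels and the names ngram_bag and NaiveBayes are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def ngram_bag(text, n_max=2):
    """Return a bag (Counter) of word n-grams for n = 1..n_max."""
    tokens = text.lower().split()
    bag = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            bag[" ".join(tokens[i:i + n])] += 1
    return bag


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (assumed stand-in classifier)."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)          # documents per class
        self.feature_counts = defaultdict(Counter)   # n-gram counts per class
        self.vocab = set()
        for text, label in zip(texts, labels):
            bag = ngram_bag(text)
            self.feature_counts[label].update(bag)
            self.vocab.update(bag)
        return self

    def predict(self, text):
        bag = ngram_bag(text)
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, float("-inf")
        for label, doc_count in self.class_counts.items():
            counts = self.feature_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(doc_count / total_docs)  # log prior
            for feature, freq in bag.items():
                if feature in self.vocab:             # skip unseen n-grams
                    score += freq * math.log((counts[feature] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy data: past incident descriptions and their response actions.
incidents = [
    "malware detected on employee laptop",
    "trojan malware detected in email attachment",
    "repeated failed login attempts from external address",
    "port scan from external address blocked by firewall",
]
responses = ["quarantine", "quarantine", "block_ip", "block_ip"]

model = NaiveBayes().fit(incidents, responses)
prediction = model.predict("malware detected in attachment")
print(prediction)  # quarantine
```

In this sketch, the fitted model holds only aggregated n-gram counts per response action rather than the original incident texts. That is one simple way to picture how a trained classifier might be exchanged between companies without exchanging the underlying data set, although the experience-sharing model proposed in the paper may work differently.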

Highlights

  • The increasing amount and complexity of Cyber security attacks in recent years have brought data mining to the attention of researchers and experts as an important technique for detecting such attacks through the analysis of data and side effects left by malware and spyware programs and by incidents of network and host intrusions

  • Text analysis and mining is widely used in many Cyber security areas, such as malware detection and classification (e.g. [14,15,16,18,22,24,27,29,33]) and malicious code detection (e.g. [7,30,31])

  • We present the results of applying each of the four machine learning algorithms to each of the five companies' data sets, in order to predict the outcome of the questions we posed in the previous section


Summary

Introduction

The increasing amount and complexity of Cyber security attacks in recent years have brought data mining to the attention of researchers and experts as an important technique for detecting such attacks through the analysis of data and side effects left by malware and spyware programs and by incidents of network and host intrusions. Text mining and analysis have been used for predicting links [8] and for detecting leaks of confidential data [28] and the unintentional sharing of private health information [17], as well as in classical areas such as digital forensics [12], electronic communications analysis [36] and Web text analysis [19]. Despite this popularity, the Cyber security community has remained somewhat reluctant to adopt an open approach to security-related data, due to many factors. Recently, this trend has started to turn with the arrival of large, open security data sets and data-sharing platforms backed by the reliability and reputation of well-established organisations in the area of Cyber security. Examples include VCDB [34], CERT’s Knowledgebase at Carnegie Mellon University [10], SecRepo [26], CAIDA [9] and others. More national-level Cyber security hubs have been set up all over the world for the same purposes.

