In recent years, there has been a noticeable trend toward targeted threats to information security, in which attackers exploit vulnerabilities and risks in widely used services for financial gain. In response, companies implement numerous precautions and carry them out consistently. One area that requires such precautions is the network devices in use. Network devices in computer networks are capable of logging events, and these logs make it possible to identify security events on the network and to take appropriate countermeasures. Various security measures can be applied to this data; one of them is Security Information and Event Management (SIEM), a system that collects and analyzes data from network and security devices. SIEM consolidates critical information within a cohesive structure and allows events from different security devices to be correlated, thereby improving the monitoring capabilities of cybersecurity operations centers. This study extensively covers the relationship between critical infrastructure and SIEM, current studies, critical infrastructure, cybersecurity policies, and SIEM itself. Our system design was developed using the UNSW\_NB15 dataset, which is widely used in cybersecurity research because of its comprehensive and realistic representation of cyber threats. The dataset consists of network traffic covering various attack activities as well as realistic modern normal behavior, making it particularly relevant to our study. In total, ten categories were analyzed: nine attack types, namely Analysis, Backdoor, DoS, Exploits, Fuzzers, Generic, Reconnaissance, Shellcode, and Worms, plus Normal activity. The study has two basic parts: the first set of experiments was carried out on Google Colaboratory, and further experiments were then carried out in Weka. Classifications were performed using Logistic Regression (LR), Extra Trees (XT), Support Vector Machines (SVM), Random Forest (RF), and Decision Trees (DT); these methods were chosen for their proven effectiveness in similar studies. In the application developed with Google Colaboratory, we achieved 98.62\% with Random Forest, 99.10\% with Decision Trees, 98.87\% with Logistic Regression, 95.13\% with Extra Trees, and 99.12\% with Support Vector Machines. In the experiments carried out in Weka, we achieved 92.05\% with Random Forest, 100\% with Decision Trees, 100\% with k-Nearest Neighbours, 100\% with J48, 99.19\% with Naive Bayes, and 99.35\% with BayesNet.
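As an illustration only, and not the code used in the study, the following minimal sketch shows how such a multi-class classification experiment on the UNSW\_NB15 dataset might be set up with scikit-learn. The CSV file name and the "attack_cat"/"label" column names are assumptions based on common distributions of the dataset, and Random Forest stands in here for the several classifiers that were compared.

import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.preprocessing import LabelEncoder

# Load a CSV export of UNSW_NB15 (file name is an assumption).
df = pd.read_csv("UNSW_NB15_training-set.csv")

# "attack_cat" is assumed to hold the ten categories (nine attack types plus Normal).
y = df["attack_cat"].astype(str)
X = df.drop(columns=["id", "attack_cat", "label"], errors="ignore")

# Encode categorical columns (e.g. protocol, service, state) as integers.
for col in X.select_dtypes(include="object").columns:
    X[col] = LabelEncoder().fit_transform(X[col].astype(str))

# Hold out a stratified test split and train one of the compared classifiers.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
clf = RandomForestClassifier(n_estimators=100, random_state=42)
clf.fit(X_train, y_train)
print("Accuracy:", accuracy_score(y_test, clf.predict(X_test)))

The other classifiers mentioned in the study (Logistic Regression, Extra Trees, SVM, Decision Trees) could be substituted for RandomForestClassifier in the same pipeline.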