Network security demands careful attention: data and other important information must be protected whenever we communicate with the outside world over the internet. Many past attacks have threatened huge losses worldwide. Over the years, a great deal of work has been done in this area; many security systems have been developed, and many security algorithms with strong fundamentals have been implemented. These systems have also proven robust during serious security incidents and attacks. However, a problem remains: correctly identifying a malicious node entering the network, without false positives or false negatives, and quickly enough that the malicious node (also called an intruder) cannot gain any access to the network. We need a system that operates quickly even under heavy incoming network traffic, correctly identifies intruders, and gives non-intruders (normal connecting nodes) easy, fast network access by correctly identifying them as normal. In this research work, we implement an Intrusion Detection System (IDS) model that performs quickly and correctly classifies communicating nodes as intruders or normal users. The model is implemented in two phases, hence the name Two Phase Intrusion Detection System, abbreviated TP-IDS. We use machine learning techniques to develop this model, which makes it a strongly secure IDS and helps it better identify known as well as unknown attacks. The first phase performs identification and the second phase validates the first-phase results, which provides high accuracy.
In the first phase of the IDS we use Support Vector Machine (SVM) and k-Nearest Neighbor (kNN) classifiers; in the second phase we use Decision Tree (DT) and Naive Bayes (NB) classifiers to validate the first-phase results. Efficiency is handled by the underlying data processing infrastructure, Hadoop, which increases the speed of the TP-IDS.
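
The two-phase flow can be illustrated with a minimal Python sketch. The abstract does not say how the classifiers within a phase are combined or how the second phase validates the first, so the rules here are assumptions: a phase reports an intruder if any of its classifiers does, and a result counts as validated when both phases reach the same label. The names `tp_ids`, `phase_label`, `Verdict`, and the `*_stub` functions are hypothetical, and the stubs are simple threshold rules standing in for trained SVM, kNN, DT, and NB models.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

INTRUDER, NORMAL = "intruder", "normal"

# A classifier maps a feature vector to a label. Real SVM, kNN, DT and NB
# models would be trained elsewhere; here they are plain callables.
Classifier = Callable[[Sequence[float]], str]


@dataclass
class Verdict:
    label: str       # label assigned by phase one
    validated: bool  # True when phase two reaches the same label


def phase_label(classifiers: Sequence[Classifier],
                features: Sequence[float]) -> str:
    """Assumed rule: a phase reports 'intruder' if any classifier does."""
    votes = [clf(features) for clf in classifiers]
    return INTRUDER if INTRUDER in votes else NORMAL


def tp_ids(phase_one: Sequence[Classifier],
           phase_two: Sequence[Classifier],
           features: Sequence[float]) -> Verdict:
    """Phase one identifies; phase two validates by agreement (assumed)."""
    first = phase_label(phase_one, features)
    second = phase_label(phase_two, features)
    return Verdict(label=first, validated=(first == second))


# Hypothetical threshold rules standing in for trained models.
def svm_stub(x: Sequence[float]) -> str:
    return INTRUDER if x[0] > 0.8 else NORMAL


def knn_stub(x: Sequence[float]) -> str:
    return INTRUDER if x[1] > 0.5 else NORMAL


def dt_stub(x: Sequence[float]) -> str:
    return INTRUDER if x[0] + x[1] > 1.0 else NORMAL


def nb_stub(x: Sequence[float]) -> str:
    return INTRUDER if x[1] > 0.9 else NORMAL


phase_one = [svm_stub, knn_stub]  # identification
phase_two = [dt_stub, nb_stub]    # validation

if __name__ == "__main__":
    for features in [(0.9, 0.6), (0.1, 0.6), (0.1, 0.1)]:
        print(features, tp_ids(phase_one, phase_two, features))
```

Keeping the phase-one label and the validation flag separate lets a caller decide what to do when the phases disagree, for example holding the connection for further inspection; that policy is likewise an assumption and not something the abstract specifies.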