Abstract
This paper explores an approach to identifying the maximal cluster of similar hosts in a network, based on the proposed data fusion and clustering algorithms. The data fusion algorithm combines the cross-testing F1-measure matrices of three supervised machine learning algorithms to quantify the similarity of the hosts with regard to intrusion detection. Given these host similarities, the clustering algorithm identifies the maximal cluster. Applied to the experimental data set, the approach identified a maximal cluster of 5 hosts out of the 16 hosts in a network. Maximal cluster identification based on the data fusion and machine learning algorithms can detect similar anomalous behaviors that the same hacking mechanism generates on multiple machines in a network during the same time period. Furthermore, integrating the learning models generated on multiple machines or clusters could yield a robust detection model without a time-consuming training process over all the network flows.
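The two-stage pipeline summarized above can be illustrated with a minimal sketch. The abstract does not specify the fusion rule or the clustering procedure, so everything below is an assumption made for illustration: the three cross-testing F1 matrices are fused by a simple average, the result is symmetrized by taking the weaker direction, and the maximal cluster is taken to be the largest group of hosts whose pairwise fused similarity meets a hypothetical `threshold`. The function names `fuse_f1_matrices` and `maximal_cluster` are likewise hypothetical and are not the authors' algorithms.

```python
from itertools import combinations

import numpy as np


def fuse_f1_matrices(matrices):
    """Fuse per-algorithm cross-testing F1 matrices into one similarity matrix.

    Assumption: matrices[k][i][j] is the F1-measure obtained when the model
    trained on host i with algorithm k is tested on host j's traffic.
    Assumption: fusion is a plain average across the algorithms.
    """
    fused = np.mean(np.asarray(matrices, dtype=float), axis=0)
    # Two hosts count as similar only if the models transfer in both
    # directions, so keep the weaker of the (i -> j) and (j -> i) scores.
    return np.minimum(fused, fused.T)


def maximal_cluster(similarity, threshold):
    """Return the largest group of hosts that are all pairwise similar.

    Brute-force search from the largest group size down; this is only
    practical for small networks (for example, 16 hosts).
    """
    n = similarity.shape[0]
    similar = similarity >= threshold
    for size in range(n, 0, -1):
        for group in combinations(range(n), size):
            if all(similar[a, b] for a, b in combinations(group, 2)):
                return list(group)
    return []
```

Calling `maximal_cluster(fuse_f1_matrices([m1, m2, m3]), threshold=0.8)` on three host-by-host matrices returns the indices of the hosts in the largest mutually similar group. The averaging rule, the symmetrization, and the threshold value would all need to be replaced by the definitions given in the body of the paper.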