Investigating optimal features in log files for anomaly detection using optimization approach

Shivaprakash Ranga, Nageswara Guptha Mohankumar

doi:10.11591/ijai.v13.i1.pp287-295

Open Access

https://doi.org/10.11591/ijai.v13.i1.pp287-295

Abstract

Logs are frequently utilised in software system administration activities. The number of logs has risen dramatically due to the vast scope and complexity of current software systems. Much research has been done on log-based anomaly identification using machine learning approaches. In this paper, we propose an optimization approach to select the optimal features from the logs. This yields higher classification accuracy on reduced log files. Anomalies are predicted in three phases: i) log representation, ii) feature selection, and iii) performance evaluation. The efficacy of the proposed model is evaluated using benchmark datasets such as BlueGene/L (BGL), Thunderbird, Spirit, and Hadoop distributed file system (HDFS) in terms of accuracy, converging ability, train and test accuracy, receiver operating characteristic (ROC) measures, precision, recall, and F1-score. The results show that feature selection on log files performs better in terms of all the evaluation measures.
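
The three phases named in the abstract can be illustrated with a minimal sketch. The abstract does not specify the optimizer or the classifier, so this example substitutes greedy forward selection and a nearest-centroid classifier purely as stand-ins; it is not the authors' method. All function names, the event templates, and the synthetic sessions are hypothetical and are not taken from BGL, Thunderbird, Spirit, or HDFS.

```python
# Illustrative sketch only: greedy forward selection and nearest-centroid
# classification are assumptions, not the optimizer or model from the paper.

def count_vector(session, templates):
    """Phase i (log representation): event-count vector for one session."""
    return [session.count(t) for t in templates]

def predict(train_X, train_y, X, mask):
    """Nearest-centroid prediction using only the features enabled in mask."""
    idx = [i for i, keep in enumerate(mask) if keep]
    centroids = {}
    for label in sorted(set(train_y)):
        rows = [x for x, y in zip(train_X, train_y) if y == label]
        centroids[label] = [sum(r[i] for r in rows) / len(rows) for i in idx]
    preds = []
    for x in X:
        dist = {lab: sum((x[i] - c[k]) ** 2 for k, i in enumerate(idx))
                for lab, c in centroids.items()}
        preds.append(min(dist, key=dist.get))  # ties resolve to the lowest label
    return preds

def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def greedy_select(train_X, train_y, val_X, val_y):
    """Phase ii (feature selection): add one feature at a time while
    validation accuracy strictly improves."""
    n = len(train_X[0])
    mask, best, improved = [False] * n, 0.0, True
    while improved:
        improved, best_i = False, None
        for i in range(n):
            if mask[i]:
                continue
            trial = mask[:]
            trial[i] = True
            acc = accuracy(val_y, predict(train_X, train_y, val_X, trial))
            if acc > best:
                best, best_i, improved = acc, i, True
        if improved:
            mask[best_i] = True
    return mask, best

def precision_recall_f1(y_true, y_pred):
    """Phase iii (performance evaluation) for the anomaly class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    prec = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return prec, rec, f1

# Hypothetical event templates and synthetic sessions (label 1 = anomalous).
templates = ["open", "read", "write", "close", "error"]

def make_split(ks):
    X, y = [], []
    for k in ks:
        normal = ["open"] + ["read"] * (k % 3 + 1) + ["write"] * (k % 2) + ["close"]
        X.append(count_vector(normal, templates))
        y.append(0)
        X.append(count_vector(normal + ["error"] * 2, templates))
        y.append(1)
    return X, y

train_X, train_y = make_split(range(0, 8))
val_X, val_y = make_split(range(8, 12))

mask, val_acc = greedy_select(train_X, train_y, val_X, val_y)
selected = [t for t, keep in zip(templates, mask) if keep]
precision, recall, f1 = precision_recall_f1(
    val_y, predict(train_X, train_y, val_X, mask))
print(selected, val_acc, precision, recall, f1)
```

In this toy data only the `error` count differs between normal and anomalous sessions, so the selection keeps that single feature. Real log datasets would need a log parser to extract templates and a held-out test split separate from the one used for selection.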