Binary Classification of Multi-Attribute Labeled Anomalous Events in Computer Systems Using the SVDD Algorithm

Oleg I Sheluhin, Dmitriy I Rakovskiy

doi:10.36724/2409-5419-2021-13-2-74-84

Abstract

Introduction: The volume of system logs produced by computer systems integrated into a distributed network infrastructure now makes manual real-time inspection impossible. Typically, each log record contains the numeric value of an observed attribute and a flag marking the record as normal or anomalous. The Support Vector Data Description (SVDD) algorithm demonstrates high classification accuracy even with a small training sample. However, the algorithm works with a multi-attribute dataset in which each observation carries a single classifying label. Consequently, the set of per-attribute labels in the source data must be reduced to one label for the entire observation.

Purpose: to investigate the binary classification accuracy of the SVDD algorithm on experimental data with a small training sample, given that the data are labeled separately for each attribute.

Methods: two approaches are proposed for reducing the per-attribute labels to a single observation label: the "completely normal observation" approach and majority voting. Two types of data are considered: ordered in time and uniformly shuffled. Classification accuracy was assessed by the area under the ROC curve with cross-validation for different numbers of attributes.

Results: a comparative analysis of the observation labeling methods showed the advantage of the "completely normal observation" approach over the unweighted "majority vote" approach. Classification accuracy on shuffled data is 7% higher than on time-ordered data. The accuracy of the algorithm was investigated for different numbers of attributes using the "completely normal observation" approach. The maximum classification accuracy achieved was about 96%, obtained with 6 attributes and uniform shuffling of the input dataset. A further increase in the number of attributes reduces the average classification accuracy because the proportion of anomalous observations grows. It is also shown that uniform shuffling of the input data can yield an accuracy gain of 15–20%.

Practical relevance: the algorithm's consumption of computing resources grows exponentially as the amount of input data increases.

Discussion: to achieve maximum classification accuracy with acceptable resource consumption, it is necessary to form a compact input dataset that most fully reflects the functioning of the computer system in normal mode.
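The two label-reduction approaches named in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact rules, so the following are assumptions made only for illustration: flags are encoded as 0 (normal) and 1 (anomalous), "completely normal observation" is read as "normal only if every attribute is normal", and ties in the unweighted majority vote are resolved as anomalous. The function names are hypothetical and are not taken from the paper.

```python
from typing import Callable, List, Sequence

# Assumed encoding of the per-attribute flags (not specified in the abstract).
NORMAL, ANOMALY = 0, 1


def label_completely_normal(flags: Sequence[int]) -> int:
    """Mark the observation normal only if every attribute flag is normal."""
    return ANOMALY if any(f == ANOMALY for f in flags) else NORMAL


def label_majority_vote(flags: Sequence[int]) -> int:
    """Unweighted majority vote; a tie is treated as anomalous (assumption)."""
    anomalous = sum(1 for f in flags if f == ANOMALY)
    return ANOMALY if 2 * anomalous >= len(flags) else NORMAL


def reduce_labels(rows: Sequence[Sequence[int]],
                  rule: Callable[[Sequence[int]], int]) -> List[int]:
    """Collapse each row of per-attribute flags into one observation label."""
    return [rule(row) for row in rows]


# Toy data: four observations, four attribute flags each.
rows = [
    [0, 0, 0, 0],  # all attributes normal
    [0, 1, 0, 0],  # one anomalous attribute
    [1, 1, 0, 1],  # three anomalous attributes
    [1, 1, 0, 0],  # tie: two normal, two anomalous
]

strict = reduce_labels(rows, label_completely_normal)
voted = reduce_labels(rows, label_majority_vote)
print(strict, voted)
```

Under the strict rule as assumed here, adding an attribute can only keep or raise the number of observations labeled anomalous, which is consistent with the abstract's remark that the proportion of anomalous observations grows with the number of attributes.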

Journal: H&ES Research
Publication Date: Jan 1, 2021
Citations: 1

Similar Papers

Thruster Fault Identification for Autonomous Underwater Vehicle Based on Time-Domain Energy and Time-Frequency Entropy of Fusion Signal
Baoji Yin ... Xi Lin
01 Jan 2019

A novel method based on physicochemical properties of amino acids and one class classification algorithm for disease gene identification
Abdulaziz Yousef ... Nasrollah Moghadam Charkari
Journal of Biomedical Informatics | VOL. 56
02 Jul 2015

Effectiveness of Support Vector Machines in Medical Data mining
Padmavathi Janardhanan ... Fathima Sabika
Journal of Communications Software and Systems | VOL. 11
23 Mar 2015

An empirical comparison of different approaches for combining multimodal neuroimaging data with support vector machine.
William Pettersson-Yeo ... Paul Allen
Frontiers in neuroscience | VOL. 8
15 Jul 2014
