Subject of Research.The paper proposes an approach to the use of gradient boosted decision trees algorithm. For this purpose, CatBoost algorithm developed by Yandex is proposed. Its implementation is aimed at the problem solution of OS Linux software identification in order to reduce the number of system vulnerabilities, which occur due to the installation of unauthorized software by automated system users. We consider an approach to the program signatures formation and further training of CatBoostClassifier classifier model. The subsequent recognition task is set for the identified programs that were not previously involved in the model training process. Method. Free CatBoost software was used for implementation of the gradient boosted decision trees algorithm. CatBoostClassifier multi-classification model was created on its basis. The use of this model allows identifying test sample elf-files.Main Results. The training parameters of the classification model are selected. An experiment is carried out to identify elf-files with the use of ten different featuresof emerging signature programs. The results obtained in the new approach are compared with the results of the previously developed method of identification based on the application of the statistical criterion of Chi-square homogeneity at the significance level p = 0.01. Practical Relevance. The results of the study can be recommended to information security specialists for data media audit. The developed approach gives the possibility to identify violations of the established security policy in the processing of confidential information.
Read full abstract