The Homology Determination System for APT Samples Based on Gene Maps

Rui-Chao Xu, Yue-Bin Di, Zeng Shou, Xiao Ma, He-Qiu Chai, Long Yin

doi:10.13052/jcsm2245-1439.1348

Abstract

At present, there are few homology determination methods for advanced persistent threat (APT) sample detection, and most existing schemes suffer from high cost, low accuracy, and difficulty in identifying unknown APT samples. We therefore propose a homology determination system for APT samples based on gene maps, which integrates deep learning with gene maps. First, we extract software gene features from the samples uploaded by the user and apply the TF-IDF algorithm to clean the extracted software genes. Next, the Word2Vec algorithm is used to vectorize all the genes and construct the gene sample vectors, and an LSTM-based classifier is used to detect APT attack samples. Finally, the K-nearest neighbor algorithm is used to determine the homology of gene-sharing APT samples. This paper gives the detailed construction process of the scheme, including APT sample gene extraction, cleaning, clustering, sample detection, and homology determination. In experimental validation, our model achieves an accuracy of 95%, a precision of 94%, and a recall of 95%, outperforming existing methods. These results underscore the model’s efficiency and accuracy and indicate its potential for application in the field of cybersecurity.
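As a rough illustration of two of the stages named in the abstract, the following minimal sketch shows TF-IDF-based gene cleaning and a K-nearest neighbor vote over gene sample vectors. It is not the authors' implementation: the gene names, group labels, two-dimensional vectors, threshold `min_score`, and the function names `tfidf_filter` and `knn_family` are all hypothetical, and the Word2Vec and LSTM stages are omitted, so the toy vectors merely stand in for real gene sample vectors.

```python
import math
from collections import Counter


def tfidf_filter(samples, min_score):
    """Keep, per sample, only the genes whose TF-IDF score reaches min_score.

    samples is a list of gene lists (one list per sample). Genes that occur
    in every sample get an IDF of log(1) = 0 and are therefore dropped for
    any positive threshold.
    """
    n = len(samples)
    # Document frequency: in how many samples does each gene occur?
    df = Counter(g for genes in samples for g in set(genes))
    cleaned = []
    for genes in samples:
        tf = Counter(genes)
        total = len(genes)
        keep = [
            g for g in genes
            if (tf[g] / total) * math.log(n / df[g]) >= min_score
        ]
        cleaned.append(keep)
    return cleaned


def knn_family(query, labelled, k=3):
    """Return the majority label among the k labelled vectors nearest to query."""
    nearest = sorted(labelled, key=lambda item: math.dist(query, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical gene lists for three samples (assumption, for illustration only).
samples = [
    ["CreateFileA", "RegSetValueA", "xor_loop_17", "CreateFileA"],
    ["CreateFileA", "http_beacon_3", "xor_loop_17"],
    ["CreateFileA", "dns_tunnel_9"],
]
cleaned = tfidf_filter(samples, min_score=0.05)

# Hypothetical 2-D gene sample vectors with known group labels (assumption).
labelled = [
    ((0.9, 0.1), "group_A"),
    ((0.8, 0.2), "group_A"),
    ((0.1, 0.9), "group_B"),
    ((0.2, 0.8), "group_B"),
]
family = knn_family((0.85, 0.15), labelled, k=3)
```

Here the gene shared by every sample is filtered out as uninformative, and the query vector is assigned to the group that holds the majority among its three nearest neighbors. Euclidean distance and simple majority voting are assumptions of this sketch; the abstract does not state which distance metric or voting rule the system uses.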

Full Text