Abstract

Machine learning has largely applied to malware detection and classification, due to the ineffectiveness of signature-based method toward rapid malware proliferation. Although state-of-the-art machine learning models tend to achieve high performances, they require a large number of training samples. It is infeasible to train machine learning models with sufficient malware samples while facing newly appeared malware variants. Therefore, it is important for security protectors to train a model given a small set of data, which can identify malware variants based on the similarity function. In addition, security protectors should keep re-training the models on newly-found samples, while the typical machine learning models based on massive data are not efficient for the instant update. Inspired by recent success using Siamese neural networks for one-shot image recognition, we aim to apply the networks to malware image classification task. The implementation includes three main stages: pre-processing, training, and testing. In the pre-processing stage, the system transforms malware samples to the resized gray-scale images and classifies them by average hash in the same family. In the training and testing stages, Siamese networks are trained to rank similarity between samples and the accuracy is calculated through N-way one-shot tasks. The experiment results showed that our networks outperformed the baseline methods. Besides, this paper indicated that our networks were more suitable for malware image one-shot learning than typical deep learning models.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.