Abstract

AbstractRecent developments in the field of Internet of things (IoT) have aroused growing attention to the security of smart devices. Specifically, there is an increasing number of malicious software (Malware) on IoT systems. Nowadays, researchers have made many efforts concerning supervised machine learning methods to identify malicious attacks. High‐quality labels are of great importance for supervised machine learning, but noises widely exist due to the non‐deterministic production environment. Therefore, learning from noisy labels is significant for machine learning‐enabled Malware identification. In this study, motivated by the symmetric cross entropy with satisfactory noise robustness, the authors propose a robust Malware identification method using temporal convolutional network (TCN). Moreover, word embedding techniques are generally utilised to understand the contextual relationship between the input operation code (opcode) and application programming interface function names. Here, considering the numerous unlabelled samples in real‐world intelligent environments, the authors pre‐train the TCN model on an unlabelled set using a word embedding method, that is, Word2Vec. In the experiments, the proposed method is compared with several traditional statistical methods and more recent neural networks on a synthetic Malware dataset and a real‐world dataset. The performance comparisons demonstrate the better performance and noise robustness of their proposed method, especially that the proposed method can yield the best identification accuracy of 98.75% in real‐world scenarios.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call