Abstract

This paper presents a novel malware-detection model that applies a convolutional recurrent neural network to opcode sequences. Statistically, an executable file is regarded as a set of consecutive machine codes. First, the theoretical foundation for using opcode sequences to detect malware is discussed. Next, an algorithm for extracting opcode sequences from executables is presented, together with a deep learning-based malware-detection method that takes those sequences as input. The proposed model comprises two stages: at the front end, an opcode-level convolutional autoencoder that compresses a long opcode sequence into a relatively short sequence of codes; at the rear end, a dynamic recurrent neural network classifier that performs the prediction task on the codes generated by the autoencoder. In experiments, the proposed model achieved a malware-detection accuracy of 96%, an area under the receiver operating characteristic curve (ROC-AUC) of 0.99, and a true positive rate (TPR) of 95%. The highest accuracy and TPR achieved by existing malware-detection methods using opcode sequences were 97% and 82%, respectively. Compared with these results, the proposed model delivers a slightly lower accuracy (96% versus 97%) but a considerably higher TPR (95% versus 82%). The proposed model is therefore capable of more reliable malware detection.
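
The trade-off between accuracy and TPR reported above can be illustrated with a small worked example. The following is a minimal sketch, not the paper's evaluation code: the confusion-matrix counts are hypothetical, chosen only so that the resulting percentages match the figures quoted in the abstract, and the 100-malware / 900-benign split is an assumption. It shows how a detector can score higher on accuracy while missing far more malware when benign samples dominate the test set.

```python
def accuracy(tp, fp, tn, fn):
    """Fraction of all samples (malware and benign) classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)


def true_positive_rate(tp, fn):
    """Fraction of actual malware samples that were detected (recall)."""
    return tp / (tp + fn)


# Hypothetical test set: 100 malware and 900 benign samples.
# These counts are invented for illustration; they are NOT from the paper.
detector_a = {"tp": 95, "fn": 5, "tn": 865, "fp": 35}   # misses 5 malware
detector_b = {"tp": 82, "fn": 18, "tn": 888, "fp": 12}  # misses 18 malware

acc_a = accuracy(**detector_a)
tpr_a = true_positive_rate(detector_a["tp"], detector_a["fn"])
acc_b = accuracy(**detector_b)
tpr_b = true_positive_rate(detector_b["tp"], detector_b["fn"])

print(f"A: accuracy={acc_a:.2f}, TPR={tpr_a:.2f}")
print(f"B: accuracy={acc_b:.2f}, TPR={tpr_b:.2f}")
```

Under these assumed counts, detector B has the higher accuracy because it raises fewer false alarms on the large benign class, yet it lets more than three times as many malware samples through. This is why TPR is often the more informative figure when the cost of a missed detection is high.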
