Abstract

As an important part of malware detection and classification, sequence-based analysis can be integrated into dynamic detection system for real-time detection. This work presents a novel learning method for malware detection models that leverages advances in graph embedding for fusing the n-gram data into a one-hot feature space with different transmission directions. By capturing the information flow, our method finds a better feature representation for detection tasks with rely solely on sequence information. To enhance the stability of feature representation, this work adopts a multi-task learning strategy which achieves better performance in independent testing. We evaluate our method on two different realworld datasets and compare it against four superior malware detection models. During malware detection using our method, we conducted in-depth discussions on feature length, graph embedding direction, model depth, and different multi-task learning strategies. Experimental and discussion results show that our method significantly outperforms alternative approaches across evaluation settings.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call