Abstract

The rapid development of malicious software poses severe threats to computer and Internet security, motivating anti-malware vendors and researchers to develop novel methods capable of protecting users against new threats. Existing malware detectors mostly treat file samples independently using supervised learning algorithms. However, ignoring the relationships among file samples limits the capability of malware detectors. In this paper, we present FindMal (File-to-File Social Network based Malware Detection Framework), a new malware detection framework built on a file-to-file social network that combines graph-based feature extraction, a Label Propagation algorithm, and an active learning strategy. The nearest neighbors of each file node are first chosen as its adjacent nodes to construct a kNN file relation graph. Three file relation graph features are proposed to select representative file samples for labeling. The Label Propagation algorithm, which propagates label information from labeled file samples to unlabeled ones, is then applied to learn the probability that an unknown file is malicious or benign. A batch-mode active learning method is employed to reduce the labeling cost and improve the performance of Label Propagation. Comprehensive experiments are performed on a large-scale real-world dataset obtained from an anti-malware company. The results demonstrate that FindMal outperforms existing detection models in classifying file samples.
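
The two core steps named above, building a kNN file relation graph and propagating labels over it, can be illustrated with a minimal sketch. This is not the FindMal implementation: the Euclidean distance, the unweighted symmetric edges, the clamped iterative update, and the names `knn_graph` and `label_propagation` are all assumptions chosen for illustration, and the toy feature vectors are made up.

```python
import numpy as np

def knn_graph(X, k):
    """Build a symmetric, unweighted kNN adjacency matrix (Euclidean distance assumed)."""
    n = X.shape[0]
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)              # a node is never its own neighbor
    nearest = np.argsort(d, axis=1)[:, :k]   # indices of the k closest nodes per row
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), nearest.ravel()] = 1.0
    return np.maximum(W, W.T)                # make every edge undirected

def label_propagation(W, labels, n_iter=100):
    """Propagate labels over the graph; returns a score in [0, 1] per node.

    labels: 1 = malicious, 0 = benign, -1 = unlabeled (hypothetical encoding).
    """
    labeled = labels >= 0
    deg = W.sum(axis=1, keepdims=True)
    P = W / np.maximum(deg, 1e-12)           # row-normalized transition matrix
    f = np.where(labeled, labels, 0.5).astype(float)
    for _ in range(n_iter):
        f = P @ f                            # each node averages its neighbors' scores
        f[labeled] = labels[labeled]         # clamp the known labels
    return f

# Toy data: two well-separated groups of "files", one labeled sample in each.
X = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], dtype=float)
labels = np.array([1, -1, -1, 0, -1, -1])
scores = label_propagation(knn_graph(X, k=2), labels)
print(np.round(scores, 3))
```

In this toy setup each unlabeled node ends up with the score of the labeled node in its own group, because the kNN graph splits into two disconnected components. A batch-mode active learning step would then pick further nodes to label; that selection rule is not sketched here.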
