Abstract

Malicious apps are a serious threat to the Android platform. In this paper, we propose an effective and automatic malware detection method based on the text semantics of network traffic. In particular, we treat each HTTP flow generated by a mobile app as a text document, which can be processed with natural language processing (NLP) techniques to extract text-level features. These features are then used to build a malware detection model. We analyze the header of each traffic flow using the N-gram method from NLP. We then propose an automatic feature selection algorithm based on the Chi-square test, which determines whether there is a significant association between two variables, to identify meaningful features. Our contributions are as follows. We propose a novel solution that performs malware detection with NLP methods by treating mobile traffic as documents. We apply an automatic feature selection algorithm to N-gram sequences to obtain meaningful features from the semantics of traffic flows. Our method reveals some malware that evades detection by antivirus scanners. In addition, we design a detection system for traffic in institutional enterprise networks, home networks, and 3G/4G mobile networks. The system can be integrated with a connected computer to find suspicious network behavior.

Keywords: semi-supervised, machine learning approach, detection, Android platform.
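The feature pipeline described in the abstract (N-grams over flow headers, then Chi-square selection) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `word_ngrams`, `chi_square`, and `select_features`, the tokenization rule, the 2x2 contingency-table form of the test, and the example hosts are all assumptions made for illustration.

```python
import re
from collections import Counter


def word_ngrams(text, n):
    """Return the set of word-level n-grams found in one flow's header text.

    Tokenization (lowercase alphanumeric runs) is an assumption of this sketch.
    """
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return {" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def chi_square(a, b, c, d):
    """Chi-square statistic for a 2x2 contingency table.

    a: malicious flows containing the n-gram, b: benign flows containing it,
    c: malicious flows without it,            d: benign flows without it.
    """
    total = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        # The n-gram appears in every flow or in none, so it carries no signal.
        return 0.0
    return total * (a * d - b * c) ** 2 / denom


def select_features(flows, labels, n=1, top_k=5):
    """Rank n-grams by their Chi-square score against the flow label.

    labels: 1 for a flow from a malicious app, 0 for a benign one.
    """
    mal_total = sum(labels)
    ben_total = len(labels) - mal_total
    mal_counts, ben_counts = Counter(), Counter()
    for text, label in zip(flows, labels):
        target = mal_counts if label else ben_counts
        target.update(word_ngrams(text, n))

    scores = {}
    for gram in set(mal_counts) | set(ben_counts):
        a, b = mal_counts[gram], ben_counts[gram]
        scores[gram] = chi_square(a, b, mal_total - a, ben_total - b)
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]


# Hypothetical toy flows: two from a malicious app, two from a benign one.
flows = [
    "GET /gate.php?imei=123 HTTP/1.1 Host: c2.example.net",
    "POST /gate.php HTTP/1.1 Host: c2.example.net",
    "GET /index.html HTTP/1.1 Host: news.example.org",
    "GET /logo.png HTTP/1.1 Host: news.example.org",
]
labels = [1, 1, 0, 0]

for gram, score in select_features(flows, labels, n=1, top_k=5):
    print(f"{gram}\t{score:.2f}")
```

In this toy data, tokens that occur in every flow (such as `http` or `host`) score zero, while tokens that occur in only one class score highest. The selected N-grams would then serve as the feature vocabulary for whatever classifier is trained downstream.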
