Abstract
Cybercriminals now tend to leverage dynamic malicious infrastructures made up of multiple servers to conduct attacks such as malware distribution and control. Compared with a single server, multiple servers make attacks more efficient and stealthy. Because infrastructure plays such an essential role, many approaches have been proposed to detect malicious servers. However, many existing methods target only individual servers and therefore fail to reveal the inter-server connections of an attack campaign. In this paper, we propose a complementary system, deMSF, to identify server flocks, which are formed by infrastructures involved in the same malicious campaign. Our solution first acquires server flocks by mining relations among servers along both spatial and temporal dimensions. We then extract semantic vectors of servers based on word2vec and build a textCNN-based flock classifier to recognize malicious flocks. We evaluate deMSF on real-world traffic collected from an ISP network. The results show that it achieves a precision of 99% with 90% recall.
Highlights
Malicious web activity is still a major threat to the Internet
We find that word embedding, a technique from natural language processing (NLP), is very helpful for learning features of servers
For a given server, we extract the top 10 servers most similar to it according to the semantic vectors and manually check whether they are similar in practice
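The top-10 lookup in the last highlight can be illustrated with a minimal sketch. It assumes cosine similarity as the measure (the text above does not name one) and uses hypothetical toy vectors and server names; it is not the authors' implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_u * norm_v)


def top_k_similar(query, vectors, k=10):
    """Return the k (name, score) pairs most similar to `query`, best first."""
    scores = [
        (name, cosine(vectors[query], vec))
        for name, vec in vectors.items()
        if name != query
    ]
    scores.sort(key=lambda item: item[1], reverse=True)
    return scores[:k]


# Hypothetical 2-dimensional vectors, for illustration only.
toy_vectors = {
    "a.example": [1.0, 0.0],
    "b.example": [0.9, 0.1],
    "c.example": [0.0, 1.0],
}
nearest = top_k_similar("a.example", toy_vectors, k=2)
```

In this toy setup, `nearest` ranks `b.example` ahead of `c.example`, because its vector points in almost the same direction as that of `a.example`.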
Summary
Malicious web activity is still a major threat to the Internet. Nowadays, cybercriminals build malicious web infrastructures to support their crimes, which makes attacks complicated and diverse. Most detection systems detect malicious web pages by analyzing web content [5,11,16,24], identify malicious servers by building a reputation system for individual servers [3,4,9], or study popular techniques adversaries use to evade detection [18,25,30,32,33]. These works focus on a single server only, so they lack a panoramic view of attacks. Word2vec takes a text corpus as input and generates word vectors. It includes two learning models, Continuous Bag of Words (CBOW) and Skip-gram. Another significant advantage of word2vec is that words with similar meanings are mapped to nearby positions in the vector space.
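The difference between the two word2vec models lies in how training examples are formed from a token sequence. The sketch below builds those examples for both. Treating a sequence of server names as a "sentence" is an assumption made for illustration, since the summary does not describe how the corpus is constructed. The function names, window size, and server names are hypothetical.

```python
def skipgram_pairs(sequence, window=2):
    """Skip-gram: each center token predicts each token in its context window."""
    pairs = []
    for i, center in enumerate(sequence):
        lo = max(0, i - window)
        hi = min(len(sequence), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, sequence[j]))
    return pairs


def cbow_pairs(sequence, window=2):
    """CBOW: the whole context window jointly predicts the center token."""
    pairs = []
    for i, center in enumerate(sequence):
        lo = max(0, i - window)
        hi = min(len(sequence), i + window + 1)
        context = [sequence[j] for j in range(lo, hi) if j != i]
        if context:
            pairs.append((context, center))
    return pairs


# Hypothetical "sentence" of server names, for illustration only.
toy_sequence = ["s1.example", "s2.example", "s3.example"]
sg = skipgram_pairs(toy_sequence, window=1)
cb = cbow_pairs(toy_sequence, window=1)
```

Skip-gram yields one example per (center, neighbor) pair, whereas CBOW yields one example per position with all neighbors grouped as the input. In both cases, tokens that appear in similar contexts receive similar training signals, which is why they end up close together in the vector space.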