Abstract

Malware is at the root of most cyber-attacks, which cause billions of dollars in damage every year. Most malware, especially Advanced Persistent Threat (APT) malware, makes use of the Domain Name System (DNS) to control compromised machines and steal sensitive information. Several security products have therefore identified malware infections by combining machine learning with DNS data. However, existing detection approaches cannot identify malicious domain names and infected hosts simultaneously. To solve this problem, this work proposed a co-clustering-based detection approach that works without labeled data and integrates active DNS data with graph inference. First, a host-domain graph was generated from the active DNS data. Then, a subset of the domain nodes was labeled with the aid of a blacklist, a popular domain list, and the Alexa ranking. Finally, semi-supervised co-clustering was used to discover potentially malicious domains and malware-infected hosts in the monitored network. Experiments were conducted in a network of hundreds of internal hosts that accessed 145 malware domains. The results showed that the proposed approach identified malware domains with a true positive rate of up to 97.2%. This work also compared and analyzed the results obtained with different cluster calculation formulas under two different bipartite edge weights. The results showed that clustering with maximum and minimum edge weights was more tolerant of different distance calculation methods.
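The pipeline summarized in the abstract (build a host-domain graph, label some domain nodes from lists, then infer the rest) can be illustrated with a minimal Python sketch. This is not the paper's co-clustering algorithm: it swaps in a plain score-averaging propagation over the bipartite graph purely to show the data flow. All names, the toy DNS records, the 0.5 starting score for unlabeled domains, and the number of rounds are assumptions made for illustration.

```python
from collections import defaultdict


def build_host_domain_graph(dns_records):
    """Build both adjacency maps of the bipartite host-domain graph."""
    host_to_domains = defaultdict(set)
    domain_to_hosts = defaultdict(set)
    for host, domain in dns_records:
        host_to_domains[host].add(domain)
        domain_to_hosts[domain].add(host)
    return host_to_domains, domain_to_hosts


def seed_labels(domains, blacklist, popular):
    """Label known domains: 1.0 = malicious, 0.0 = benign; others stay unlabeled."""
    seeds = {}
    for domain in domains:
        if domain in blacklist:
            seeds[domain] = 1.0
        elif domain in popular:
            seeds[domain] = 0.0
    return seeds


def propagate(host_to_domains, domain_to_hosts, seeds, rounds=10):
    """Simple stand-in for graph inference (NOT the paper's co-clustering).

    Each host takes the mean score of the domains it queried; each unlabeled
    domain takes the mean score of the hosts that queried it. Seeds stay fixed.
    """
    domain_score = {d: seeds.get(d, 0.5) for d in domain_to_hosts}
    host_score = {}
    for _ in range(rounds):
        for host, domains in host_to_domains.items():
            host_score[host] = sum(domain_score[d] for d in domains) / len(domains)
        for domain, hosts in domain_to_hosts.items():
            if domain in seeds:
                continue  # keep list-derived labels unchanged
            domain_score[domain] = sum(host_score[h] for h in hosts) / len(hosts)
    return host_score, domain_score


# Hypothetical (host, queried domain) pairs extracted from DNS logs.
records = [
    ("h1", "evil.example"), ("h1", "unknown-bad.example"),
    ("h2", "evil.example"), ("h2", "unknown-bad.example"),
    ("h3", "good.example"), ("h3", "unknown-ok.example"),
    ("h4", "good.example"), ("h4", "unknown-ok.example"),
]
h2d, d2h = build_host_domain_graph(records)
seeds = seed_labels(d2h, blacklist={"evil.example"}, popular={"good.example"})
host_score, domain_score = propagate(h2d, d2h, seeds)
```

In this toy input, the unlabeled domain queried only by hosts that also contact the blacklisted domain drifts toward a malicious score, and the hosts themselves receive scores as well, which mirrors the abstract's goal of flagging domains and infected hosts together.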
