Abstract
With the rapid popularity of the Internet, a large amount of new malware is produced every day, while the traditional signature based malware detection algorithm is unable to detect such unseen malware. In recent years, many machine learning based algorithms have been proposed to detect new malware, and several of these algorithms are able to achieve quite good detection performance when supplied with plenty of training data. However, most of these algorithms just focus on how to improve the classification performance, while the robustness is not taken into consideration. This paper performs a detailed analysis on the robustness of four well-known machine learning based malware detection approaches, i.e. the DLL and API feature, the string feature, PE-Miner and the byte level N-Gram feature. We proposed two pretense approaches under which malware is able to pretend to be benign and bypass the detection algorithms. Experimental results show that the performances of these detection algorithms decline greatly under the pretense approaches. The lack of robustness makes these algorithms unable to be used in real world applications. In future works of machine learning based malware detection, researchers have to take the problem of robustness seriously.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.