Abstract

Malware authors modify, reuse, tweak, share, and maintain code and libraries. This results in malware derivation and polymorphism, producing millions of malware samples. Hence, there is a need for automatic identification, categorization, and classification of the various species and families of malware. Many machine learning techniques, such as decision trees, support vector machines, perceptron training, k-nearest neighbours, neural networks, linear regression, and logistic regression, have been applied directly to identify and categorize malware without manual intervention. However, applied directly, these techniques were not efficient. Hence, various authors have proposed new models that apply machine learning techniques to improve the efficiency of automatic malware detection and classification. Here, we review a few of these models. The models summarized include combinations of two or more machine learning techniques; combinations of classification and clustering; generation of malware instruction sets to create data sets for efficient processing of voluminous malware analysis reports; and application of phylogeny concepts to malware evolution, derivation, and detection. Phylogeny, in biology, is the study of evolution and the derivation of relationships among a set of species. It has been extended to the classification and detection of malware as well.
