Enhanced Profile Hidden Markov Model for Metamorphic Malware Detection

Allyza Maureen P Catura, Jonathan C Morano, Ken Carlo D Javier, Mark Christopher R Blanco

doi:10.38124/ijisrt/ijisrt24mar2052

Open Access

https://doi.org/10.38124/ijisrt/ijisrt24mar2052

Abstract

Metamorphic malware poses a significant threat to conventional signature-based malware detection because its signature is mutable: it can generate many copies of itself, each with a different signature. As such, signature-based detection is impractical and ineffective against it, and research in recent years has focused on applying machine learning-based approaches to malware detection. The Profile Hidden Markov Model is a probabilistic model that uses multiple sequence alignments and a position-based scoring system. An enhanced Profile Hidden Markov Model was constructed with the following modifications: n-gram analysis to determine the best n-gram length for the dataset, a frequency threshold to determine which n-gram opcodes are included in malware detection, and the addition of consensus sequences to the multiple sequence alignments. A total of 1,000 malware executable files and 40 benign executable files were used in the study. Results show that n-gram analysis and the addition of consensus sequences help increase malware detection accuracy. Moreover, setting the frequency threshold to the average TF-IDF of the n-gram opcodes yields higher accuracy for most malware families than selecting only the 36 most frequent n-grams, as done in previous studies.
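
The two n-gram selection strategies compared in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact TF-IDF formulation, so the sketch assumes term frequency normalized by document length and an inverse document frequency of ln(N/df) + 1. The function names (`ngrams`, `mean_tfidf`, `select_by_average`, `select_top_k`) are hypothetical, and each "document" is assumed to be the opcode sequence extracted from one executable.

```python
import math
from collections import Counter


def ngrams(opcodes, n):
    """Return the contiguous opcode n-grams of one sequence, as tuples."""
    return [tuple(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1)]


def mean_tfidf(docs, n):
    """Map each n-gram to its TF-IDF score averaged over all documents.

    Assumption: tf = count / total n-grams in the document,
    idf = ln(N / df) + 1. The paper may use a different variant.
    """
    per_doc = [Counter(ngrams(d, n)) for d in docs]

    # Document frequency: number of documents containing each n-gram.
    df = Counter()
    for counts in per_doc:
        df.update(counts.keys())

    totals = Counter()
    for counts in per_doc:
        length = sum(counts.values())
        if length == 0:
            continue  # sequence shorter than n: contributes nothing
        for gram, c in counts.items():
            idf = math.log(len(docs) / df[gram]) + 1.0
            totals[gram] += (c / length) * idf

    return {gram: s / len(docs) for gram, s in totals.items()}


def select_by_average(scores):
    """Keep n-grams whose score is at least the average score."""
    if not scores:
        return set()
    threshold = sum(scores.values()) / len(scores)
    return {gram for gram, s in scores.items() if s >= threshold}


def select_top_k(docs, n, k=36):
    """Baseline: keep the k most frequent n-grams across all documents."""
    counts = Counter()
    for d in docs:
        counts.update(ngrams(d, n))
    return {gram for gram, _ in counts.most_common(k)}
```

Calling `select_by_average(mean_tfidf(docs, n))` for several values of `n` and comparing the result with `select_top_k(docs, n, 36)` mirrors the comparison described above. The threshold adapts to the score distribution of each dataset, whereas the top-k baseline always keeps a fixed number of n-grams.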

Full Text