Abstract

The security and integrity of computer systems and networks depend heavily on malware detection. The K-Nearest Neighbors (KNN) algorithm is a popular and effective machine learning algorithm for this task. However, its performance depends strongly on the choice of distance metric. This study aims to improve malware detection by tuning the KNN algorithm's distance metric parameter. The distance metric determines how similarity or dissimilarity between instances in the feature space is measured, so choosing or modifying it carefully can make KNN-based malware detection more accurate and effective. This paper analyzes three distance metrics: Minkowski, Manhattan, and Euclidean. Each captures a different aspect of similarity between malware samples. The KNN algorithm is evaluated on the MalMem-2022 malware dataset, and the results are reported separately for each of the three distance metrics. The experimental findings show that, for binary classification, the Euclidean and Minkowski distance metrics produced the best results among the three. For multiclass classification, the KNN algorithm achieved its best results with the Manhattan distance.
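To make the role of the distance metric concrete, the following is a minimal pure-Python sketch, not the authors' implementation. It relies on the standard fact that Manhattan and Euclidean distance are the Minkowski distance of order p = 1 and p = 2. The abstract does not say which order was used for the Minkowski metric, so the p = 3 below is an assumption, and the function names, toy feature vectors, and labels are hypothetical rather than drawn from MalMem-2022.

```python
from collections import Counter


def minkowski(a, b, p):
    """Minkowski distance of order p between two equal-length vectors."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def manhattan(a, b):
    # Minkowski distance with p = 1.
    return minkowski(a, b, 1)


def euclidean(a, b):
    # Minkowski distance with p = 2.
    return minkowski(a, b, 2)


def minkowski_p3(a, b):
    # Assumed order p = 3, chosen only for illustration.
    return minkowski(a, b, 3)


def knn_predict(train_X, train_y, query, k, distance):
    """Majority vote among the k training points closest to the query."""
    ranked = sorted(zip(train_X, train_y),
                    key=lambda pair: distance(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical two-feature toy data, not taken from MalMem-2022.
train_X = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0),
           (5.0, 5.0), (6.0, 5.0), (5.0, 6.0)]
train_y = ["benign", "benign", "benign",
           "malware", "malware", "malware"]

# Classify the same query point under each metric.
for name, metric in [("manhattan", manhattan),
                     ("euclidean", euclidean),
                     ("minkowski_p3", minkowski_p3)]:
    print(name, knn_predict(train_X, train_y, (0.5, 0.5), 3, metric))
```

On well-separated toy data like this, all three metrics agree. The metric only changes a prediction when it changes which training points rank among the k nearest, which is more likely in high-dimensional feature spaces with overlapping classes.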
