Advanced Cybercrime Detection: A Comprehensive Study on Supervised and Unsupervised Machine Learning Approaches Using Real-world Datasets

Duc M Cao, Bishnu Padh Ghosh, Aslima Akter, Mamunur Rahman, Md Abu Sayed, Rejon Kumar Ray, Aqib Raihan, Md Tuhin Mia, Eftekhar Hossain Ayon

doi:10.32996/jcsts.2024.6.1.5

Abstract

In the ever-evolving field of cybersecurity, cybercrime is tackled with methods that combine supervised and unsupervised machine learning. Supervised tools include Support Vector Machines (SVM) and K-Nearest Neighbors (KNN), while well-known unsupervised methods include K-means clustering. These techniques are applied to the publicly available StatLine dataset from CBS, a large dataset that records the individual attributes of one thousand crime victims. Performance analysis comparing training and testing data shows that SVM reaches 91% accuracy in supervised classification. K-Nearest Neighbors (KNN) models perform well in the unsupervised arena, detecting criminal activity with 79.56% accuracy. The evaluation draws on True Positive (TP), True Negative (TN), False Positive (FP), and False Negative (FN) counts, together with False Alarm Rate (FAR), Detection Rate (DR), Accuracy (ACC), Recall, Precision, Specificity, Sensitivity, and Fowlkes–Mallows scores, to provide a comprehensive assessment.
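The abstract names its evaluation metrics without defining them. As a rough sketch, the snippet below derives them from the four confusion-matrix counts using the standard textbook definitions, which are assumed here and not taken from the paper. The function names `safe_div` and `confusion_metrics` and the example counts are hypothetical; the counts are not results from the study. Detection Rate is treated as equal to Recall and Sensitivity, which is the usual convention in intrusion detection.

```python
import math


def safe_div(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def confusion_metrics(tp, tn, fp, fn):
    """Compute common binary-classification metrics from confusion-matrix counts."""
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)  # also called sensitivity or detection rate
    return {
        "accuracy": safe_div(tp + tn, tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "specificity": safe_div(tn, tn + fp),
        "false_alarm_rate": safe_div(fp, fp + tn),  # equals 1 - specificity
        # Fowlkes-Mallows: geometric mean of precision and recall.
        "fowlkes_mallows": math.sqrt(precision * recall),
    }


# Hypothetical counts for illustration only; not results from the paper.
example = confusion_metrics(tp=40, tn=45, fp=5, fn=10)
for name, value in example.items():
    print(f"{name}: {value:.4f}")
```

For clustering output, the Fowlkes–Mallows score is usually computed over pairs of points and not over per-sample labels; the formula is the same, with TP, FP, and FN counted over pairs.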

Full Text