Abstract

Globalization affects the frequency and severity of traditional, localized patterns of crime and also creates conditions for new types of crime. The compression of the world has significantly changed the scope and makeup of traditional crime. This makes it difficult for law enforcement officials to examine crime reports in order to conduct investigations and take the necessary preventive measures. Clustering facilitates investigations by grouping crime reports into categories according to crime type. In the proposed work, crime reports are preprocessed and embedded with the BERT model to capture their contextual meaning. An undirected graph is constructed in which each crime report vector is a node, and an edge joins a pair of nodes only if their cosine similarity exceeds a threshold value. The graph is partitioned using node betweenness, edge connectivity, and two concepts introduced in the paper: the cut of a node and the safe node. Edge connectivity determines whether a graph needs further partitioning. If the edge connectivity is high, the graph is treated as a cluster; otherwise, it is partitioned at the node with the highest node betweenness, say v. During partitioning, the cut of v is computed to determine whether v is safe. If v is safe, the algorithm produces two overlapping subgraphs; if it is not, the algorithm produces a modified graph containing a replica of v. The algorithm is iterative and terminates when all subgraphs have high edge connectivity. Because the generated subgraphs overlap, a novel graph-theory-based fuzzification technique is used to measure the membership value of each node in the different subgraphs, which allows investigators to focus on the most serious crimes in the reports.
The proposed method is evaluated on crime report datasets as well as other text datasets and compared with several state-of-the-art methods using a variety of performance metrics, to show its effectiveness across domains.
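
The first two steps described above, building the similarity graph and choosing the node with the highest betweenness as the split candidate, can be illustrated with a minimal sketch. It is not the authors' implementation. Toy 2-D vectors stand in for BERT embeddings, the threshold of 0.75 is an arbitrary assumption, and the function names (`build_similarity_graph`, `node_betweenness`) are hypothetical. The edge-connectivity test, the cut of a node, the safe-node check, and the fuzzification step are omitted because the abstract does not define them.

```python
import math
from collections import deque
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def build_similarity_graph(vectors, threshold):
    """Undirected graph as an adjacency dict: edge iff similarity > threshold."""
    adj = {i: set() for i in range(len(vectors))}
    for i, j in combinations(range(len(vectors)), 2):
        if cosine_similarity(vectors[i], vectors[j]) > threshold:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def node_betweenness(adj):
    """Unnormalized betweenness centrality (Brandes' algorithm, unweighted)."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        order = []                      # nodes in non-decreasing distance from s
        preds = {v: [] for v in adj}    # predecessors on shortest paths from s
        sigma = {v: 0 for v in adj}     # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):       # accumulate dependencies back toward s
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each undirected shortest path was counted once from each endpoint.
    return {v: c / 2 for v, c in score.items()}


# Toy stand-ins for report embeddings: unit vectors at the given angles (degrees).
vectors = [(math.cos(math.radians(a)), math.sin(math.radians(a)))
           for a in (0, 10, 40, 70, 80)]
graph = build_similarity_graph(vectors, threshold=0.75)  # threshold is an assumption
scores = node_betweenness(graph)
candidate = max(scores, key=scores.get)  # node at which a split would be attempted
```

In this toy graph, reports 0 and 1 and reports 3 and 4 each form a tight group, and report 2 is similar to both groups, so it lies on every shortest path between them and becomes the split candidate.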
