Abstract

Cybercrime is criminal activity generally committed by cybercriminals or hackers. Criminal activity is growing rapidly all over the world, which motivates law enforcement agencies to analyze crimes systematically. In many cases, crime information is stored as unstructured online text reports, and a single report may describe several different criminal activities. Analyzing these reports to identify patterns and trends in crime, and to devise crime detection and prevention strategies, is a very challenging task. In this paper, the crime reports are preprocessed and relations among named entity pairs are extracted to give the reports a structured form. Each extracted relation is converted to an n-dimensional real-valued vector based on the Word2Vec model from Natural Language Processing. A novel agglomerative graph partitioning algorithm that uses various graph centrality measures is then applied to partition the extracted relations. All extracted relations of a report that fall in the same partition are replaced by the representative of that partition, so each report is described by a set of distinct relation types. Next, a graph is constructed for the set of reports in which each node corresponds to a relation describing a report, and an edge is drawn between two nodes only if the corresponding relations are of a similar type and belong to two different reports. The constructed graph is disconnected, and each of its connected components is a clique. These cliques can be identified in time linear in the number of edges of the graph, and each clique provides a cluster of reports. Because each report is described by a set of relations of different types, the resulting clusters overlap. The degree of membership of a report in a cluster is also determined in the paper.
The proposed method is evaluated experimentally and compared with several state-of-the-art partition-based and overlapping clustering algorithms to demonstrate its effectiveness on crime corpora.
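The report-graph step described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: it assumes each report has already been reduced to a set of relation-type labels, and the names `build_report_graph`, `connected_components`, `overlapping_clusters`, and `membership` are hypothetical. In particular, the abstract does not state how the degree of membership is computed, so the equal-share rule used in `membership` is only a placeholder assumption.

```python
from collections import defaultdict, deque
from itertools import combinations


def build_report_graph(report_relations):
    """Build the graph: one node per (report, relation type) pair.

    Two nodes are adjacent when they carry the same relation type
    and come from two different reports.
    """
    nodes = [(rep, rel)
             for rep, rels in sorted(report_relations.items())
             for rel in sorted(rels)]
    adjacency = {node: set() for node in nodes}
    by_type = defaultdict(list)
    for node in nodes:
        by_type[node[1]].append(node)
    # Each report holds a *set* of types, so nodes sharing a type
    # necessarily belong to different reports.
    for group in by_type.values():
        for a, b in combinations(group, 2):
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency


def connected_components(adjacency):
    """Breadth-first search; runs in time linear in nodes plus edges."""
    seen = set()
    components = []
    for start in adjacency:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        component = []
        while queue:
            node = queue.popleft()
            component.append(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        components.append(component)
    return components


def overlapping_clusters(report_relations):
    """Map each relation type to the set of reports in its component."""
    adjacency = build_report_graph(report_relations)
    clusters = {}
    for component in connected_components(adjacency):
        relation_type = component[0][1]
        clusters[relation_type] = {rep for rep, _ in component}
    return clusters


def membership(report_relations):
    """ASSUMPTION (placeholder, not from the paper): a report splits its
    membership equally among the clusters of its relation types."""
    return {rep: {rel: 1.0 / len(rels) for rel in rels}
            for rep, rels in report_relations.items() if rels}


if __name__ == "__main__":
    # Toy input: report identifiers and relation-type labels are invented.
    reports = {
        "r1": {"phishing", "fraud"},
        "r2": {"phishing"},
        "r3": {"fraud", "malware"},
    }
    for relation_type, members in overlapping_clusters(reports).items():
        print(relation_type, sorted(members))
    print(membership(reports)["r1"])
```

On the toy input, report `r1` lands in both the `phishing` and the `fraud` cluster, which shows how overlap arises from a report having several relation types. Because edges only join nodes of the same type, every connected component is a clique, so a single traversal is enough to recover the clusters.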

