Conventionally, the Internet is divided into three parts: the Surface Web, the Deep Web, and the Dark Web. Over the last two decades, illicit activity on the various platforms of the Dark Web has increased massively. Moreover, social networks on the Dark Web are implicated in the wide-scale dissemination of extremism. In this paper, we propose an approach that uses Data Mining techniques to generate textual patterns from discussions on Dark Web terrorist forums. The discovered patterns help identify influential members and extract critical topics. We describe the system modules, implemented with the RapidMiner tool, that perform data preprocessing, text preprocessing with TF-IDF weighting, outlier detection, clustering evaluation, clustering, and clustering validation. We apply K-Means as the clustering method with different distance metrics, evaluate the clustering process using the Elbow and Silhouette methods, and validate it using the Davies-Bouldin Index. Furthermore, we investigate how altering the distance metric used for outlier detection affects the clustering results.
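To make the core of the pipeline concrete, the following is a minimal Python sketch of three of the named steps: TF-IDF weighting, K-Means clustering, and validation with the Davies-Bouldin Index. It is an illustration only and not the RapidMiner implementation described above. The toy posts, the function names, the smoothed IDF formula, the farthest-point initialisation, and the choice of Euclidean distance with k = 2 are all assumptions made for the example. Outlier detection and the Elbow and Silhouette evaluations are omitted.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return L2-normalised TF-IDF vectors (dense lists) and the vocabulary."""
    tokenised = [d.lower().split() for d in docs]
    vocab = sorted({t for doc in tokenised for t in doc})
    n = len(docs)
    # Document frequency: number of posts in which each term occurs.
    df = Counter(t for doc in tokenised for t in set(doc))
    # Assumed IDF variant: log(n / df) + 1, so no term gets zero weight.
    idf = {t: math.log(n / df[t]) + 1.0 for t in vocab}
    vectors = []
    for doc in tokenised:
        tf = Counter(doc)
        vec = [tf[t] * idf[t] for t in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vectors, vocab


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points, k, distance=euclidean, iterations=50):
    """Lloyd's algorithm with a deterministic farthest-point initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next seed: the point farthest from its nearest existing seed.
        centroids.append(
            max(points, key=lambda p: min(distance(p, c) for c in centroids))
        )
    labels = None
    for _ in range(iterations):
        new_labels = [
            min(range(k), key=lambda j: distance(p, centroids[j])) for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster empties
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def davies_bouldin(points, labels, centroids, distance=euclidean):
    """Davies-Bouldin Index: lower values indicate better-separated clusters."""
    k = len(centroids)
    scatter = []
    for j in range(k):
        members = [p for p, lab in zip(points, labels) if lab == j]
        # Mean distance of cluster members to their centroid (0 if empty).
        scatter.append(
            sum(distance(p, centroids[j]) for p in members) / len(members)
            if members else 0.0
        )
    total = 0.0
    for i in range(k):
        total += max(
            (scatter[i] + scatter[j]) / distance(centroids[i], centroids[j])
            for j in range(k) if j != i
        )
    return total / k


# Made-up placeholder posts; these are NOT taken from the paper's corpus.
docs = [
    "server network password leak",
    "network password breach server",
    "password leak on the server network",
    "market price vendor shipping",
    "vendor shipping price market escrow",
    "escrow vendor market price",
]

vectors, vocab = tfidf(docs)
labels, centroids = kmeans(vectors, k=2)
db = davies_bouldin(vectors, labels, centroids)
print("cluster labels:", labels)
print("Davies-Bouldin index:", round(db, 3))
```

Passing a different function as the `distance` argument is one way to mimic the comparison of distance metrics mentioned above. Repeating the run over a range of `k` values and comparing the resulting index values is a simple stand-in for the evaluation step.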