Edge-Based Detection and Classification of Malicious Contents in Tor Darknet Using Machine Learning

Runchuan Li, Jiawei Yang, Shuhong Chen, Entao Luo

doi:10.1155/2021/8072779

Open Access

https://doi.org/10.1155/2021/8072779

Abstract

As the volume of data in the network grows, the load on servers and communication links becomes increasingly heavy. Edge computing can alleviate this problem. Because the Darknet hosts a large amount of malicious content, combining edge computing with content detection and analysis has high research value. This paper therefore presents an intelligent classification system, based on machine learning and Scrapy, that can quickly detect malicious content and judge the category of the service that hosts it. Because Tor Darknet domain names are undisclosed and short-lived, obtaining uniform resource locators (URLs) and network resources is challenging. In this paper, we focus on a network based on The Onion Router (Tor) anonymous communication system. We designed a crawler program to obtain the contents of the Tor network and labeled them into six classes. We also constructed a dataset that contains URLs, categories, and keywords. Edge computing is used to judge the category of websites. The accuracy of the classifier based on a machine learning algorithm is as high as 89%. The classifier will be used in an operational system that can help researchers quickly obtain malicious content and categorize hidden services.

Highlights

Introduction: The Darknet has a huge amount of data
In Tor Darknet, a complete domain name has the format “[digest].onion” and is made up of two parts: the [digest], a random string of digits and English letters, and the uniform suffix “.onion” used by Tor links; for example, jsaljfslj4sfd5ad.onion
A search for sites with the suffix “.onion” does not show any results. Therefore, in order to classify the contents of Tor Darknet, domain names need to be obtained in various ways
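
The “[digest].onion” format described above can be illustrated with a small extraction routine. This is a minimal sketch, not the crawler from the paper: the function name `extract_onion_domains` is hypothetical, and the pattern assumes (beyond what the source states) that the digest uses the base32 alphabet a-z and 2-7 and is either 16 or 56 characters long.

```python
import re
from typing import List

# Assumption (not from the source): the digest is base32 (a-z, 2-7) and has
# 16 characters (older v2 addresses) or 56 characters (v3 addresses).
ONION_RE = re.compile(r"\b((?:[a-z2-7]{56}|[a-z2-7]{16})\.onion)\b")


def extract_onion_domains(text: str) -> List[str]:
    """Return unique .onion hostnames found in text, in order of first appearance."""
    seen = {}
    for match in ONION_RE.finditer(text.lower()):
        # dict keys keep insertion order, which gives ordered de-duplication
        seen.setdefault(match.group(1), None)
    return list(seen)


# The sample reuses the example domain quoted in the highlights.
sample = "Mirror: http://jsaljfslj4sfd5ad.onion/index.html and JSALJFSLJ4SFD5AD.onion"
print(extract_onion_domains(sample))
```

A routine like this could be run over pages that list hidden services to collect candidate domain names, since the domains cannot be found by searching for the suffix alone.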

Summary

Proposed Model for Tor Darknet Resource Detection in Edge Computing

Therefore, web page content should be detected at edge devices, and the original data should be processed into distinctive words that best describe the website category. In Tor Darknet, a complete domain name has the format “[digest].onion” and is made up of two parts: the [digest], a random string of digits and English letters, and the uniform suffix “.onion” used by Tor links; for example, jsaljfslj4sfd5ad.onion. A search for sites with the suffix “.onion” does not show any results. Therefore, in order to classify the contents of Tor Darknet, domain names need to be obtained in various ways. After the text content of each web page is cleaned, the corpus is integrated into a dataset containing URLs, categories, and keywords. A machine learning algorithm, k-nearest neighbors (KNN), is applied to these samples to train a classifier in subsequent experiments.
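
The cleaning and KNN steps summarized above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the summary does not say how keywords are weighted or which distance is used, so the example assumes raw term counts and cosine similarity. The stop-word list, the category names, and the training rows are all made up for the example.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list; a real system would use a much fuller one.
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "for", "with", "on", "is"}


def keywords(text):
    """Lowercase the text, keep alphabetic tokens, drop stop words, count the rest."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(t for t in tokens if t not in STOP_WORDS)


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def knn_predict(train, text, k=3):
    """Majority vote among the k training pages most similar to the given text."""
    query = keywords(text)
    ranked = sorted(train, key=lambda row: cosine(query, row[1]), reverse=True)
    votes = Counter(label for label, _ in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy, made-up rows (category, page text); not data or class names from the paper.
raw = [
    ("drugs", "buy cannabis pills and powder with fast shipping"),
    ("drugs", "cannabis seeds and pills shop"),
    ("drugs", "powder pills discreet shipping"),
    ("hacking", "exploit kits and malware for hire"),
    ("hacking", "botnet malware rental and exploit service"),
    ("hacking", "hire a hacker exploit accounts"),
]
train = [(label, keywords(text)) for label, text in raw]

print(knn_predict(train, "cheap pills and cannabis shipping"))
print(knn_predict(train, "malware exploit for sale"))
```

KNN needs no separate training phase beyond storing the labeled keyword vectors, which may suit resource-limited edge devices. Each prediction still compares the query against every stored sample.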

Journal: Mobile Information Systems	Publication Date: Nov 22, 2021
Citations: 3	License type: CC BY 4.0
