Abstract
Although the anonymous communication network Tor protects users' data and privacy while they browse the Internet, it also makes it easier for malicious users to access illegal websites. Website fingerprinting attacks identify the websites that users visit and can thereby reveal whether they are engaged in illegal activity. Existing methods typically extract features manually from the traffic generated by website visits and train machine learning or deep learning models to classify those features. While these methods can classify unknown website traffic effectively, their performance against defensive measures or in onion service scenarios is not yet ideal. This paper proposes frequency domain fingerprinting of network traffic (FDF), a method for identifying the websites that Tor users visit. We extract the direction and length features of circuit sequences in access traffic, combine them, and transform them into the frequency domain. The access traffic is then classified by a deep learning model that combines CNN, FC, and Self-Attention components. We validate FDF experimentally in common Tor network scenarios. The results show that FDF outperforms existing classification methods across different Tor scenarios. It achieves 98.8% and 94.3% classification accuracy in the undefended and WTF-PAD defense scenarios, respectively. In the onion service scenario, it improves accuracy by 4.7% over the current state-of-the-art Tik-Tok method.
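The feature step described in the abstract (combine direction and length, then move to the frequency domain) can be illustrated with a minimal sketch. This is not the paper's implementation: the signed-length encoding, the zero-padding to a fixed size, the use of a plain DFT magnitude spectrum, and the names `signed_sequence` and `dft_magnitudes` are all assumptions made for illustration.

```python
import cmath

def signed_sequence(packets):
    """Combine direction (+1 outgoing, -1 incoming) and length into one signed value per packet.

    Assumption: the paper's exact way of combining the two features is not given here.
    """
    return [direction * length for direction, length in packets]

def dft_magnitudes(signal, size):
    """Zero-pad or truncate `signal` to `size` samples and return its DFT magnitude spectrum."""
    x = (list(signal) + [0.0] * size)[:size]
    spectrum = []
    for k in range(size):
        # Direct O(size^2) discrete Fourier transform; fine for a short illustration.
        coeff = sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / size) for n in range(size))
        spectrum.append(abs(coeff))
    return spectrum

# Hypothetical trace: (direction, length in bytes) per packet.
trace = [(1, 512), (-1, 512), (-1, 512), (1, 512)]
features = dft_magnitudes(signed_sequence(trace), size=8)
```

In this sketch, `features` is a fixed-length vector that could be fed to a classifier. A real pipeline would likely use an FFT library and much longer sequences.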
Highlights
As people's awareness of protecting personal privacy continues to increase, more and more users are beginning to use anonymous communication systems to interact with the outside world.
In a website fingerprint (WF) attack, the enforcer intercepts network traffic and extracts the features of the traffic packets in an encrypted connection between the monitored user and the entry node of Tor. The classifier determines whether the intercepted traffic is associated with the website of interest to the enforcer, and if the traffic matches the classifier, it indicates that the monitored user is visiting the website of interest to the enforcer. The WF attack allows the enforcer to determine whether the monitored user is browsing illegal
(3) We evaluated frequency domain fingerprinting of network traffic (FDF) in a more realistic open-world scenario, for which we collected a dataset containing 40,000 unmonitored websites. FDF achieved more desirable Precision and Recall in both undefended and WTF-PAD [21] environments, indicating that it is effective in real environments.
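For readers unfamiliar with the open-world metrics mentioned above, the following minimal sketch shows how Precision and Recall are commonly computed when each trace is labeled as monitored (`True`) or unmonitored (`False`). The function name `precision_recall` and the toy labels are hypothetical and are not taken from the paper's evaluation.

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for a binary monitored/unmonitored decision.

    y_true and y_pred are sequences of booleans: True means "monitored website".
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)        # monitored, flagged
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)    # unmonitored, flagged
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)    # monitored, missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Toy example: 3 monitored traces and 5 unmonitored traces.
y_true = [True, True, True, False, False, False, False, False]
y_pred = [True, True, False, True, False, False, False, False]
precision, recall = precision_recall(y_true, y_pred)
```

Precision matters in the open world because unmonitored traffic vastly outnumbers monitored traffic, so even a small false-positive rate can swamp the true detections.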
Summary
As people's awareness of protecting personal privacy continues to increase, more and more users are beginning to use anonymous communication systems to interact with the outside world. Each website a user visits generates distinct network traffic features, such as a different number of data packets and different traffic burst patterns. The way network traffic is processed and the choice of classifier have a significant impact on the efficiency of a WF attack. The method employs a K-Nearest Neighbor (K-NN) classifier that uses a combination of features to evaluate the similarity between different websites through a distance metric. In 2011, Panchenko et al. [6] used the Herrmann et al. dataset and employed an SVM classifier to classify Tor network traffic by various features such as packet traffic and time. This method is not effective in reducing the impact of noise on WF attacks. In 2016, Hayes et al. [9] proposed K-FP, a website fingerprinting attack based on random decision forests.
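The idea behind the K-NN classifier mentioned above, comparing traces through a distance metric over their features, can be sketched as follows. This is a plain, unweighted Euclidean variant written for illustration only. The feature vectors, the labels `site_a` and `site_b`, and the function `knn_predict` are hypothetical, and the prior work summarized here may use a different, more elaborate distance.

```python
from collections import Counter
import math

def knn_predict(train, query, k=3):
    """Label `query` by majority vote among its k nearest training vectors.

    `train` is a list of (feature_vector, label) pairs; distance is Euclidean.
    """
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Hypothetical two-dimensional features, e.g. (packet count, burst count).
train = [
    ([10, 2], "site_a"),
    ([11, 3], "site_a"),
    ([12, 2], "site_a"),
    ([50, 40], "site_b"),
    ([52, 41], "site_b"),
]
prediction = knn_predict(train, [49, 39], k=3)
```

Here the two closest training vectors belong to `site_b`, so the majority vote among the three nearest neighbours selects that label.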