GL‐WF: A lightweight graph learning model for website fingerprinting attacks

Abstract

Website fingerprinting attacks that analyze clients' browsing preferences are an important information source for maintaining cybersecurity and improving big data utilization in terms of service quality. Many scholars have studied graph learning because of its powerful learning and inferential capabilities on unstructured network traffic. However, the extensive computing resource requirements of graph neural networks limit their application to large‐scale and automated scenarios. In this paper, we propose a lightweight graph learning method for website fingerprinting attacks called GL‐WF. We designed a four‐level heterogeneous spatiotemporal graph using a multimodal representation of website browsing traffic. We then utilized a graph sampling algorithm to refine the structure of the graph. Finally, a well‐designed graph isomorphism network was used to extract the graph topology for the traffic classifiers. The proposed GL‐WF model, which was evaluated on over 3.6 million website flows, was found to provide high accuracy and efficiency.
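The abstract names a graph isomorphism network (GIN) as the final stage but gives no architectural details. As a rough illustration only, the sketch below applies the standard GIN update, h_v' = MLP((1 + eps) * h_v + sum of neighbour features), to a toy graph. The MLP is simplified here to a single linear map followed by ReLU, and the graph, feature values, weight matrix, and function names are all hypothetical. It does not reproduce GL‐WF's four‐level heterogeneous graph or its sampling step.

```python
def gin_layer(features, neighbors, weight, eps=0.0):
    """One GIN-style update per node.

    features:  list of node feature vectors (lists of floats)
    neighbors: neighbors[v] is the list of node indices adjacent to v
    weight:    output_dim x input_dim matrix (stand-in for the GIN MLP)
    """
    updated = []
    for v, h_v in enumerate(features):
        # (1 + eps) * h_v plus the unweighted sum of neighbour features
        agg = [(1.0 + eps) * x for x in h_v]
        for u in neighbors[v]:
            agg = [a + x for a, x in zip(agg, features[u])]
        # Single linear map followed by ReLU (simplification of the MLP)
        out = [max(0.0, sum(w * a for w, a in zip(row, agg))) for row in weight]
        updated.append(out)
    return updated


def graph_readout(features):
    """Sum-pool node embeddings into one graph-level vector."""
    return [sum(column) for column in zip(*features)]


# Toy path graph 0 - 1 - 2 with 2-dimensional node features (made-up values).
toy_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_neighbors = [[1], [0, 2], [1]]
identity = [[1.0, 0.0], [0.0, 1.0]]

node_embeddings = gin_layer(toy_features, toy_neighbors, identity)
graph_embedding = graph_readout(node_embeddings)
```

In a classifier, a vector like `graph_embedding` would be passed to a final classification layer. Sum aggregation is what distinguishes GIN from mean- or max-based aggregators, since it preserves multiset information about a node's neighbourhood.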

Similar Papers
  • Research Article
  • Citations: 6
  • 10.1109/access.2023.3253559
A Survey on Deep Learning for Website Fingerprinting Attacks and Defenses
  • Jan 1, 2023
  • IEEE Access
  • Peidong Liu + 2 more

Attacks on, and defenses of, information about which web pages users visit are important research subjects in the field of privacy-enhancing technologies; they are termed website fingerprinting (WF) attacks and defenses. Nowadays, deep learning is an important tool in many research areas, including WF attacks and defenses. In this paper, we offer a comprehensive survey on deep learning for WF attacks and defenses. After a brief introduction, we first summarize deep learning, WF attacks, and WF defenses. For deep learning, we review the common paradigms, architectures, and performance metrics. For WF attacks, we review the approaches, challenges, and solutions. The approaches include deep learning, traditional machine learning, and other methods. Challenges and solutions cover multi-tab browsing, concept drift, and the base rate fallacy. For WF defenses, we review the strategies and approaches. Then, we survey deep learning for WF attacks and deep learning for WF defenses. For WF attacks, we survey in detail the deep learning paradigms, the architectures of WF attack models, and the performance of several representative WF attack models, and discuss future directions. For WF defenses, we survey the architecture, efficacy, and overhead of deep learning models, and discuss future directions. Finally, we summarize the paper.

  • Book Chapter
  • Citations: 1
  • 10.1007/978-981-19-5209-8_3
Effective and Lightweight Defenses Against Website Fingerprinting on Encrypted Traffic
  • Jan 1, 2022
  • Chengpu Jiang + 2 more

Website fingerprinting (WF) attacks, which eavesdrop on users' web browsing activity by analyzing the observed traffic, can endanger users' data security even when the users have deployed encrypted proxies such as Tor. Several WF defenses have been proposed to counter passive WF attacks. However, the existing defense methods have significant drawbacks in terms of effectiveness and overhead, which means that these defenses are rarely applied in the real world. The performance of the existing methods depends greatly on the number of dummy packets added, which increases overhead and hampers the user's browsing experience. Inspired by the feature extraction used by current deep learning-based WF attacks, in this paper we propose TED, a lightweight WF defense method that effectively decreases the accuracy of current WF attacks. We apply the idea of adversarial examples, aiming to disturb the accuracy of deep learning-based WF attacks by precisely inserting a few dummy packets. The defense extracts the key features of similar websites through a feature extraction network with an adapted Grad-CAM and applies these features to interfere with the WF attacks. The key features of traces are used to generate defense fractions that are inserted into the targeted trace to deceive WF classifiers. The experiments are carried out on public datasets from DF. Compared with several WF defenses, the experiments show that TED can efficiently reduce the effectiveness of WF attacks at low cost, reducing the accuracy by nearly 40% with less than 30% overhead.

Keywords: Encrypted traffic, Website fingerprinting, Privacy

  • Conference Article
  • Citations: 172
  • 10.1145/3319535.3354217
Triplet Fingerprinting: More Practical and Portable Website Fingerprinting with N-shot Learning
  • Nov 6, 2019
  • Payap Sirinam + 3 more

Website Fingerprinting (WF) attacks pose a serious threat to users' online privacy, including for users of the Tor anonymity system. By exploiting recent advances in deep learning, WF attacks like Deep Fingerprinting (DF) have reached up to 98% accuracy. The DF attack, however, requires large amounts of training data that must be updated regularly, making it less practical for the weaker attacker model typically assumed in WF. Moreover, research on WF attacks has been criticized for not demonstrating attack effectiveness under more realistic and more challenging scenarios. Most research on WF attacks assumes that the testing and training data have similar distributions and are collected from the same type of network at about the same time. In this paper, we examine how an attacker could leverage N-shot learning (a machine learning technique requiring just a few training samples to identify a given class) to reduce the effort of gathering and training with a large WF dataset as well as mitigate the adverse effects of dealing with different network conditions. In particular, we propose a new WF attack called Triplet Fingerprinting (TF) that uses triplet networks for N-shot learning. We evaluate this attack in challenging settings such as where the training and testing data are collected multiple years apart on different networks, and we find that the TF attack remains effective in such settings with 85% accuracy or better. We also show that the TF attack is effective in the open world and outperforms traditional transfer learning. On top of that, the attack requires only five examples to recognize a website, making it dangerous in a wide variety of scenarios where gathering and training on a complete dataset would be impractical.
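Triplet networks are typically trained with a triplet loss, which pulls an anchor embedding towards a same-class (positive) example and pushes it away from a different-class (negative) example by at least a margin. The sketch below shows that generic loss and a 1-nearest-neighbour lookup over a handful of labelled embeddings. It is an illustration under assumptions, not the TF attack itself: the Euclidean distance, the margin value, the nearest-neighbour classifier, and all names are chosen here for illustration, and the embedding network is omitted.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Generic triplet loss: max(0, d(a, p) - d(a, n) + margin)."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def classify_n_shot(query, support):
    """Label a query embedding by its nearest labelled support embedding.

    support maps each label to a short list of example embeddings
    (the "N shots" for that class).
    """
    best_label, best_distance = None, float("inf")
    for label, examples in support.items():
        for example in examples:
            distance = euclidean(query, example)
            if distance < best_distance:
                best_label, best_distance = label, distance
    return best_label


# Hypothetical 2-D embeddings for two sites, two shots each.
support_set = {
    "site_a": [[0.0, 0.0], [0.1, 0.2]],
    "site_b": [[3.0, 4.0], [3.2, 3.9]],
}
predicted = classify_n_shot([0.2, 0.1], support_set)
```

The loss is zero once the negative is farther from the anchor than the positive by at least the margin, so training effort concentrates on triplets that still violate that constraint.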

  • Dissertation
  • 10.20381/ruor-20806
Towards an Evaluation of a Recommended Tor Browser Configuration in Light of Website Fingerprinting Attacks
  • Jan 1, 2017
  • Fayzah Alshammari

Website Fingerprinting (WF) attacks have become an area of concern for advocates of web Privacy Enhancing Technologies (PETs), as they may allow a passive, local eavesdropper to eventually identify the accessed web page, endangering the protection offered by those PETs. Recent studies have demonstrated the effectiveness of those attacks through a number of experiments. However, some researchers in academia and the Tor community have argued that the assumptions of WF attack studies greatly simplify the problem and do not reflect how this vulnerability would play out in practical scenarios. This has led to suspicion in the Tor community and among Tor Browser users about the efficacy of those attacks in real-world scenarios. In this thesis, we survey the WF literature, showing the research assumptions that have been made in WF attacks against Tor. We then assess their practicality in real-world settings by evaluating their compliance with the Tor Browser threat model, its design requirements, and the Tor Project recommendations. Interestingly, we found one of the research assumptions, related to the active content configuration in Tor Browser, to be reasonable in all settings. Disabling and enabling active content are both reasonable, given that the enabled configuration is the Tor Browser default and the disabled one is the configuration recommended by the Tor Project for users who require the highest possible security and anonymity. However, given the currently published WF attacks, disabling active content is advantageous for the attacker, as it makes the classification task easier by reducing the level of web page randomness. To evaluate Tor Browser security in our proposed, more realistic threat model, we collect a sample of censored dynamic web pages with Tor Browser in the default setting, which enables active content such as JavaScript, and in the setting recommended by the Tor Project, which disables active content.
We use the Panchenko Support Vector Machine (SVM) classifier to study the identifiability of this sample of web pages. For pages that are very dynamic, we achieve a recognition rate of 42% when JavaScript is disabled, compared to 35% when it is turned on. Our results show that the recommended "more secure" setting for Tor Browser is actually more vulnerable to WF attacks than the default, non-recommended setting.

  • Conference Article
  • Citations: 1
  • 10.1109/trustcom53373.2021.00111
Attack versus Attack: Toward Adversarial Example Defend Website Fingerprinting Attack
  • Oct 1, 2021
  • Chengshang Hou + 3 more

The Website Fingerprinting (WF) attack is a side-channel attack against encrypted tunnels that infers the network activities of the tunnels' users. WF attacks have been successfully applied to the Tor network, which poses a huge threat to the privacy of Tor users. Many countermeasures have therefore been proposed to defend against such attacks. However, the newest attack, which leverages deep learning, has successfully undermined the existing defenses. In this paper, we propose a defense named Attack to Attack (A2A) that leverages adversarial examples to attack the attacker's classifier. A2A treats the website fingerprinting model as a black box. To find effective adversarial examples for the attacker's model, A2A manipulates traffic iteratively according to the output of a substitute model, an elaborate model intentionally trained to learn a classification boundary similar to that of the attacker's model. We evaluate the effectiveness of A2A on a public Tor traffic dataset and the newest WF attack. The experimental results show that the proposed method provides effective defense with a bandwidth overhead of 2.2%, which significantly outperforms manually designed defenses (which typically have a bandwidth overhead of 31%).

  • Research Article
  • Citations: 1
  • 10.1016/j.jnca.2024.104023
SSBM: A spatially separated boxes-based multi-tab website fingerprinting model
  • Sep 12, 2024
  • Journal of Network and Computer Applications
  • Xueshu Hong + 4 more

  • Research Article
  • Citations: 6
  • 10.56553/popets-2023-0125
Data-Explainable Website Fingerprinting with Network Simulation
  • Oct 1, 2023
  • Proceedings on Privacy Enhancing Technologies
  • Rob Jansen + 1 more

Website fingerprinting (WF) attacks allow an adversary to associate a website with the encrypted traffic patterns produced when accessing it, thus threatening to destroy the client-server unlinkability promised by anonymous communication networks. Explainable WF is an open problem in which we need to improve our understanding of (1) the machine learning models used to conduct WF attacks; and (2) the WF datasets used as inputs to those models. This paper focuses on explainable datasets; that is, we develop an alternative to the standard practice of gathering low-quality WF datasets using synthetic browsers in large networks without controlling for natural network variability. In particular, we demonstrate how network simulation can be used to produce explainable WF datasets by leveraging the simulator's high degree of control over network operation. Through a detailed investigation of the effect of network variability on WF performance, we find that: (1) training and testing WF attacks in networks with distinct levels of congestion increases the false-positive rate by as much as 200%; (2) augmenting the WF attacks by training them across several networks with varying degrees of congestion decreases the false-positive rate by as much as 83%; and (3) WF classifiers trained on completely simulated data can achieve greater than 80% accuracy when applied to the real world.

  • Conference Article
  • Citations: 2
  • 10.1109/ccis53392.2021.9754529
Website Fingerprinting Attack Through Persistent Attack of Student
  • Nov 7, 2021
  • Zhenyan Zhu + 5 more

Illegal users usually use Tor to hide their malicious behavior when browsing websites. Website fingerprinting (WF) attacks can help local network administrators prevent illegal behavior by anonymous users. Although much research has improved website fingerprinting attacks, existing attacks still cannot address the concept drift problem effectively. In this paper, we propose a novel WF attack framework, Persistent Attack of Student (PAS), which integrates a self-training mechanism with advanced deep learning (DL) based WF attacks. PAS trains a new DL model on a concept drift dataset with pseudo labels to alleviate the concept drift issue. In addition, we present a new deep convolutional neural network (DCNN) attack with stable accuracy that uses automatic and local feature extraction. We then evaluate PAS with different advanced deep learning WF attacks to assess how well it alleviates concept drift. The experimental results show that the DCNN attack achieves 96.50%-98.88% accuracy at 0.7-0.8x the time cost of the DF attack in a closed world of 95-900 monitored websites, and reaches 96.32% precision and 96.31% recall in an open world of 400,000 unmonitored websites. The PAS attack framework with different deep learning methods achieves 87.56%-91.46% on a 56-day concept drift dataset for 200 monitored websites, which is 2.27%-2.36% better than each original deep learning attack. The experimental results demonstrate that the PAS framework can help alleviate the concept drift issue effectively and that DCNN can perform WF attacks at a lower time cost.

  • Research Article
  • Citations: 7
  • 10.1016/j.comcom.2022.06.028
A Website Fingerprint defense technology with low delay and controllable bandwidth
  • Jun 23, 2022
  • Computer Communications
  • Xueshu Hong + 4 more

  • Conference Article
  • Citations: 5
  • 10.1109/globecom42002.2020.9322307
PST: A More Practical Adversarial Learning-Based Defense Against Website Fingerprinting
  • Dec 1, 2020
  • Minghao Jiang + 5 more

To prevent serious privacy leakage from website fingerprinting (WF) attacks, many traditional or adversarial WF defenses have been released. However, traditional WF defenses such as Walkie-Talkie (W-T) are not effective, because they still generate patterns that deep learning (DL) based WF attacks can capture. Adversarial perturbation-based WF defenses confuse WF attacks better, but they are not practical: they require the entire original traffic trace and must be able to perturb any point in the network traffic, including historical packets or cells. To deal with the effectiveness and practicality issues of existing defenses, we propose a novel WF defense in this paper, called PST. Given a few past bursts of a trace as input, PST Predicts subsequent fuzzy bursts with a neural network, then Searches for small but effective adversarial perturbation directions based on observed and predicted bursts, and finally Transfers the perturbation directions to the remaining bursts. Our experimental results on a public closed-world dataset demonstrate that PST can successfully break the network traffic pattern and achieve a high evasion rate of 87.6%, beating W-T by more than 31.59% at the same bandwidth overhead, while observing only 10 transferred bursts. Moreover, our defense adapts dynamically to WF attacks that may be retrained or updated.

  • Research Article
  • Citations: 3
  • 10.56553/popets-2023-0047
DeepSE-WF: Unified Security Estimation for Website Fingerprinting Defenses
  • Apr 1, 2023
  • Proceedings on Privacy Enhancing Technologies
  • Alexander Veicht + 2 more

Website fingerprinting (WF) attacks, usually conducted with the help of a machine learning-based classifier, enable a network eavesdropper to pinpoint which website a user is accessing through the inspection of traffic patterns. These attacks have been shown to succeed even when users browse the Internet through encrypted tunnels, e.g., through Tor or VPNs. To assess the security of new defenses against WF attacks, recent works have proposed feature-dependent theoretical frameworks that estimate the Bayes error of an adversary's features set or the mutual information leaked by manually-crafted features. Unfortunately, as WF attacks increasingly rely on deep learning and latent feature spaces, our experiments show that security estimations based on simpler (and less informative) manually-crafted features can no longer be trusted to assess the potential success of a WF adversary in defeating such defenses. In this work, we propose DeepSE-WF, a novel WF security estimation framework that leverages specialized kNN-based estimators to produce Bayes error and mutual information estimates from learned latent feature spaces, thus bridging the gap between current WF attacks and security estimation methods. Our evaluation reveals that DeepSE-WF produces tighter security estimates than previous frameworks, reducing the required computational resources to output security estimations by one order of magnitude.
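The abstract does not spell out how DeepSE-WF's specialized kNN-based estimators work. For background only, the sketch below shows the classical Cover-Hart relationship that links a nearest-neighbour classifier's error to the Bayes error: for M classes, the asymptotic 1-NN error R satisfies R* <= R <= R* (2 - M R* / (M - 1)), which can be inverted to give a lower bound on the Bayes error R*. The function name is hypothetical, and a measured finite-sample 1-NN error only approximates the asymptotic quantity, so the result should be read as an estimate rather than a guarantee.

```python
import math


def bayes_error_lower_bound(nn_error, num_classes):
    """Invert the Cover-Hart bound to lower-bound the Bayes error.

    nn_error:    (estimated) 1-nearest-neighbour error rate, in [0, 1]
    num_classes: number of classes M, at least 2
    """
    if num_classes < 2:
        raise ValueError("num_classes must be at least 2")
    if not 0.0 <= nn_error <= 1.0:
        raise ValueError("nn_error must lie in [0, 1]")
    m = num_classes
    # Clamp at zero so errors above the chance level (M - 1) / M stay valid.
    inner = max(0.0, 1.0 - (m / (m - 1)) * nn_error)
    return ((m - 1) / m) * (1.0 - math.sqrt(inner))


# Example: a two-class problem where 1-NN misclassifies 18% of samples.
bound = bayes_error_lower_bound(0.18, 2)
```

A higher lower bound means that no classifier using the same features can be very accurate, which is the sense in which such estimates are used to reason about a defense's security.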

  • Conference Article
  • Citations: 28
  • 10.1109/cscwd.2015.7230964
A novel Website Fingerprinting attack against multi-tab browsing behavior
  • May 1, 2015
  • Xiaodan Gu + 2 more

Website Fingerprinting (WF) attacks pose a serious threat to users' privacy, allowing an adversary to infer anonymous communication content through traffic analysis. Recent studies have demonstrated the effectiveness of WF attacks through a large number of experiments. However, some researchers believe that the assumptions of WF attacks vastly simplify the problem and are critical in practical scenarios. In this paper, we assess the threat model of WF and relax the assumptions about browsing behavior to improve practical feasibility. To deal with the multi-tab browsing scenario, we propose a novel WF attack that identifies each web page separately. The main idea rests on the fact that, because of think time, the user visits the second page a short delay after opening the first page. We analyze the anonymous traffic transmitted during the delay and select fine-grained features to identify the first page. Furthermore, we exclude the first page's traffic and use coarse features to identify the second page. We deploy our attack in a real-world environment in an experiment lasting two months. The Naive Bayes classifier is then applied to the collected datasets to classify the visited websites among the 50 top-ranked websites in Alexa. When the delay is set to 2 seconds, our attack can classify the first page with 75.9% accuracy and the second page with 40.5% accuracy. The results show that the WF attack is still effective in practical scenarios and that WF cannot be dismissed as a threat.

  • Research Article
  • Citations: 4
  • 10.1016/j.comnet.2022.109461
An efficient cross-domain few-shot website fingerprinting attack with Brownian distance covariance
  • Nov 17, 2022
  • Computer Networks
  • Hongcheng Zou + 4 more

  • Conference Article
  • Citations: 2
  • 10.1109/nana51271.2020.00020
CPWF: Cross-Platform Website Fingerprinting Based on Multi-Similarity Loss
  • Dec 1, 2020
  • Shihao Wang + 4 more

Website Fingerprinting (WF) attacks aim to identify the website label of an anonymous website traffic trace. WF attacks are suitable for anonymous network scenarios because they only need to analyze the information contained in the packet header, without analyzing the content of the packet payload. WF attacks are an important means of combating anonymous cybercrime. However, as anonymous networks are widely deployed on different devices such as PCs and mobile devices, the sources of anonymous website traffic have become more diverse, and differences between devices cause differences in anonymous website traffic. Existing WF attacks only use anonymous website traffic collected from the same device to train the classifier, and their performance is significantly reduced when they try to identify mixed anonymous website traffic generated by different devices. In this study, we propose a cross-platform website fingerprinting (CPWF) attack based on multi-similarity loss. First, we use the multi-similarity loss to train a deep learning-based website fingerprint extraction model. The model is able to extract a feature set for anonymous website traffic classification that ignores the differences caused by different devices. Attackers can therefore train the classifier on anonymous website traffic collected from a single terminal device and still identify anonymous website traffic collected from all devices effectively.

  • Research Article
  • Citations: 2
  • 10.1155/2022/7330465
SAD: Website Fingerprinting Defense Based on Adversarial Examples
  • Apr 7, 2022
  • Security and Communication Networks
  • Renzhi Tang + 3 more

Website fingerprinting (WF) attacks can infer website names from encrypted network traffic while the victim is browsing the website. The inherent defenses of anonymous communication systems such as The Onion Router (Tor) cannot compete with current WF attacks. The state-of-the-art attack based on deep learning can achieve over 98% accuracy in Tor. Most defenses have excellent defensive capabilities, but they bring a relatively high bandwidth overhead, which seriously affects the user’s network experience, and some defense methods have little impact on the latest website fingerprinting attacks. Defenses based on adversarial examples have excellent defense capabilities and low bandwidth overhead, but they need the complete website traffic to generate defense data, which is impractical. In this article, based on adversarial examples, we propose segmented adversary defense (SAD) for deep learning-based WF attacks. In SAD, sequence data are divided into multiple segments to ensure that SAD is feasible in real scenarios. Then, adversarial examples are generated for each segment of data. Finally, dummy packets are inserted after each segment of original data. We also found that setting different head rates, that is, end points for the segments, yields better results. Experimentally, our results show that SAD can effectively reduce the accuracy of WF attacks. The technique reduces the accuracy of the hardened state-of-the-art attack from 96% to 3% while incurring only 40% bandwidth overhead. Compared with the existing defense named Deep Fingerprinting Defender (DFD), SAD defends better under the same bandwidth overhead.
