Since its inception, The Onion Router (TOR) has been discussed as an anonymizing tool used for nefarious purposes. Past scholarship has focused on publicly available lists of onion URLs containing illicit or illegal content. The current study is an attempt to move past these surface-level explanations and into a discussion of actual use data; a multi-tiered system to identify real-world TOR traffic was developed for the task. The researcher configured and deployed a fully functioning TOR “exit” node for public use. A Wireshark instance was placed between the node and the “naked” internet to collect usage data (destination URLs, length of visit, etc.), but not to deanonymize or otherwise unmask TOR users. The node ran and collected data 24 hours per day for 6 months, producing a data set of over 4.5 terabytes. Using Python, the researcher developed a custom tool to filter the URLs into human-readable form and to produce descriptive data. All URLs were coded and categorized into a variety of classifications, including e-commerce, banking, social networking, pornography, and cryptocurrency. Findings reveal that most TOR usage is rather benign, with users spending much more time on social networking and e-commerce sites than on those with illegal drug or pornographic content. Likewise, visits to legal sites vastly outnumber visits to illegal ones. Although most URLs collected were for English-language websites, a sizable number were for Russian and Chinese sites, which may demonstrate the utilization of TOR in countries where internet access is censored or monitored by government actors. This study demonstrates that, like other new technologies that have earned bad reputations, such as the file-sharing program BitTorrent (associated with intellectual property theft) and the cryptocurrency Bitcoin (associated with online drug sales), TOR is utilized by offenders and non-offenders alike.