Abstract
Online forums have been shown to contain a wealth of useful information. With a few notable exceptions, such forums have not received much attention from the research community, unlike other online social media. Our goal here is to conduct an in-depth, thread-centric analysis of online forums, focusing on security forums. We propose RThread, a comprehensive unsupervised clustering approach with a powerful visualization component, which we provide as a publicly accessible web-based tool. Our approach leverages 92 thread features that span three groups: (a) temporal, (b) behavioral, and (c) content-related. We analyze data from 8 security forums with 400k posts over a span of 8 years. First, we find that many thread-centric properties follow a log-normal distribution, and that this pattern persists across several forums and over time. Second, we show how our approach can identify clusters of threads with similar behavior, while our visualization component provides an easy way to spot the differences between these clusters. Finally, we show how our approach can spot surprising behaviors, including a cluster whose threads are used for Search Engine Optimization. We see our approach and our publicly available platform as a building block towards understanding forum activity and extracting interesting information in an unsupervised way.