Abstract
Without relying on any user reports, we use algorithmic detection and Bayesian statistical methods to analyse two large data streams (329 k users) of Reddit content to study the correlations between username toxicity (of various types, such as offensive or sexually explicit) and their online toxic behavior (personal attacks, sexual harassment among others).As it turns out, username toxicity (type) is a useful predictor in online profiling. Users with toxic usernames produce more toxic content than their neutral counterparts, with the difference in predicted mean increasing with activity (predicted 1.9 vs. 1.4 toxic comments a week for users with regular activity, and 5.6 vs. 4 for top 5% active users). More users with toxic usernames engage in toxic behavior than among neutral usernames (around 40% vs. 30%). They are also around 2.2 times more likely to have their account suspended by moderators (3.2% vs. 1.5% probability of suspension for regular and 4.5% vs. 2% for top 5% users)—detailed results vary depending on the username toxicity type and toxic behavior type. Thus, username toxicity can be used in the efforts of online communities to predict toxic behavior and to provide more safety to their users.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.