Abstract
THIS ARTICLE USES WORDS OR LANGUAGE THAT IS CONSIDERED PROFANE, VULGAR, OR OFFENSIVE BY SOME READERS. The presence of a significant amount of harassment in user-generated content and its negative impact call for robust automatic detection approaches. This requires the identification of different types of harassment. Earlier work has classified harassing language in terms of hurtfulness, abusiveness, sentiment, and profanity. However, to identify and understand harassment more accurately, it is essential to determine the contextual type that captures the interrelated conditions in which harassing language occurs. In this paper, we introduce the notion of contextual type in harassment by distinguishing between five contextual types: (i) sexual, (ii) racial, (iii) appearance-related, (iv) intellectual, and (v) political. We utilize an annotated Twitter corpus that distinguishes these types of harassment. We study the context of each type to shed light on its linguistic meaning, interpretation, and distribution, with results from two lines of investigation: an extensive linguistic analysis and the statistical distribution of unigrams. We then build type-aware classifiers to automate the identification of type-specific harassment. Our experiments demonstrate that these classifiers provide competitive accuracy for identifying and analyzing harassment on social media. We present extensive discussion and significant observations about the effectiveness of type-aware classifiers using a detailed comparison setup, providing insight into the role of type-dependent features.
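To make the type-aware setup concrete, the following is a minimal sketch of one way to train a separate binary classifier per contextual type over unigram features. The paper does not prescribe a particular model or library; scikit-learn, the logistic-regression choice, and the toy labeled posts below are all illustrative assumptions, not the authors' actual pipeline.

```python
# Minimal sketch: one binary classifier per contextual type of harassment,
# trained on unigram features. Illustrative only -- the model, library,
# and toy data are assumptions, not the paper's actual setup.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

TYPES = ["sexual", "racial", "appearance", "intellectual", "political"]

# Hypothetical labeled posts: (text, contextual type or None for benign).
corpus = [
    ("you look hideous", "appearance"),
    ("you are so dumb", "intellectual"),
    ("great game last night", None),
    ("go back to your country", "racial"),
]

texts = [text for text, _ in corpus]
classifiers = {}
for harassment_type in TYPES:
    labels = [int(label == harassment_type) for _, label in corpus]
    if not any(labels):
        # Skip types with no positive examples in this toy corpus.
        continue
    clf = make_pipeline(
        CountVectorizer(ngram_range=(1, 1)),  # unigram features only
        LogisticRegression(),
    )
    clf.fit(texts, labels)
    classifiers[harassment_type] = clf

# Score a new post against each type-specific classifier.
post = ["you are an idiot"]
for harassment_type, clf in classifiers.items():
    print(harassment_type, clf.predict_proba(post)[0, 1])
```

One per-type model, rather than a single multi-class model, mirrors the idea that a post can be scored against each contextual type independently; either design would fit the description in the abstract.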
Highlights
Introduction
While social media has enabled people to connect and interact with each other, it has also made people vulnerable to insults, humiliation, hate, and bullying.
(i) We provide a systematic and comparative analysis to assess offensive language from linguistic and statistical perspectives for each contextual type. This allows us to exploit relevant features for developing classifiers to identify these critical types of harassment on social media. (ii) We develop type-aware classifiers and capture their effectiveness using a detailed comparative study.
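As an illustration of the statistical side of this analysis, the snippet below computes a normalized unigram distribution per contextual type so that distributions can be compared across types. This is a simple relative-frequency sketch; the paper's exact statistical procedure is not reproduced here, and the helper and the per-type post collections are hypothetical.

```python
# Sketch: per-type unigram distributions as relative frequencies.
# Hypothetical helper -- the paper's exact statistics may differ.
from collections import Counter

def unigram_distribution(posts):
    """Return relative unigram frequencies over a list of posts."""
    counts = Counter(word for post in posts for word in post.lower().split())
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}

# Toy usage with hypothetical per-type post collections.
posts_by_type = {
    "appearance": ["you look hideous", "what an ugly face"],
    "intellectual": ["you are so dumb", "what an idiot"],
}
for harassment_type, posts in posts_by_type.items():
    dist = unigram_distribution(posts)
    top = sorted(dist.items(), key=lambda kv: -kv[1])[:3]
    print(harassment_type, top)
```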
Earlier studies have: collected discriminating terms for hate speech and offensive language; determined that posts of aggressor profiles are more negative; and (i) conceptualized offensive content and (ii) enhanced features using lexical, style, structural, and context-specific features.
Summary
While social media has enabled people to connect and interact with each other, it has also made people vulnerable to insults, humiliation, hate, and bullying. This work analyzes and learns the language of different types of harassment. Any opinions, findings, conclusions, or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the NSF. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.