Abstract

Hate speech is a persistent problem on social media. Researchers have analyzed it and developed detection methods on the basis of example data, even though the phenomenon itself is only vaguely defined. This paper presents an approach that identifies hate speech in terms of German law, which serves as the basis for annotation guidelines applied to real-world data. We annotate a corpus of 1,385 German short text messages with six labels: four subcategories of illegal hate speech, offensive language, and a neutral class. We consider an expression of hate speech illegal if its linguistic content, interpreted in a given context, could violate a specific law; this interpretation, together with a review by lawyers, would be the next step and is not yet part of our annotation. We also report on strategies for avoiding certain biases in data for illegal hate speech, which may serve as a model for building a larger dataset. In experiments, we investigate the capability of a Transformer-based neural network model to learn our classification. The results show that this multiclass classification is still difficult to learn, probably owing to the small size of the dataset. We argue that it is crucial to be aware of data biases and to apply bias-mitigation techniques when training hate speech detection systems on such data. The data and experiment scripts are publicly available.
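The abstract notes that the six-way classification is hard to learn on a small, skewed corpus. One common mitigation for such class imbalance (not necessarily the technique used in the paper) is to weight each class inversely to its frequency during training. A minimal sketch, using hypothetical label names and counts for illustration:

```python
from collections import Counter

def inverse_frequency_weights(labels):
    """Per-class weights inversely proportional to class frequency,
    normalized so that the weighted sample count equals len(labels)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * cnt) for cls, cnt in counts.items()}

# Hypothetical distribution over a 1,385-message corpus; the real label
# set and counts are defined in the paper, not here.
labels = (["neutral"] * 700 + ["offensive"] * 400
          + ["insult"] * 150 + ["incitement"] * 135)
weights = inverse_frequency_weights(labels)
# Rare classes receive larger weights than frequent ones.
```

Such a weight map can then be passed to a weighted loss function (e.g. a class-weighted cross-entropy) when fine-tuning a Transformer classifier.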


