Abstract
Misogyny is a severe social problem that affects women’s mental and physical health and can even lead to femicide. This cultural problem is visible and prevalent in different communication channels, such as music and social media, which can reinforce or incite this behavior. Hence, the automatic detection of misogynistic content in social media using computational methods that analyze language is of increasing interest. Most approaches follow a supervised machine learning strategy, whose main challenge is capturing the diversity and complexity of the offensive language directed at women. Therefore, the size and quality of the training data play essential roles. In this context, we designed a novel data augmentation approach that leverages song phrases to increase the models’ ability to generalize and improve their performance. In addition, this paper introduces a methodology to compile a labeled dataset of song segments conveying misogyny, which can be used to enrich different techniques in this field. The proposed approach was evaluated on English and Spanish benchmark datasets. It outperforms conventional transfer learning techniques and is highly competitive with state-of-the-art methods, outperforming them on the Spanish dataset. These encouraging results demonstrate the usefulness of the proposed approach.