SexWEs: Domain-Aware Word Embeddings via Cross-Lingual Semantic Specialisation for Chinese Sexism Detection in Social Media

Aiqi Jiang, Arkaitz Zubiaga

doi:10.1609/icwsm.v17i1.22159

Abstract

The goal of sexism detection is to mitigate negative online content targeting particular gender groups. However, the limited availability of labeled sexism-related datasets makes it difficult to identify online sexism in low-resource languages. In this paper, we address the task of automatic sexism detection in social media for one low-resource language, Chinese. Rather than collecting new sexism data or building cross-lingual transfer learning models, we develop a cross-lingual domain-aware semantic specialisation system in order to make the most of existing data. Semantic specialisation is a technique for retrofitting pre-trained distributional word vectors by integrating external linguistic knowledge (such as lexico-semantic relations) into the specialised feature space. To do this, we leverage semantic resources for sexism from a high-resource language (English) to specialise pre-trained word vectors in the target language (Chinese), thereby injecting domain knowledge. We demonstrate the benefit of the sexist word embeddings (SexWEs) specialised by our framework via intrinsic evaluation of word similarity and extrinsic evaluation of sexism detection. Compared with other specialisation approaches and Chinese baseline word vectors, SexWEs show an average score improvement of 0.033 in the intrinsic evaluation and 0.064 in the extrinsic evaluation. The ablative results and visualisation of SexWEs also demonstrate the effectiveness of our framework for retrofitting word vectors in low-resource languages.
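
To make the idea of retrofitting concrete, the sketch below shows a generic monolingual retrofitting update in the style of Faruqui et al. (2015). It is not the cross-lingual SexWEs framework described in the abstract. Each constrained word is pulled towards the words it is linked to by a lexical relation while staying close to its original vector. The toy vectors, the placeholder words (`word_a`, `word_b`, `word_c`), the `constraints` dictionary, and the `alpha` weight are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def retrofit(vectors, constraints, iterations=10, alpha=1.0):
    """Generic retrofitting sketch (assumed update rule, not the paper's method).

    vectors:     dict mapping word -> original vector (list of floats)
    constraints: dict mapping word -> list of related words (e.g. synonyms)
    alpha:       weight that keeps a word close to its original vector
    """
    new = {word: list(vec) for word, vec in vectors.items()}
    for _ in range(iterations):
        for word, related in constraints.items():
            # Ignore words and neighbours that have no vector.
            neighbours = [n for n in related if n in new]
            if word not in new or not neighbours:
                continue
            beta = 1.0 / len(neighbours)  # equal weight per neighbour
            updated = []
            for d in range(len(vectors[word])):
                pull = sum(beta * new[n][d] for n in neighbours)
                # Weighted average of the original vector and its neighbours.
                updated.append((alpha * vectors[word][d] + pull) / (alpha + 1.0))
            new[word] = updated
    return new


# Hypothetical toy data: word_a and word_b are linked, word_c is unconstrained.
vectors = {
    "word_a": [1.0, 0.0],
    "word_b": [0.0, 1.0],
    "word_c": [1.0, 0.1],
}
constraints = {"word_a": ["word_b"], "word_b": ["word_a"]}

specialised = retrofit(vectors, constraints)
print(cosine(vectors["word_a"], vectors["word_b"]))
print(cosine(specialised["word_a"], specialised["word_b"]))
```

In this toy setting the two linked words end up closer together than they started, and the unconstrained word keeps its original vector. In a cross-lingual setup like the one the abstract describes, the constraints would come from a different language than the vectors, which this sketch does not attempt to model.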

Journal: Proceedings of the International AAAI Conference on Web and Social Media
Publication Date: Jun 2, 2023
Citations: 2

Similar Papers

Knowledge Transfer from High-Resource to Low-Resource Programming Languages for Code LLMs
Federico Cassano ... Carolyn Jane Anderson
Proceedings of the ACM on Programming Languages | VOL. 8
08 Oct 2024

Adversarial Propagation and Zero-Shot Cross-Lingual Transfer of Word Vector Specialization
Edoardo Maria Ponti ... Ivan Vulić
01 Jan 2018

Transfer Learning, Style Control, and Speaker Reconstruction Loss for Zero-Shot Multilingual Multi-Speaker Text-to-Speech on Low-Resource Languages
Kurniawati Azizah ... Wisnu Jatmiko
IEEE Access | VOL. 10
01 Jan 2021

Wasserstein Cross-Lingual Alignment For Named Entity Recognition
Rui Wang ... Ricardo Henao
23 May 2022
