Compressed models for co-reference resolution: enhancing efficiency with debiased word embeddings

Georgios Ioannides, Aishwarya Jadhav, Aditi Sharma, Samarth Navali, Alan W Black

doi:10.1038/s41598-023-45677-0

Open Access

Abstract

This work presents a comprehensive approach to reducing bias in word embedding vectors and evaluating the impact on various Natural Language Processing (NLP) tasks. Two GloVe variations (840B and 50) are debiased by identifying the gender direction in the word embedding space and then removing or reducing the gender component from the embeddings of target words while preserving useful semantic information. The gender bias of the resulting embeddings is assessed with the Word Embedding Association Test (WEAT). Co-reference resolution and text classification models trained on the original and the debiased embeddings are compared in terms of accuracy. A compressed co-reference resolution model is also examined to gauge how effective debiasing techniques are on resource-efficient models. To the best of the authors’ knowledge, this is the first attempt to apply compression techniques to debiased models. By analyzing the context preservation of debiased embeddings on a Twitter misinformation dataset, the study offers insight into the practical implications of debiasing methods for real-world applications such as person profiling.
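The debiasing step and the WEAT measurement summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the three-dimensional toy vectors, the word lists, the `strength` parameter, and all function names are assumptions made for illustration, and the gender direction is estimated here as a simple mean of difference vectors, which may differ from the procedure used in the paper. The effect size follows the standard WEAT definition, with the sample standard deviation (`ddof=1`) chosen as an assumption.

```python
import numpy as np

def gender_direction(emb, pairs):
    """Unit vector along the mean difference of gendered word pairs (assumed estimator)."""
    diffs = np.array([emb[a] - emb[b] for a, b in pairs])
    g = diffs.mean(axis=0)
    return g / np.linalg.norm(g)

def reduce_gender_component(vec, g, strength=1.0):
    """Subtract a fraction of the projection of vec onto the unit direction g.

    strength=1.0 removes the component entirely; smaller values only reduce it.
    """
    return vec - strength * np.dot(vec, g) * g

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def association(w, A, B):
    """WEAT association: mean similarity to attribute set A minus that to set B."""
    return np.mean([cosine(w, a) for a in A]) - np.mean([cosine(w, b) for b in B])

def weat_effect_size(X, Y, A, B):
    """WEAT effect size for target sets X, Y and attribute sets A, B."""
    sx = [association(x, A, B) for x in X]
    sy = [association(y, A, B) for y in Y]
    return (np.mean(sx) - np.mean(sy)) / np.std(sx + sy, ddof=1)

# Hypothetical toy embeddings; real GloVe vectors have far more dimensions.
emb = {
    "he": np.array([1.0, 0.2, 0.0]),
    "she": np.array([-1.0, 0.2, 0.0]),
    "man": np.array([0.9, 0.1, 0.1]),
    "woman": np.array([-0.9, 0.1, 0.1]),
    "engineer": np.array([0.5, 1.0, 0.0]),
    "programmer": np.array([0.4, 0.9, 0.3]),
    "nurse": np.array([-0.5, 1.0, 0.2]),
    "teacher": np.array([-0.3, 0.8, 0.5]),
}

g = gender_direction(emb, [("he", "she"), ("man", "woman")])

# Debias only the target words; the gendered attribute words are left unchanged.
targets = ["engineer", "programmer", "nurse", "teacher"]
debiased = {w: reduce_gender_component(emb[w], g) for w in targets}

A = [emb["he"], emb["man"]]
B = [emb["she"], emb["woman"]]
effect_before = weat_effect_size(
    [emb["engineer"], emb["programmer"]], [emb["nurse"], emb["teacher"]], A, B
)
assoc_after = association(debiased["engineer"], A, B)
```

In this toy setup the attribute vectors are mirror images along the gender axis, so fully removing the gender component drives each target word's association to zero. Real embeddings are not this symmetric, which is why residual bias is measured with WEAT after debiasing rather than assumed to vanish.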

Full Text