Abstract

Bias in machine learning models has gained increasing attention in recent years, as these models can reflect and even amplify biases present in their training data. One approach to mitigating bias is to identify and down-weight features that contribute disproportionately to model predictions, which can be accomplished using saliency techniques. Current debiasing methods often lose contextual information: the model tends to respond incorrectly even when gender information is present in the context. As a result, although bias decreases, performance on coreference resolution and fluency decreases as well. This paper explores data augmentation and saliency techniques to mitigate bias in natural language generation. Specifically, we apply the saliency technique SHAP (SHapley Additive exPlanations) to a model that has been debiased through data augmentation (switching gendered words with their counterparts), and then apply hard debiasing to remove the influential biased tokens. We build a dialogue context test setup that evaluates bias and context relevance using the presence of gendered words in the model-generated responses. Each response is evaluated against the gender information in the context to ensure that the model follows the gender given there. We demonstrate that this approach can effectively reduce the impact of biased features on model predictions while preserving overall model accuracy. Additionally, we discuss potential limitations and future directions for research in this area. Our findings suggest that saliency offers an avenue for addressing bias in machine learning.
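As a rough illustration of two steps named above, the sketch below shows gendered-word swapping for data augmentation and a check on whether a response follows the gender given in the context. It is a minimal sketch under stated assumptions, not the paper's implementation: the word list `PAIRS` and the helpers `swap_gendered_words`, `gender_counts`, and `follows_context_gender` are hypothetical, and the abstract does not specify the lexicon or the exact scoring rule. The SHAP and hard-debiasing steps are not shown.

```python
import re

# Hypothetical, deliberately tiny lexicon; the abstract does not give the real one.
# Ambiguous words such as "her" (him/his) are left out to keep the swap one-to-one.
PAIRS = [("he", "she"), ("man", "woman"), ("father", "mother"),
         ("boy", "girl"), ("son", "daughter")]

SWAP = {}
for male, female in PAIRS:
    SWAP[male] = female
    SWAP[female] = male

MALE = {m for m, _ in PAIRS}
FEMALE = {f for _, f in PAIRS}
_WORD = re.compile(r"[A-Za-z]+")


def swap_gendered_words(text):
    """Return a copy of text with each listed gendered word replaced by its counterpart."""
    def repl(match):
        word = match.group(0)
        target = SWAP.get(word.lower())
        if target is None:
            return word
        # Preserve an initial capital letter.
        return target.capitalize() if word[0].isupper() else target
    return _WORD.sub(repl, text)


def gender_counts(text):
    """Count listed male and female words in text."""
    tokens = [t.lower() for t in _WORD.findall(text)]
    return {"male": sum(t in MALE for t in tokens),
            "female": sum(t in FEMALE for t in tokens)}


def follows_context_gender(context, response):
    """Assumed rule: a response follows the context if it contains no
    words of the gender opposite to the one dominating the context."""
    ctx = gender_counts(context)
    resp = gender_counts(response)
    if ctx["male"] > ctx["female"]:
        return resp["female"] == 0
    if ctx["female"] > ctx["male"]:
        return resp["male"] == 0
    return True  # neutral context: this sketch imposes no expectation
```

For instance, `swap_gendered_words("He is a kind man.")` produces a counterfactual training sentence with the gendered words flipped, and `follows_context_gender("My mother is a doctor.", "He works hard.")` flags a response that ignores the gender stated in the context. A real setup would need a much larger lexicon and handling for ambiguous pronouns.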
