Abstract

As many studies have observed, non-causal models may contain spurious correlations that act as shortcuts at prediction time, undermining cross-domain accuracy. One source of such correlations is biased training data containing spurious words that carry neutral meanings yet induce the model to predict incorrectly. Based on this observation, we propose a series of methods to detect these spurious words before the training data is fed to the model. We employ causal inference methods that have gained attention in recent studies, namely propensity score matching and inverse propensity score weighting, to perform feature selection before training. We experimented with multiple approaches to estimating propensity scores and obtained substantial improvements. We further evaluated the effectiveness of the feature selection with a BERT model and found that performance on both in-domain and out-of-domain test samples improves after the spurious words detected by our methods are removed from the training data.
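To make the inverse-propensity-weighting idea concrete, below is a minimal sketch (not the authors' released code) of how IPW can score a candidate word's effect on a binary label: the word's presence is treated as the "treatment", the remaining vocabulary as confounders, and a logistic-regression propensity model reweights the samples. Names such as `docs`, `labels`, and `candidate` are hypothetical placeholders.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

def ipw_effect(docs, labels, candidate):
    """Estimate the average effect of `candidate` appearing in a
    document on the binary label, via inverse propensity weighting."""
    vec = CountVectorizer(binary=True)
    X = vec.fit_transform(docs).toarray()
    vocab = vec.vocabulary_
    if candidate not in vocab:
        return 0.0
    idx = vocab[candidate]
    t = X[:, idx]  # treatment indicator: candidate word present or not
    if t.sum() == 0 or t.sum() == len(t):
        return 0.0  # no variation in treatment; effect is unidentifiable
    covariates = np.delete(X, idx, axis=1)  # other words as confounders
    # Propensity model: P(candidate present | other words in the document)
    ps = LogisticRegression(max_iter=1000).fit(covariates, t)
    e = np.clip(ps.predict_proba(covariates)[:, 1], 0.05, 0.95)
    y = np.asarray(labels, dtype=float)
    # IPW estimator of the word's average treatment effect on the label
    return np.mean(t * y / e - (1 - t) * y / (1 - e))
```

Under this sketch's assumptions, a semantically neutral word with a large estimated effect would be flagged as spurious and dropped from the training data before fine-tuning BERT; the clipping of propensity scores is a common stabilization choice, not something specified in the abstract.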
