This paper provides the first broad overview of the relation between different interpretation methods and human eye-movement behaviour across different tasks and architectures. Interpretation methods for neural networks expose the information a model considers important, while human eye gaze is widely regarded as a proxy for human cognitive processes. Comparing the two therefore explains machine behaviour in terms of human behaviour, and minimising the difference between them could improve machine performance. We consider three types of natural language processing (NLP) tasks (sentiment analysis, relation classification and question answering), four interpretation methods (based on simple gradient, integrated gradient, input perturbation and attention) and three architectures (LSTM, CNN and Transformer). We leverage two corpora annotated with eye-gaze information: the ZuCo dataset and the MQA-RC dataset. This research poses two research questions. First, we investigate whether the saliency (importance) assigned to input words conforms with that derived from human eye-gaze features. To this end, we compute a saliency distance (SD) between the word saliency produced by an interpretation method and an eye-gaze feature. SD is defined as the KL-divergence between the saliency distribution over input words and an eye-gaze feature. We find that the SD scores vary depending on the combination of task, interpretation method and architecture. Second, we investigate whether models whose saliency conforms well to human eye-gaze behaviour achieve better prediction performance. To this end, we propose a novel evaluation device called the “SD-performance curve” (SDPC), which represents the cumulative model performance against the SD scores. SDPC enables us to analyse underlying phenomena that are overlooked when using only macroscopic metrics, such as average SD scores and rank correlations, which are typically used in past studies.
We observe that the impact of good saliency conformity between humans and machines on task performance varies among the combinations of tasks, interpretation methods and architectures. Our findings should be considered when introducing eye-gaze information for model training to improve the model performance.
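The two quantities defined above, SD and SDPC, can be illustrated with a minimal sketch. The abstract fixes neither the direction of the KL-divergence nor the exact construction of the curve, so everything below is an assumption made for illustration: the function names `normalise`, `saliency_distance` and `sd_performance_curve` are hypothetical, the divergence is taken as KL(gaze ‖ model), negative saliency scores are clipped to zero before normalisation, and the curve is read as the accuracy over the k examples with the lowest SD.

```python
import math
from typing import List, Sequence, Tuple


def normalise(scores: Sequence[float], eps: float = 1e-12) -> List[float]:
    """Turn per-word scores into a probability distribution.

    Assumption: negative scores are clipped to zero, and a small eps keeps
    every probability strictly positive so the logarithm is defined.
    """
    clipped = [max(float(s), 0.0) + eps for s in scores]
    total = sum(clipped)
    return [c / total for c in clipped]


def saliency_distance(model_saliency: Sequence[float],
                      gaze_feature: Sequence[float]) -> float:
    """KL-divergence between gaze and model saliency for one sentence.

    Assumption: the direction is KL(gaze || model); the abstract does not
    state which distribution is the reference.
    """
    if len(model_saliency) != len(gaze_feature):
        raise ValueError("both inputs need one score per input word")
    p = normalise(gaze_feature)
    q = normalise(model_saliency)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def sd_performance_curve(sd_scores: Sequence[float],
                         correct: Sequence[bool]) -> List[Tuple[float, float]]:
    """Cumulative accuracy against SD (one assumed reading of SDPC).

    Examples are sorted by ascending SD; point k holds the SD of the k-th
    example and the accuracy over the k lowest-SD examples.
    """
    ranked = sorted(zip(sd_scores, correct), key=lambda pair: pair[0])
    curve = []
    hits = 0
    for k, (sd, ok) in enumerate(ranked, start=1):
        hits += int(ok)
        curve.append((sd, hits / k))
    return curve


if __name__ == "__main__":
    # Toy data: three sentences, each with made-up saliency and gaze scores.
    saliencies = [[0.1, 0.7, 0.2], [0.5, 0.4, 0.1], [0.3, 0.3, 0.4]]
    gazes = [[0.2, 0.6, 0.2], [0.1, 0.2, 0.7], [0.3, 0.4, 0.3]]
    predictions_correct = [True, False, True]

    sds = [saliency_distance(s, g) for s, g in zip(saliencies, gazes)]
    for sd, acc in sd_performance_curve(sds, predictions_correct):
        print(f"SD={sd:.4f}  cumulative accuracy={acc:.2f}")
```

Under this reading, a curve that starts high and falls as SD grows would suggest that examples with human-like saliency are predicted more accurately, a pattern that a single average SD score or rank correlation would not reveal.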