Abstract
In this paper, we define and apply representational stability analysis (ReStA), an intuitive way of analyzing neural language models. ReStA is a variant of the popular representational similarity analysis (RSA) in cognitive neuroscience. While RSA can be used to compare representations in models, model components, and human brains, ReStA compares instances of the same model while systematically varying a single model parameter. Using ReStA, we study four recent and successful neural language models, and evaluate how sensitive their internal representations are to the amount of prior context. Using RSA, we perform a systematic study of how similar the representational spaces in the first and second (or higher) layers of these models are to each other and to patterns of activation in the human brain. Our results reveal surprisingly strong differences between language models, and give insights into where the deep linguistic processing that integrates information over multiple sentences happens in these models. The combination of ReStA and RSA on models and brains allows us to start addressing the important question of what kind of linguistic processes we can hope to observe in fMRI brain imaging data. In particular, our results suggest that the data on story reading from Wehbe et al. (2014) contains a signal of shallow linguistic processing, but shows no evidence of the more interesting deep linguistic processing.
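To make the analysis pipeline concrete, the following is a minimal sketch of RSA and ReStA, assuming the stimulus representations are available as NumPy arrays. The function names, the use of correlation distance for the dissimilarity matrices, and Spearman correlation for comparing them are illustrative assumptions, not the paper's exact implementation.

```python
# Minimal RSA/ReStA sketch (illustrative, not the paper's code).
# RSA: given two matrices of stimulus representations (n_stimuli x n_features),
# build a representational dissimilarity matrix (RDM) for each and correlate them.
# ReStA: apply the same comparison to two instances of the *same* model that
# differ in a single parameter, e.g. the amount of prior context.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

def rdm(representations: np.ndarray) -> np.ndarray:
    """Condensed RDM: pairwise correlation distance between stimulus vectors."""
    return pdist(representations, metric="correlation")

def rsa(reps_a: np.ndarray, reps_b: np.ndarray) -> float:
    """Second-order similarity: Spearman correlation between the two RDMs."""
    rho, _ = spearmanr(rdm(reps_a), rdm(reps_b))
    return rho

# ReStA usage (hypothetical): reps_short and reps_long would be representations
# of the same stimuli from the same model, run with e.g. one vs. many prior
# sentences of context. A high value indicates representations that are stable
# with respect to the varied parameter.
# stability = rsa(reps_short, reps_long)
```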
Highlights
Representational similarity analysis (RSA) is a technique which allows us to compare heterogeneous representational spaces (Laakso and Cottrell, 2000)
We present a large-scale study and comparison of both neural language models and fMRI data from brain imaging experiments with human subjects, using RSA
Despite the low correlations between the models and the brain activation, we find that all the models are consistently best aligned with the regions in the Left Anterior Temporal Lobe (LATL)
Summary
An important motivation behind our work is to contribute to answering a big question in computational linguistics: how do we establish a relationship between NLP models and data on human brain activation while people process language? Jain and Huth (2018) report that the higher layers of the LSTM are better at predicting the activation of brain regions that are known for higher-level language functions (a finding seemingly at odds with the results from Section 5). In this effort, we run into a number of major conceptual, methodological, and technical challenges. We explain the language encoding models we study in our experiments and the dataset from which we get the language stimuli and their corresponding brain data.
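For intuition, the following is a rough sketch of the kind of layer-wise encoding analysis described by Jain and Huth (2018): a regularized linear regression from a model layer's representations of the stimuli to fMRI voxel responses, scored by how well it predicts held-out responses per voxel. The ridge regularization, the train/test split, and all names here are assumptions made for illustration, not the original pipeline.

```python
# Illustrative layer-wise encoding-model sketch (assumptions, not the original code).
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

def encoding_score(layer_reps: np.ndarray, voxels: np.ndarray, alpha: float = 1.0) -> np.ndarray:
    """Per-voxel correlation between predicted and held-out fMRI responses.

    layer_reps: (n_stimuli, n_features) representations from one model layer.
    voxels:     (n_stimuli, n_voxels) measured brain responses to the same stimuli.
    """
    X_tr, X_te, y_tr, y_te = train_test_split(layer_reps, voxels, test_size=0.2, random_state=0)
    pred = Ridge(alpha=alpha).fit(X_tr, y_tr).predict(X_te)

    # Correlate prediction and measurement separately for each voxel.
    pred_c = pred - pred.mean(axis=0)
    y_c = y_te - y_te.mean(axis=0)
    denom = np.linalg.norm(pred_c, axis=0) * np.linalg.norm(y_c, axis=0) + 1e-12
    return (pred_c * y_c).sum(axis=0) / denom

# Comparing the mean score (overall or within a region of interest) across layers
# indicates which layer best predicts activity in that region.
```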