Abstract

Prior work decoding linguistic meaning from imaging data has been largely limited to concrete nouns, using similar stimuli for training and testing, from a relatively small number of semantic categories. Here we present a new approach for building a brain decoding system in which words and sentences are represented as vectors in a semantic space constructed from massive text corpora. By efficiently sampling this space to select training stimuli shown to subjects, we maximize the ability to generalize to new meanings from limited imaging data. To validate this approach, we train the system on imaging data of individual concepts, and show it can decode semantic vector representations from imaging data of sentences about a wide variety of both concrete and abstract topics from two separate datasets. These decoded representations are sufficiently detailed to distinguish even semantically similar sentences, and to capture the similarity structure of meaning relationships between sentences.

Highlights

  • Prior work decoding linguistic meaning from imaging data has been largely limited to concrete nouns, using similar stimuli for training and testing, from a relatively small number of semantic categories

  • A word is represented as a semantic vector in a high-dimensional space, where similarity between two word vectors reflects similarity of the contexts in which those words appear in the language[2]

  • This was motivated by previous studies that showed that the patterns of activation for semantically related stimuli were more similar to each other than for unrelated stimuli[16,19].The decoder used this relationship to infer the degree to which each dimension was present in new activation patterns collected from the same participant, and to output semantic vectors representing their contents

Read more

Summary

Introduction

Prior work decoding linguistic meaning from imaging data has been largely limited to concrete nouns, using similar stimuli for training and testing, from a relatively small number of semantic categories. We present a new approach for building a brain decoding system in which words and sentences are represented as vectors in a semantic space constructed from massive text corpora By efficiently sampling this space to select training stimuli shown to subjects, we maximize the ability to generalize to new meanings from limited imaging data. These models have been extended beyond single words to express meanings of phrases and sentences[5,6,7], and the resulting representations predict human similarity judgments for phraseand sentence-level paraphrases[8,9] To test whether these distributed representations of meaning are neurally plausible, a number of studies have attempted to learn a mapping between particular semantic dimensions and patterns of brain activation If this relationship can be learned, and if our training set covers all the dimensions of the semantic space, any meaning that can be represented by a semantic vector can, in principle, be decoded

Objectives
Methods
Results
Conclusion

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.