Abstract

With the increasing availability of digital text, there has been an explosion of computational methods designed to turn patterns of word co-occurrence in large text corpora into numerical scores expressing the “semantic distance” between any two words. The success of such methods is typically evaluated by how well they predict human judgments of similarity. Here, I examine how well corpus-based methods predict the amplitude of the N400 component of the event-related potential (ERP), an online measure of lexical processing in brain electrical activity. ERPs elicited by the second words of 303 word pairs were analyzed at the level of individual items. Three corpus-based measures (mutual information, distributional similarity, and latent semantic analysis) were compared to a traditional measure of free association strength. In a regression analysis, corpus-based and free association measures each explained some of the variance in N400 amplitude, suggesting that the two kinds of measure may tap distinct aspects of word relationships. The lexical factors of concreteness of word meaning, word frequency, number of semantic associates, and orthographic similarity also explained variance in N400 amplitude at the single-item level.
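As a rough illustration of how a co-occurrence count can become a numerical score, the sketch below computes pointwise mutual information, one common formulation of the mutual information measure named above. It is not the procedure used in the study: the function name, the toy word pairs, and the choice of base-2 logarithms are all assumptions made for the example, and a real corpus would need a defined co-occurrence window and some handling of sparse counts.

```python
import math
from collections import Counter


def pointwise_mutual_information(pairs):
    """Return PMI in bits for every observed (first_word, second_word) pair.

    `pairs` is a list of co-occurring word pairs (hypothetical toy input).
    PMI = log2( P(w1, w2) / (P(w1) * P(w2)) ), with probabilities
    estimated from relative frequencies in `pairs`.
    """
    total = len(pairs)
    pair_counts = Counter(pairs)
    first_counts = Counter(w1 for w1, _ in pairs)
    second_counts = Counter(w2 for _, w2 in pairs)

    scores = {}
    for (w1, w2), n in pair_counts.items():
        p_joint = n / total
        p_first = first_counts[w1] / total
        p_second = second_counts[w2] / total
        scores[(w1, w2)] = math.log2(p_joint / (p_first * p_second))
    return scores


# Toy co-occurrence data, invented for illustration only.
toy_pairs = [
    ("salt", "pepper"),
    ("salt", "pepper"),
    ("salt", "water"),
    ("bread", "butter"),
    ("bread", "water"),
    ("cold", "water"),
]

pmi = pointwise_mutual_information(toy_pairs)
for pair, score in sorted(pmi.items()):
    print(pair, round(score, 3))
```

A positive score means the two words co-occur more often than their individual frequencies would predict, and a negative score means less often. Scores of this kind, computed per word pair, are the sort of predictor that could be entered into an item-level regression against N400 amplitude.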
