Constructing and validating word similarity datasets by integrating methods from psychology, brain science and computational linguistics

Yu Wan, Changle Zhou, Yidong Chen, Xiaodong Shi

doi:10.1007/s00500-018-3174-1

Abstract

Human-scored gold-standard word similarity datasets normally consist of word pairs with corresponding similarity scores. These datasets are popular resources for evaluating word similarity models, which are essential components of many natural language processing tasks. This paper proposes a novel multidisciplinary method for constructing and validating gold-standard word similarity datasets. The proposed method differs from previous ones in that it draws on methods from three disciplines (psychology, brain science and computational linguistics) to validate the soundness of the constructed datasets. Specifically, to the best of our knowledge, this is the first time that event-related potential experiments have been used to validate word similarity datasets. Using the proposed method, we constructed a Chinese gold-standard word similarity dataset with 260 word pairs and demonstrated its soundness with the interdisciplinary validation methods. Although the paper focuses only on constructing a Chinese gold-standard dataset, the proposed method is applicable to other languages.
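As an illustration of how a dataset of this shape is typically used to evaluate a model, the sketch below computes the Spearman rank correlation between human scores and model scores. This is a common evaluation practice and is assumed here; the abstract does not say which measure the authors use. The word pairs, the scores, and the function names are all hypothetical and are not taken from the 260-pair dataset.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation; assumes neither input is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def spearman(xs, ys):
    """Spearman correlation, computed as Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical gold-standard entries: (word 1, word 2, human similarity score).
gold = [
    ("car", "automobile", 9.2),
    ("coast", "shore", 8.1),
    ("bird", "crane", 6.0),
    ("noon", "string", 0.5),
]
# Hypothetical model outputs for the same pairs, e.g. cosine similarities.
model_scores = [0.91, 0.70, 0.75, 0.05]

human_scores = [score for _, _, score in gold]
rho = spearman(human_scores, model_scores)
print(round(rho, 3))  # 0.8 for this toy data
```

A higher correlation means the model orders the word pairs more like the human raters do. Rank correlation is used in this sketch because human and model scores usually live on different scales.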


Journal: Soft Computing
Publication Date: Apr 3, 2018
Citations: 2

Similar Papers

A Multidisciplinary Method for Constructing and Validating Word Similarity Datasets
Yu Wan ... Xiaodong Shi
05 Sep 2017

Overview of the NLPCC-ICCPOL 2016 Shared Task: Chinese Word Similarity Measurement
Yunfang Wu ... Wei Li
01 Jan 2015

Natural Language Processing and Computational Linguistics
Junichi Tsujii
Computational Linguistics
07 Dec 2021

Combining Large-Scale Unlabeled Corpus and Lexicon for Chinese Polysemous Word Similarity Computation
Huiwei Zhou ... Degen Huang
01 Jan 2017
