Abstract

This study examines the ability of a semantic space model to represent the meaning of noun compounds such as ‘information gathering’ or ‘heart disease.’ For a semantic space model to compute the meaning and the attributional similarity (or semantic relatedness) of unfamiliar noun compounds that do not occur in a corpus, the vector for a noun compound must be computed from the vectors of its constituent words using a vector composition algorithm. Six composition algorithms (centroid, multiplication, circular convolution, predication, comparison, and dilation) are compared in terms of how well they compute attributional similarity for English and Japanese noun compounds. Performance is evaluated on three tasks (related word ranking, similarity correlation, and semantic classification) and in two types of semantic spaces (latent semantic analysis-based and positive pointwise mutual information-based spaces). The results show that the dilation algorithm is generally the most effective in computing the similarity of noun compounds, while the multiplication algorithm is best suited specifically to the positive pointwise mutual information-based space. In addition, the comparison algorithm works better for unfamiliar noun compounds that do not occur in the corpus. These findings indicate that a semantic space model in general, and the dilation, multiplication, and comparison algorithms in particular, have sufficient ability to compute the attributional similarity of noun compounds.
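As a rough illustration of what vector composition means here, the following minimal sketch implements four of the six algorithms named above (centroid, multiplication, circular convolution, and dilation) on plain Python lists. The abstract does not give the formulas, so the definitions below are assumptions based on how these operations are commonly defined in the compositional semantics literature; in particular, the dilation formula, the default value of the hypothetical parameter `lam`, and the choice of which constituent is stretched toward which are illustrative, not taken from the study. The predication and comparison algorithms depend on neighbor retrieval in the semantic space and are not sketched.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def centroid(u, v):
    """Centroid composition: element-wise average of the two word vectors."""
    return [(a + b) / 2 for a, b in zip(u, v)]


def multiplication(u, v):
    """Multiplication composition: element-wise product."""
    return [a * b for a, b in zip(u, v)]


def circular_convolution(u, v):
    """Circular convolution: c[i] = sum_j u[j] * v[(i - j) mod n]."""
    n = len(u)
    return [sum(u[j] * v[(i - j) % n] for j in range(n)) for i in range(n)]


def dilation(u, v, lam=2.0):
    """Dilation (assumed form): stretch v along the direction of u.

    Computes (u.u) * v + (lam - 1) * (u.v) * u. The factor `lam` is a
    hypothetical tuning parameter; lam = 1 leaves v's direction unchanged.
    """
    uu = dot(u, u)
    uv = dot(u, v)
    return [uu * b + (lam - 1) * uv * a for a, b in zip(u, v)]


def cosine(u, v):
    """Cosine similarity, returning 0.0 if either vector is all zeros."""
    norm_u = math.sqrt(dot(u, u))
    norm_v = math.sqrt(dot(v, v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot(u, v) / (norm_u * norm_v)


# Toy vectors standing in for the constituents of a compound (made-up values).
modifier = [1.0, 0.0, 2.0]
head = [0.5, 1.0, 1.0]
compound = dilation(modifier, head)
similarity_to_head = cosine(compound, head)
```

In an evaluation like the one described, the composed vector would be compared by cosine against the vectors of other words or compounds; the toy values above only demonstrate the mechanics.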
