Abstract

Semantic similarity measures play an important role in many natural language processing and information retrieval tasks, yet measuring semantic similarity accurately remains highly challenging. One popular branch of semantic similarity evaluation is based on information content (IC), and intrinsic information content (IIC) models form a further wing of IC based evaluation. Both IC based and IIC based approaches have mainly addressed the similarity of nouns; semantic similarity assessment of verb pairs is rarely discussed. To bridge this gap, this work examines various IC based and IIC based approaches on verb pairs. The existing measures and their drawbacks are discussed in detail. Strategies based on information content and on the length and depth of concepts are discussed and tested on benchmark datasets. Existing intrinsic information content models are enhanced by addressing issues such as (a) dealing with concepts that have no path between them in WordNet and (b) handling the synonym sets of verb concepts. Measures based on path length, intrinsic information content, combined strategies and non-linear strategies for verb pairs are thoroughly inspected. This paper also presents new strategies that cover aspects not addressed before. The strategies are evaluated by generating the synonym sets of the required parts of speech, which proved very effective in improving the correlation with human judgment. Results on benchmark datasets indicate that the proposed approaches for verb similarity can serve as a guide for natural language processing tasks.
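
As a rough illustration of the kind of measure discussed above, the following sketch computes an intrinsic information content value and two IC based similarities on a small verb hierarchy. Everything in it is an assumption made for illustration: the toy taxonomy is hypothetical and is not taken from WordNet, the IIC formula is the commonly cited form 1 - log(hypo(c) + 1) / log(N) rather than any model proposed in this work, and returning 0.0 for verbs with no common subsumer is only one possible way of handling concepts with no connecting path.

```python
import math

# Hypothetical toy verb taxonomy (child -> parent), NOT extracted from WordNet.
# It has two separate roots ("act" and "exist") to mimic verb hierarchies
# in which some concept pairs have no connecting path.
PARENT = {
    "move": "act",
    "travel": "move",
    "walk": "travel",
    "run": "travel",
    "communicate": "act",
    "speak": "communicate",
    "whisper": "speak",
    "live": "exist",
}
NODES = set(PARENT) | set(PARENT.values())


def ancestors(concept):
    """Return the chain from a concept up to its root, concept included."""
    chain = [concept]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def hyponym_count(concept):
    """Count all descendants of a concept in the toy taxonomy."""
    return sum(1 for n in NODES if n != concept and concept in ancestors(n))


def intrinsic_ic(concept):
    """Intrinsic IC: 1 - log(hypo(c) + 1) / log(total number of concepts)."""
    return 1.0 - math.log(hyponym_count(concept) + 1) / math.log(len(NODES))


def lowest_common_subsumer(a, b):
    """Return the deepest shared ancestor, or None if the verbs share no root."""
    ancestors_b = set(ancestors(b))
    for node in ancestors(a):
        if node in ancestors_b:
            return node
    return None


def resnik_similarity(a, b):
    """IC of the lowest common subsumer; 0.0 when no path exists (assumption)."""
    lcs = lowest_common_subsumer(a, b)
    return 0.0 if lcs is None else intrinsic_ic(lcs)


def lin_similarity(a, b):
    """Shared IC normalised by the IC of both concepts."""
    total = intrinsic_ic(a) + intrinsic_ic(b)
    return 0.0 if total == 0 else 2.0 * resnik_similarity(a, b) / total


if __name__ == "__main__":
    for pair in [("walk", "run"), ("walk", "whisper"), ("walk", "live")]:
        print(pair, resnik_similarity(*pair), lin_similarity(*pair))
```

In this sketch, sibling verbs such as "walk" and "run" score higher than verbs that only meet at a root, and verbs under different roots receive the fallback value instead of raising an error.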
