Abstract

Wikipedia provides a huge collaboratively made semi-structured taxonomy called Wikipedia category graph (WCG), which can be utilized as a Knowledge Graph (KG) to measure the semantic similarity (SS) between Wikipedia concepts. Previously, several Most Informative Common Ancestor-based (MICA-based) SS methods have been proposed by intrinsically manipulating the taxonomic structure of WCG. However, some basic structural issues in WCG such as huge size, branching factor and multiple inheritance relations hamper the applicability of traditional MICA-based and multiple inheritance-based approaches in it. Therefore, in this paper, we propose a solution to handle these structural issues and present a new multiple inheritance-based SS approach, called Neighborhood Ancestor Semantic Contribution (NASC). In this approach, firstly, we define the neighborhood of a category (a taxonomic concept in WCG) to define its semantic space. Secondly, we describe the semantic value of a category by aggregating the intrinsic IC-based semantic contribution weights of its semantically relevant multiple ancestors. Thirdly, based on our approach, we propose six different methods to compute the SS between Wikipedia concepts. Finally, we evaluate our methods on gold standard word similarity benchmarks for English, German, Spanish and French languages. The experimental evaluation demonstrates that the proposed NASC-based methods remarkably outperform traditional MICA-based and multiple inheritance-based approaches.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call