Abstract

Conceptualization is the task of identifying the most appropriate concepts for noun terms (entities) in different contexts, and it plays an important role in human knowledge understanding. In natural language, however, entities are often ambiguous, which makes conceptualization difficult: to conceptualize accurately, the ambiguity of entities must first be resolved. Existing methods rely mainly on similar or related entities in the context for disambiguation, but user-generated short texts are sparse, so few entities can be extracted from them. In this paper, we propose an entity disambiguation method that consists of three steps. (1) Measuring the correlation between terms, using both corpus and knowledge information to capture their specific semantic relationship. (2) Selecting informative terms, considering various types of contextual terms rather than entities alone, which mitigates the effects of text sparsity. (3) Prioritizing the informative terms to highlight their discriminative power, which reduces noise interference. Finally, the target entity is disambiguated based on the informative terms. Experimental results on ground-truth datasets demonstrate that the proposed method outperforms baseline methods.
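
The three steps can be illustrated with a minimal sketch. It is not the implementation from the paper: the abstract gives no formulas, so the linear mix of corpus and knowledge scores (`alpha`), the selection `threshold`, the margin-based weight, and all toy scores below are assumptions made purely for illustration.

```python
from typing import Dict, List, Tuple

Scores = Dict[Tuple[str, str], float]

# Hypothetical (term, concept) scores; invented toy data, not from the paper.
CORPUS: Scores = {
    ("eat", "fruit"): 0.7, ("eat", "company"): 0.05,
    ("pie", "fruit"): 0.8, ("pie", "company"): 0.1,
    ("today", "fruit"): 0.2, ("today", "company"): 0.2,
}
KNOWLEDGE: Scores = {
    ("eat", "fruit"): 0.9, ("eat", "company"): 0.0,
    ("pie", "fruit"): 0.6, ("pie", "company"): 0.0,
    ("today", "fruit"): 0.1, ("today", "company"): 0.1,
}


def correlation(term: str, concept: str, alpha: float = 0.5) -> float:
    """Step 1 (assumed form): linear mix of corpus and knowledge scores."""
    corpus = CORPUS.get((term, concept), 0.0)
    knowledge = KNOWLEDGE.get((term, concept), 0.0)
    return alpha * corpus + (1 - alpha) * knowledge


def select_informative(terms: List[str], concepts: List[str],
                       threshold: float = 0.3) -> List[str]:
    """Step 2 (assumed rule): keep any context term, entity or not,
    that correlates strongly with at least one candidate concept."""
    return [t for t in terms
            if max(correlation(t, c) for c in concepts) >= threshold]


def discriminative_weight(term: str, concepts: List[str]) -> float:
    """Step 3 (assumed rule): margin between the best and second-best
    concept; terms that favor one concept clearly get more weight."""
    scores = sorted((correlation(term, c) for c in concepts), reverse=True)
    return scores[0] - scores[1] if len(scores) > 1 else scores[0]


def disambiguate(terms: List[str], concepts: List[str]) -> str:
    """Pick the concept with the highest weighted correlation."""
    informative = select_informative(terms, concepts)
    totals = {
        c: sum(discriminative_weight(t, concepts) * correlation(t, c)
               for t in informative)
        for c in concepts
    }
    return max(totals, key=totals.get)


# Candidate concepts for the ambiguous entity "apple" in "eat apple pie today".
print(disambiguate(["eat", "pie", "today"], ["fruit", "company"]))
```

In this toy setup, "today" correlates equally weakly with both concepts and is dropped in step 2, while the non-entity term "eat" still contributes evidence, which is the kind of sparsity mitigation the abstract describes.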
