Abstract

Linking ambiguous entity mentions in a text to their corresponding entities in a heterogeneous information network (HIN) is an important task. Most existing entity linking methods for HINs assume that the entities in a text are independent and ignore the relationships between entities in context. Recent studies have shown that collective entity linking methods are more effective than traditional independent entity linking methods because they consider the relationships between different entities in the same text. However, few studies focus on collective entity linking for HINs. Most collective entity linking methods rely heavily on Wikipedia-specific features and may not be suitable for HINs that are not mapped to Wikipedia. Moreover, existing collective entity linking methods may have high time complexity. We therefore propose a Coarse-to-Fine collective Entity Linking algorithm (CFEL) for the case where Wikipedia cannot be used. CFEL is composed of a coarse-grained model and a fine-grained model. In the coarse-grained model, a pruning strategy motivated by the human cognition mechanism is adopted to reduce the number of candidates for each entity mention in a text: candidates in the HIN whose type is inconsistent with the type of the entity mention can be removed. In the fine-grained model, we present a probabilistic method that combines the semantic information in the text with the structural information in the HIN. Experimental results on four real-world datasets verify the effectiveness of our algorithm compared to the baselines.
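
To make the coarse-grained step concrete, the following is a minimal Python sketch of type-based candidate pruning. It is an illustration only, not the CFEL implementation: the abstract does not say how a mention's type is determined, so the sketch assumes it is already known, and the names `Candidate` and `prune_by_type` as well as the example types are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A candidate entity in the HIN (hypothetical structure for illustration)."""
    entity_id: str
    entity_type: str


def prune_by_type(mention_type, candidates):
    """Keep only the candidates whose HIN type matches the mention's type.

    Assumption: mention_type has already been inferred from the text.
    """
    return [c for c in candidates if c.entity_type == mention_type]


# Example: a mention believed to refer to an author.
candidates = [
    Candidate("a1", "author"),
    Candidate("v1", "venue"),
    Candidate("a2", "author"),
]
kept = prune_by_type("author", candidates)
kept_ids = [c.entity_id for c in kept]
```

Only the candidates that survive this filter would be passed on to the fine-grained probabilistic model, so the more expensive scoring runs over a smaller candidate set.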
