Comparing Heuristic Rules and Masked Language Models for Entity Alignment in the Literature Domain

Dominique Piché, Ludovic Font, Michel Gagnon, Amal Zouaq

doi:10.1145/3606699

Abstract

The cultural world offers a staggering amount of rich and varied metadata on cultural heritage, accumulated by governmental, academic, and commercial players. However, the variety of institutions involved means that the data are stored in as many complex and often incompatible models and standards, which limits their availability to, and explorability by, the general public. The adoption of Linked Open Data technologies allows strong interlinking of these various databases as well as external connections with existing knowledge bases. However, because these databases often contain references to the same entities, the delicate issue of entity alignment becomes the central challenge, especially when unique global identifiers are absent or scarce. To tackle this issue, we explored two approaches, one based on a set of heuristic rules and one based on masked language models (MLMs). We compare these two approaches, as well as different variations of MLMs, including some models trained on a different language, and various levels of data cleaning and labeling. Our results show that heuristics are a solid approach, but that MLM-based entity alignment achieves better performance, is robust to the data format, and does not require any form of data preprocessing, which was not the case for the heuristic approach in our experiments.

Published in: Journal on Computing and Cultural Heritage
