Better entity matching with transformers through ensembles

Jwen Fai Low, Benjamin C.M. Fung, Pulei Xiong

doi:10.1016/j.knosys.2024.111678

Abstract

In this paper, we introduce AttendEM, a framework for entity matching (EM), i.e., the pairwise identification of duplicates across databases. Rather than focusing on text cleaning and training data augmentation, as other transformer-based EM solutions predominantly do, AttendEM enhances the base transformer architecture with intra-transformer ensembling of distinctively rearranged text, additional aggregator tokens, and extra self-attention. On the ER-Magellan benchmark datasets, AttendEM achieved higher F1 scores than state-of-the-art (SOTA) solutions in most cases. These SOTA solutions are Ditto (mean improvement of 0.21% over Ditto’s own reported results, 3.93% over DAEM’s Ditto replication, and 2.99% over HierGAT’s Ditto replication), DAEM (0.53%), and HierGAT (0.54%). AttendEM’s improvements over Ditto are comparable to those of the solutions that claimed to outperform Ditto, namely HierGAT (Yao et al., 2022), with 2.46% against AttendEM’s 2.99%, and DAEM (Huang et al., 2022), with 3.42% against AttendEM’s 3.93%, when each comparison is calculated using results from that solution’s own Ditto replication.
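
As a reading aid for the figures above, the following is a minimal sketch of how a mean F1 improvement across benchmark datasets could be computed. It assumes the improvement is the per-dataset difference in F1 (in percentage points) averaged over the datasets both systems report; the abstract does not state the exact formula, so this is an assumption. The function names, dataset names, and scores are hypothetical and are not taken from the paper.

```python
from statistics import mean


def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 as the harmonic mean of precision and recall, from pair-level counts."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def mean_improvement(ours: dict[str, float], baseline: dict[str, float]) -> float:
    """Average per-dataset F1 difference over the datasets both systems report.

    Assumption: scores are F1 values in percent, and "improvement" means the
    plain difference in percentage points, not a relative change.
    """
    shared = sorted(ours.keys() & baseline.keys())
    if not shared:
        raise ValueError("no datasets in common")
    return mean(ours[name] - baseline[name] for name in shared)


# Hypothetical F1 scores (in %), for illustration only; NOT from the paper.
ours = {"dataset_a": 90.0, "dataset_b": 80.0, "dataset_c": 70.0}
baseline = {"dataset_a": 88.0, "dataset_b": 81.0, "dataset_c": 68.0}

print(round(mean_improvement(ours, baseline), 2))
```

Under this reading, a system can lose on an individual dataset (here the hypothetical `dataset_b`) and still show a positive mean improvement, which is consistent with the abstract's "in most cases" wording.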

Journal: Knowledge-Based Systems
Publication Date: Mar 22, 2024
Citations: 1

Similar Papers

Deep entity matching with pre-trained language models
Yuliang Li ... Jinfeng Li
Proceedings of the VLDB Endowment | VOL. 14
01 Sep 2020

JointMatcher: Numerically-aware entity matching using pre-trained language models with attention concentration
Chen Ye ... Guojun Dai
Knowledge-Based Systems | VOL. 251
16 May 2022

Schema-Agnostic Entity Matching using Pre-trained Language Models
Kai-Sheng Teong ... Tin Tin Su
19 Oct 2020

Tailoring Entity Matching for Industrial Settings
Nils Barlaug
19 Oct 2020
