Abstract

Coreference resolution has mostly been investigated within the scope of a single document, where end-to-end models have driven impressive progress in recent years. However, the more challenging task of cross-document (CD) coreference resolution has remained relatively under-explored, with the few recent models applied only to gold mentions. Here, we introduce the first end-to-end model for CD coreference resolution from raw text, which extends the prominent model for within-document coreference to the CD setting. Our model achieves competitive results for event and entity coreference resolution on gold mentions. More importantly, we set the first baseline results, on the standard ECB+ dataset, for CD coreference resolution over predicted mentions. Further, our model is simpler and more efficient than recent CD coreference resolution systems, while not using any external resources.

Highlights

  • Cross-document (CD) coreference resolution consists of identifying textual mentions across multiple documents that refer to the same concept

  • We address the inherently non-linear nature of the CD setting by combining the within-document (WD) coreference model with agglomerative clustering, which has been shown to be useful in CD models

  • While WD coreference systems typically disregard singletons when evaluating on raw text, CD coreference models do consider singletons when evaluating on gold mentions on ECB+
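
As a rough illustration of the agglomerative clustering step mentioned in the highlights, the sketch below greedily merges mention clusters based on pairwise coreference scores. It is a minimal example under stated assumptions, not the authors' implementation: the function name `agglomerative_cluster`, the `pair_score` dictionary, the average-linkage criterion, and the threshold of 0.5 are all hypothetical choices made for illustration.

```python
from itertools import combinations


def agglomerative_cluster(n_mentions, pair_score, threshold=0.5):
    """Greedy average-linkage clustering over pairwise coreference scores.

    pair_score maps (i, j) with i < j to a score in [0, 1]. It stands in for
    the output of a pairwise mention scorer (an assumption for this sketch);
    pairs that are missing from the dictionary count as 0.
    """
    clusters = [[i] for i in range(n_mentions)]  # every mention starts alone

    def score(i, j):
        return pair_score.get((min(i, j), max(i, j)), 0.0)

    def linkage(a, b):
        # Average pairwise score between the members of two clusters.
        return sum(score(i, j) for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        # Pick the most similar pair of clusters (x < y by construction).
        x, y = max(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        if linkage(clusters[x], clusters[y]) < threshold:
            break  # no remaining pair is similar enough to merge
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return [sorted(c) for c in clusters]


# Toy example: mentions 0, 1, 2 score highly with each other; mention 3 does not.
scores = {(0, 1): 0.9, (1, 2): 0.8, (0, 2): 0.7, (2, 3): 0.1}
clusters = agglomerative_cluster(4, scores)
print(clusters)  # [[0, 1, 2], [3]]
```

Note that mention 3 survives as a singleton cluster, which matters when singletons are included in the evaluation, as in the gold-mention setting on ECB+ described above.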


Summary

Introduction

Cross-document (CD) coreference resolution consists of identifying textual mentions across multiple documents that refer to the same concept. State-of-the-art models exhibit several shortcomings, such as operating on gold mentions or relying on external resources such as SRL or a paraphrase dataset (Shwartz et al., 2017), which prevent them from being applied in realistic settings. To address these limitations, we develop the first end-to-end CD coreference model, building upon a prominent within-document (WD) coreference model (Lee et al., 2017), which we extend with recent advances in transformer-based encoders. Based on this work, Meged et al. (2020) improved results on event coreference by leverag-

