Abstract

Pierpaolo Basile, Annalina Caputo, Tommaso Caselli, Pierluigi Cassotti, Rossella Varvara. Proceedings of the 2nd International Workshop on Computational Approaches to Historical Language Change 2021. 2021.

Highlights

  • Natural languages are de facto living entities always subject to change and evolution

  • The Natural Language Processing (NLP) community has developed an interest in historical linguistics, and in particular in the study of lexical semantic change (LSC)

  • This work scrutinises the robustness of the LSCs detected by a common algorithm across different corpora


Summary

Introduction

Natural languages are de facto living entities, always subject to change and evolution. Distributional models are powerful, yet they suffer from some limitations: (i) they require large amounts of text; (ii) they are sensitive to the type of texts and to the distribution (i.e., frequency) of the lexical items; and (iii) they tend to conflate different types of information and variables, such as semantic, social, and topical information. This contribution investigates two closely connected aspects: the reliability of LSC benchmark data, and the sensitivity of a state-of-the-art approach for LSC, grounded on the distributional hypothesis, to a change of source corpus. The results of our work will help to shed light on systems’ robustness and stability by verifying whether methods tuned on one corpus can be directly applied to another.
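To make the idea of a distributional LSC detector concrete, the following is a minimal sketch and not the approach evaluated in this work. It assumes each word has one vector per time period, that the two vector spaces are already aligned, and that a word counts as changed when the cosine similarity between its two vectors falls below a threshold. The vectors, the threshold value, and the names `cosine_similarity` and `changed_words` are all hypothetical and chosen only for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def changed_words(space_t1, space_t2, threshold):
    """Words present in both periods whose similarity is below the threshold."""
    shared = space_t1.keys() & space_t2.keys()
    return sorted(
        w for w in shared
        if cosine_similarity(space_t1[w], space_t2[w]) < threshold
    )


# Toy, made-up vectors for one corpus at two time periods
# (assumed to be already aligned to a common space).
corpus_a_t1 = {"stable": [1.0, 0.0], "shifted": [1.0, 0.0]}
corpus_a_t2 = {"stable": [0.9, 0.1], "shifted": [0.0, 1.0]}

# Hypothetical threshold, imagined as tuned on this corpus.
THRESHOLD = 0.5

print(changed_words(corpus_a_t1, corpus_a_t2, THRESHOLD))
```

In this setting, the robustness question raised above amounts to asking whether the same threshold, applied to vectors built from a different corpus covering the same periods, flags the same words.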

Methodology
Testing for Robustness and Independence
Models into the Wild
Conclusion and Future Work
Findings
B Cosine similarities