Benchmark for Evaluation of Danish Clinical Word Embeddings

Martin Sundahl Laursen, Rasmus Søgaard Hansen, Thiusius Rajeeth Savarimuthu, Jannik Skyttegaard Pedersen, Pernille Just Vinholt

doi:10.3384/nejlt.2000-1533.2023.4132

Open Access

https://doi.org/10.3384/nejlt.2000-1533.2023.4132

Journal: Northern European Journal of Language Technology
Publication Date: Mar 1, 2023
Citations: 1
License type: CC BY 4.0

Affiliation: University of Southern Denmark, Odense University Hospital

Abstract

In natural language processing, benchmarks are used to track progress and identify useful models. Currently, no benchmark for Danish clinical word embeddings exists. This paper describes the development of a Danish benchmark for clinical word embeddings. The clinical benchmark consists of ten datasets: eight intrinsic and two extrinsic. Moreover, we evaluate word embeddings trained on text from the clinical domain, general practitioner domain and general domain on the established benchmark. All the intrinsic tasks of the benchmark are publicly available.

Full Text