A comparative analysis of Spanish Clinical encoder-based models on NER and classification tasks.

Guillem García Subies, Álvaro Barbero Jiménez, Paloma Martínez Fernández

doi:10.1093/jamia/ocae054

Abstract

This comparative analysis aims to assess the efficacy of encoder language models for clinical tasks in the Spanish language. The primary goal is to identify the most effective resources within this context.

This study highlights a critical gap in NLP resources for the Spanish language, particularly in the clinical sector. Given the vast number of Spanish speakers globally and the increasing reliance on electronic health records, developing effective Spanish language models is crucial for both clinical research and healthcare delivery. Our work underscores the urgent need for specialized encoder models in Spanish that can handle clinical data with high accuracy, thus paving the way for advancements in healthcare services and biomedical research for Spanish-speaking populations.

We examined 17 distinct corpora with a focus on clinical tasks. Our evaluation centered on Spanish language models and Spanish clinical language models (both encoder-based). To ascertain performance, we meticulously benchmarked these models across a curated subset of the corpora. This extensive study involved fine-tuning over 3000 models.

Our analysis revealed that the best models are not clinical models, but general-purpose models. Also, the biggest models are not always the best ones. The best-performing model, RigoBERTa 2, obtained an average F1 score of 0.880 across all tasks.

Our study demonstrates the advantages of dedicated encoder-based Spanish clinical language models over generative models. However, the scarcity of diverse corpora, mostly focused on NER tasks, underscores the need for further research. The limited availability of high-performing models emphasizes the urgency for development in this area.

Through systematic evaluation, we identified the current landscape of encoder language models for clinical tasks in the Spanish language. While challenges remain, the availability of curated corpora and models offers a foundation for advancing Spanish clinical language models. Future efforts in refining these models are essential to elevate their effectiveness in clinical NLP.

Journal: Journal of the American Medical Informatics Association : JAMIA
Publication Date: Mar 15, 2024
Citations: 1

