Advancing Italian biomedical information extraction with transformers-based models: Methodological insights and multicenter practical application

Claudio Crema, Tommaso Mario Buonocore, Silvia Fostinelli, Enea Parimbelli, Federico Verde, Cira Fundarò, Marina Manera, Matteo Cotta Ramusino, Marco Capelli, Alfredo Costa, Giuliano Binetti, Riccardo Bellazzi, Alberto Redolfi

doi:10.1016/j.jbi.2023.104557

Abstract

The introduction of computerized medical records in hospitals has reduced burdensome activities such as manual writing and information fetching. However, the data contained in medical records remain largely underutilized, primarily because extracting data from unstructured textual medical records takes time and effort. Information Extraction, a subfield of Natural Language Processing, can help clinical practitioners overcome this limitation through automated text-mining pipelines. In this work, we created the first Italian neuropsychiatric Named Entity Recognition dataset, PsyNIT, and used it to develop a Transformer-based model. Moreover, we collected and leveraged three external independent datasets to implement an effective multicenter model, with an overall F1-score of 84.77%, Precision of 83.16%, and Recall of 86.44%. The lessons learned are: (i) the crucial role of a consistent annotation process and (ii) a fine-tuning strategy that combines classical methods with a “low-resource” approach. This allowed us to establish methodological guidelines that pave the way for Natural Language Processing studies in less-resourced languages.
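As a quick consistency check on the reported metrics, the sketch below recomputes the F1-score from the precision and recall stated in the abstract. It assumes the overall F1 is the harmonic mean of the overall precision and recall; the abstract does not say how the overall figures were aggregated across entity types or centers, so this is an illustration rather than the authors' evaluation code. The function and variable names are hypothetical.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values as reported in the abstract, in percent.
reported_precision = 83.16
reported_recall = 86.44
reported_f1 = 84.77

# Assumption: overall F1 is derived directly from overall precision and recall.
computed_f1 = f1_score(reported_precision, reported_recall)
print(f"computed F1 = {computed_f1:.2f}%")  # prints "computed F1 = 84.77%"
```

Under that assumption the three reported figures agree to two decimal places.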

Journal: Journal of Biomedical Informatics
Publication Date: Nov 25, 2023
Citations: 2
License type: cc-by-nc-nd
