A clinical trials corpus annotated with UMLS entities to enhance the access to evidence-based medicine

Leonardo Campillos-Llanos, Adrián Capllonch-Carrión, Ana Valverde-Mateos, Antonio Moreno-Sandoval

doi:10.1186/s12911-021-01395-z

Open Access

https://doi.org/10.1186/s12911-021-01395-z

Abstract

Background

The large volume of medical literature makes it difficult for healthcare professionals to keep abreast of the latest studies that support Evidence-Based Medicine. Natural language processing enhances access to relevant information, and gold standard corpora are required to improve systems. To contribute a new dataset for this domain, we collected the Clinical Trials for Evidence-Based Medicine in Spanish (CT-EBM-SP) corpus.

Methods

We annotated 1200 texts about clinical trials with entities from the Unified Medical Language System semantic groups: anatomy (ANAT), pharmacological and chemical substances (CHEM), pathologies (DISO), and lab tests, diagnostic or therapeutic procedures (PROC). We doubly annotated 10% of the corpus and measured inter-annotator agreement (IAA) using F-measure. As a use case, we ran medical entity recognition experiments with neural network models.

Results

This resource contains 500 abstracts of journal articles about clinical trials and 700 announcements of trial protocols (292 173 tokens). We annotated 46 699 entities (13.98% are nested entities). Regarding IAA, we obtained an average F-measure of 85.65% (±4.79, strict match) and 93.94% (±3.31, relaxed match). In the use case experiments, we achieved recognition results ranging from 80.28% (±0.99) to 86.74% (±0.19) of average F-measure.

Conclusions

Our results show that this resource is adequate for experiments with state-of-the-art approaches to biomedical named entity recognition. It is freely distributed at: http://www.lllf.uam.es/ESP/nlpmedterm_en.html. The methods are generalizable to other languages with similar available sources.
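The difference between strict and relaxed agreement can be illustrated with a minimal sketch. This page does not give the exact matching criteria, so the definitions here are assumptions: a strict match requires identical offsets and the same semantic group, and a relaxed match requires the same semantic group with any character overlap. The `Entity` tuple layout, the function names, and the sample offsets are hypothetical and are not taken from the corpus.

```python
from typing import List, Tuple

# Hypothetical entity representation: (start offset, end offset exclusive, semantic group)
Entity = Tuple[int, int, str]


def strict_match(a: Entity, b: Entity) -> bool:
    """Assumed strict criterion: same boundaries and same semantic group."""
    return a == b


def relaxed_match(a: Entity, b: Entity) -> bool:
    """Assumed relaxed criterion: same semantic group and overlapping spans."""
    return a[2] == b[2] and a[0] < b[1] and b[0] < a[1]


def f_measure(ann_a: List[Entity], ann_b: List[Entity], relaxed: bool = False) -> float:
    """F-measure between two annotators, treating ann_a as the reference."""
    if not ann_a or not ann_b:
        return 0.0
    match = relaxed_match if relaxed else strict_match
    # Precision: share of annotator B's entities that match some entity of A.
    matched_b = sum(any(match(a, b) for a in ann_a) for b in ann_b)
    # Recall: share of annotator A's entities that match some entity of B.
    matched_a = sum(any(match(a, b) for b in ann_b) for a in ann_a)
    precision = matched_b / len(ann_b)
    recall = matched_a / len(ann_a)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Invented annotations of the same text by two annotators.
ann_a = [(0, 9, "CHEM"), (24, 41, "DISO"), (50, 58, "PROC")]
ann_b = [(0, 9, "CHEM"), (24, 36, "DISO"), (60, 65, "ANAT")]

print(f"strict:  {f_measure(ann_a, ann_b):.4f}")
print(f"relaxed: {f_measure(ann_a, ann_b, relaxed=True):.4f}")
```

In this toy example the two annotators disagree on the right boundary of the DISO entity, so it counts as a match only under the relaxed criterion. That is the kind of boundary disagreement that would make a relaxed score higher than a strict one.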

Highlights

The large volume of medical literature makes it difficult for healthcare professionals to keep abreast of the latest studies that support Evidence-Based Medicine.
The methods are generalizable to other languages with similar available sources.
Use case: To determine the validity of the Clinical Trials for Evidence-Based Medicine in Spanish (CT-EBM-SP) corpus and present a real use case, we report experiments using this resource in the context of a supervised named entity recognition (NER) task.

Summary

Introduction

The large volume of medical literature makes it difficult for healthcare professionals to keep abreast of the latest studies that support Evidence-Based Medicine. Access to specific types of interventions could be faster if professionals could customize their search and restrict it to chosen semantic classes. This could help them infer relations between interventions that are potentially related or that achieve the desired outcome, a task that otherwise requires perusing a (frequently) large number of evidence sources. Enriching these texts with semantic annotations could therefore improve access to otherwise hidden information.

Journal: BMC Medical Informatics and Decision Making
Publication Date: Feb 22, 2021
Citations: 28
License type: open-access
