Abstract

Background: In order to extract meaningful information from electronic medical records, such as signs and symptoms, diagnoses, and treatments, it is important to take into account the contextual properties of the identified information: negation, temporality, and experiencer. Most work on automatic identification of these contextual properties has been done on English clinical text. This study presents ContextD, an adaptation of the English ConText algorithm to the Dutch language, and a Dutch clinical corpus. We created a Dutch clinical corpus containing four types of anonymized clinical documents: entries from general practitioners, specialists’ letters, radiology reports, and discharge letters. Using a Dutch list of medical terms extracted from the Unified Medical Language System, we identified medical terms in the corpus with exact matching. The identified terms were annotated for negation, temporality, and experiencer properties. To adapt the ConText algorithm, we translated English trigger terms to Dutch and added several general and document-specific enhancements, such as negation rules for general practitioners’ entries and a regular-expression-based temporality module.

Results: The ContextD algorithm used 41 unique triggers to identify the contextual properties in the clinical corpus. For the negation property, the algorithm obtained an F-score from 87% to 93% for the different document types. For the experiencer property, the F-score was 99% to 100%. For the historical and hypothetical values of the temporality property, F-scores ranged from 26% to 54% and from 13% to 44%, respectively.

Conclusions: ContextD showed good performance in identifying negation and experiencer property values across all Dutch clinical document types. Accurate identification of the temporality property proved to be difficult and requires further work. The anonymized and annotated Dutch clinical corpus can serve as a useful resource for further algorithm development.

Electronic supplementary material: The online version of this article (doi:10.1186/s12859-014-0373-3) contains supplementary material, which is available to authorized users.

Highlights

  • The ConText algorithm [13] is based on the NegEx algorithm. Apart from identifying negations, it identifies whether a clinical condition is present, historical, or hypothetical, and whether the condition is experienced by the patient or by someone else, e.g., a family member

  • The Erasmus Medical Center (EMC) Dutch clinical corpus is annotated for three contextual properties: negation, temporality, and experiencer
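
The trigger-based idea behind ConText-style algorithms can be illustrated with a minimal sketch. This is not the ContextD implementation: the trigger words, the fixed five-token scope, the default labels ("affirmed", "present", "patient"), and the names `TRIGGERS`, `SCOPE`, `Context`, and `annotate` are all assumptions made for illustration, and the sketch only handles single-word terms preceded by a trigger.

```python
import re
from dataclasses import dataclass

# Illustrative triggers only; NOT the trigger list used by ContextD.
# Each trigger maps to the (property, value) it assigns to a term in its scope.
TRIGGERS = {
    "geen": ("negation", "negated"),                    # "no"
    "niet": ("negation", "negated"),                    # "not"
    "moeder": ("experiencer", "other"),                 # "mother"
    "vader": ("experiencer", "other"),                  # "father"
    "voorgeschiedenis": ("temporality", "historical"),  # "history"
}

# Hypothetical scope: a trigger affects terms up to SCOPE tokens to its right.
SCOPE = 5


@dataclass
class Context:
    # Default values apply when no trigger is found (labels are assumptions).
    negation: str = "affirmed"
    temporality: str = "present"
    experiencer: str = "patient"


def tokenize(sentence):
    """Lowercase the sentence and split it into word tokens."""
    return re.findall(r"\w+", sentence.lower())


def annotate(sentence, term):
    """Return the contextual properties of a single-word term in a sentence."""
    tokens = tokenize(sentence)
    term = term.lower()
    if term not in tokens:
        raise ValueError(f"term {term!r} not found in sentence")
    term_idx = tokens.index(term)
    ctx = Context()
    # Look only at triggers that precede the term and lie within the scope.
    for i, tok in enumerate(tokens[:term_idx]):
        if tok in TRIGGERS and term_idx - i <= SCOPE:
            prop, value = TRIGGERS[tok]
            setattr(ctx, prop, value)
    return ctx


print(annotate("Patiënt heeft geen koorts", "koorts"))
print(annotate("Moeder heeft diabetes", "diabetes"))
```

A real system would also need triggers that follow the term, termination words that close a trigger's scope, and multi-word terms, none of which this sketch attempts.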

Summary

Introduction

In order to extract meaningful information from electronic medical records, such as signs and symptoms, diagnoses, and treatments, it is important to take into account the contextual properties of the identified information: negation, temporality, and experiencer. To adapt the ConText algorithm, we translated English trigger terms to Dutch and added several general and document-specific enhancements, such as negation rules for general practitioners’ entries and a regular-expression-based temporality module. Recent years have seen an increase in the use of electronic medical records (EMRs) by healthcare providers [1]. These records contain patient-related information such as signs, (patient-reported) symptoms, diagnoses, treatments, and tests. The system was evaluated on discharge summaries, where it achieved a precision of 84.5% and a recall of 77.8%. Another system, called NegFinder [10], used grammatical parsing and regular expressions to identify negated patterns occurring in medical narratives, achieving a specificity of 97.7% and a sensitivity (or recall) of 95.3% on discharge summaries and surgical notes. The system achieved a precision of 85% and a recall of 86% on clinical notes.
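
To compare the precision and recall figures quoted above with the F-scores reported for ContextD, the two can be combined into a single number. The sketch below assumes the balanced F-score (F1, the harmonic mean of precision and recall); the source does not state which F-score variant was used, and the resulting values are derived here rather than reported in the text. The function name `f_score` is hypothetical.

```python
def f_score(precision, recall):
    """Balanced F-score (F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision/recall pairs quoted in the summary above (as fractions).
print(f"{f_score(0.845, 0.778):.3f}")  # 0.810
print(f"{f_score(0.85, 0.86):.3f}")    # 0.855
```

The NegFinder figures cannot be converted the same way, because specificity and sensitivity alone do not determine precision without knowing how many instances are negated.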
