Abstract

Significant growth in Electronic Health Records (EHR) over the last decade has provided an abundance of clinical text that is mostly unstructured and untapped. This huge amount of clinical text data has motivated the development of new information extraction and text mining techniques. Named Entity Recognition (NER) and Relationship Extraction (RE) are key components of information extraction tasks in the clinical domain. In this paper, we review the present status of clinical NER and RE techniques in detail by discussing the NLP models proposed for the two tasks, their performance, and the current challenges. Our comprehensive survey of clinical NER and RE encompasses current challenges, state-of-the-art practices, and future directions in information extraction from clinical text. This is the first attempt to discuss both of these interrelated topics together in the clinical context. We identified many research articles based on different approaches and examined applications of these tasks. We also discuss the evaluation metrics used in the literature to measure the effectiveness of these two NLP methods, as well as future research directions.

Highlights

  • The amount of text generated every day is increasing drastically in different domains such as health care, news articles, scientific literature, and social media

  • Out of the various text mining tasks and techniques, our goal in this paper is to review the current state of the art in clinical Named Entity Recognition (NER) and Relationship Extraction (RE) techniques

  • We found very limited work on NER and RE in the radiation oncology domain; however, there is a plethora of publications on using NER and RE on clinical text in general

Summary

Introduction

The amount of text generated every day is increasing drastically in different domains such as health care, news articles, scientific literature, and social media. Out of the various text mining tasks and techniques, our goal in this paper is to review the current state of the art in clinical Named Entity Recognition (NER) and Relationship Extraction (RE) techniques. Clinical NER is a natural language processing (NLP) method used for extracting important medical concepts and events, i.e., clinical named entities (NEs), from the data [4]. Many toolkits and applications have been introduced to address different NLP tasks in the clinical domain, including NER and RE. The WEKA Data Mining Software [5] first came into existence in the late nineties. It was updated several times over the years to include NLP systems for language identification, tokenization, sentence boundary detection, and named entity recognition. A high-level overview of machine learning, neural networks, and evaluation metrics is presented below before we review clinical NER- and RE-related tasks.

