Abstract

Named Entity Recognition (NER) is a vital step in medical information extraction, especially from Electronic Health Records (EHRs). Accurate extraction of medical entities such as diseases and medications can automate EHR coding and considerably improve EHR filtering, resulting in better extraction of medical information. NER systems are generally trained and evaluated on relatively small standard datasets. However, when they are applied in real-world settings, they are exposed to different collections of texts that vary in topic, entity distribution, and text type (e.g. abstract vs. full text). This mismatch between the training data and the application data can cause a drop in performance and, consequently, unreliability. In this paper, we propose Med-Flair, an NER tagger that mainly covers two entity types: medications and diseases. Med-Flair is built mainly on the Flair NLP framework, extended with a Bidirectional Long Short-Term Memory (BiLSTM) network and a Conditional Random Field (CRF) layer for sequence tagging. To validate its performance, Med-Flair is tested on four benchmark datasets, two for medication entities and two for disease entities. Med-Flair achieves F1-scores of 92%, 88%, 92%, and 95% on these datasets, which are in most cases the highest when compared with state-of-the-art deep neural network architectures such as BioBERT, DTranNER, BERT, and BioNerFlair.
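To make the reported metric concrete, the following is a minimal sketch of how an entity-level F1-score can be computed for a sequence tagger. It is not taken from the paper: the BIO tagging scheme, the exact-match criterion, the label names `Disease` and `Medication`, and the helper names `bio_to_spans` and `entity_f1` are all assumptions made for illustration, and the authors' evaluation protocol may differ.

```python
from typing import List, Set, Tuple

# (start index, end index exclusive, entity type); hypothetical representation
Span = Tuple[int, int, str]


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Convert one sentence of BIO tags into a set of entity spans.

    Assumption: strict BIO, so an I- tag that does not continue an open
    span of the same type is ignored rather than starting a new entity.
    """
    spans: Set[Span] = set()
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel "O" closes a trailing span
        continues = tag.startswith("I-") and label is not None and tag[2:] == label
        if label is not None and not continues:
            spans.add((start, i, label))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return spans


def entity_f1(gold: List[List[str]], pred: List[List[str]]) -> float:
    """Micro-averaged exact-match F1 over all sentences."""
    gold_spans, pred_spans = set(), set()
    for idx, (g, p) in enumerate(zip(gold, pred)):
        # Prefix each span with the sentence index so spans from
        # different sentences never collide.
        gold_spans |= {(idx,) + s for s in bio_to_spans(g)}
        pred_spans |= {(idx,) + s for s in bio_to_spans(p)}
    true_pos = len(gold_spans & pred_spans)
    precision = true_pos / len(pred_spans) if pred_spans else 0.0
    recall = true_pos / len(gold_spans) if gold_spans else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy example: the tagger finds the disease mention but misses the medication.
gold = [["B-Disease", "I-Disease", "O", "B-Medication"]]
pred = [["B-Disease", "I-Disease", "O", "O"]]
score = entity_f1(gold, pred)
```

In this toy case precision is 1.0 and recall is 0.5, so the F1-score is 2/3. Under this exact-match assumption, an entity counts as correct only if both its boundaries and its type match the gold annotation.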
