Abstract

Most existing methods for biomedical entity recognition rely on explicit feature engineering, where many features are either specific to a particular task or depend on the output of other natural language processing tools. Neural architectures have shown across various domains that the effort spent on explicit feature design can be reduced. In this work, we propose a unified framework using a bi-directional long short-term memory network (BLSTM) for named entity recognition (NER) tasks in the biomedical and clinical domains. Three important characteristics of the framework are as follows: (1) the model learns contextual as well as morphological features using two different BLSTMs in a hierarchy; (2) the model uses a first-order linear-chain conditional random field (CRF) as its output layer, cascaded after the BLSTM, to infer the label (tag) sequence; and (3) the model does not use any domain-specific features or dictionaries; in other words, the same set of features is used in all three NER tasks, namely disease name recognition (Disease NER), drug name recognition (Drug NER), and clinical entity recognition (Clinical NER). We compare the performance of the proposed model with existing state-of-the-art models on the standard benchmark datasets for the three tasks. We show empirically that the proposed framework outperforms all existing models. We analyze the importance of the CRF layer, of the different feature types, and of word embeddings obtained from character-based embeddings. The error analysis of the model indicates that a major proportion of errors is due to the difficulty of recognizing acronyms and nested forms of entity names.
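
To make characteristic (2) concrete, the following is a minimal sketch of how a first-order linear-chain CRF output layer can infer a tag sequence by Viterbi decoding. It is not the authors' implementation: the function name `viterbi_decode`, the three-tag scheme (O, B, I), and all scores are assumptions made for illustration. The per-token emission scores stand in for what a BLSTM would produce.

```python
from typing import List, Sequence


def viterbi_decode(emissions: Sequence[Sequence[float]],
                   transitions: Sequence[Sequence[float]]) -> List[int]:
    """Return the highest-scoring tag sequence under a first-order linear-chain CRF.

    emissions[t][j]   : score of tag j at token t (assumed to come from a BLSTM).
    transitions[i][j] : score of moving from tag i to tag j.
    """
    if not emissions:
        return []
    n_tags = len(emissions[0])

    # best[j] = score of the best path that ends in tag j at the current token.
    best = list(emissions[0])
    backpointers = []

    for t in range(1, len(emissions)):
        new_best, pointers = [], []
        for j in range(n_tags):
            # Pick the previous tag that maximizes path score + transition score.
            prev = max(range(n_tags), key=lambda i: best[i] + transitions[i][j])
            new_best.append(best[prev] + transitions[prev][j] + emissions[t][j])
            pointers.append(prev)
        best = new_best
        backpointers.append(pointers)

    # Follow the back-pointers from the best final tag to recover the path.
    last = max(range(n_tags), key=lambda j: best[j])
    path = [last]
    for pointers in reversed(backpointers):
        last = pointers[last]
        path.append(last)
    path.reverse()
    return path


# Hypothetical tag indices: 0 = O, 1 = B (entity start), 2 = I (entity continuation).
# The large negative score makes the invalid transition O -> I effectively impossible.
example_transitions = [
    [0.0, 0.0, -100.0],  # from O
    [0.0, 0.0, 0.0],     # from B
    [0.0, 0.0, 0.0],     # from I
]
# Made-up emission scores for a three-token sentence.
example_emissions = [
    [2.0, 1.0, 0.0],
    [0.0, 0.5, 1.0],
    [1.0, 0.0, 0.2],
]

if __name__ == "__main__":
    print(viterbi_decode(example_emissions, example_transitions))
```

In this toy input, choosing the best tag for each token independently would label the second token I directly after O, which is not a valid sequence. Decoding over the whole sequence with transition scores avoids that, which is one intuition for why a CRF layer is placed after the BLSTM.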
