Transfer Learning for Named Entity Recognition in Financial and Biomedical Documents

Sumam Francis, Jordy Van Landeghem, Marie-Francine Moens

doi:10.3390/info10080248

Abstract

Recent deep learning approaches have shown promising results for named entity recognition (NER). Training robust deep learning models generally assumes that a sufficient amount of high-quality annotated training data is available. However, in many real-world scenarios, labeled training data is scarce. In this paper we consider two use cases: generic entity extraction from financial documents and from biomedical documents. First, we have developed a character-based model for NER in financial documents and a word- and character-based model with attention for NER in biomedical documents. Further, we have analyzed how transfer learning addresses the problem of limited training data in a target domain. We demonstrate through experiments that NER models trained on labeled data from a source domain can be used as base models and then be fine-tuned with a small amount of labeled data to recognize different named entity classes in a target domain. There is also growing interest in language models as a way to improve NER when labeled data is limited. The current most successful language model is BERT. Because of its success in state-of-the-art models, we integrate BERT-based representations into our biomedical NER model along with word and character information. The results are compared with a state-of-the-art model applied to a benchmark biomedical corpus.

Highlights

Lack of sufficient annotated data often limits the applicability of deep learning (DL) models to real-life problems
Transfer learning setting C (TL-C): In Figure 2c we study the impact of using BERT embeddings along with character-level and word-character-level representations when transferring from one task to another
We have proposed a number of deep learning models for named entity recognition that rely on character-based representations

Summary

Introduction

Lack of sufficient annotated data often limits the applicability of deep learning (DL) models to real-life problems. In this work we focus on a generic named entity recognition (NER) system that uses the representation learning capability of deep neural networks. NER is a subtask of information extraction in which entity mentions in unstructured text are assigned semantic labels from a set of pre-defined categories. In this paper NER is evaluated on two main use cases: extraction of entity names from financial documents and from biomedical documents. The aim of the first use case is to develop a generic deep learning NER system that is capable of recognizing entities in business documents, including invoices, business forms and emails.
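
To make the labeling task concrete, the following minimal sketch shows how per-token labels can be turned into labeled entity mentions. It assumes the common BIO tagging scheme (B- begins a mention, I- continues it, O is outside any mention), which the text above does not specify. The function name bio_to_spans, the example tokens and the ORG and DATE categories are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (mention text, category) pairs.

    Assumption: tags look like "B-ORG", "I-ORG" or "O". An "I-" tag that
    does not continue an open mention of the same category is treated as
    outside any mention.
    """
    spans: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_label = None

    def close_mention() -> None:
        # Store the mention collected so far, if there is one.
        if current_tokens:
            spans.append((" ".join(current_tokens), current_label))

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new mention starts; finish the previous one first.
            close_mention()
            current_tokens = [token]
            current_label = tag[2:]
        elif tag.startswith("I-") and current_label == tag[2:]:
            # Continue the open mention of the same category.
            current_tokens.append(token)
        else:
            # "O" or an inconsistent "I-" tag ends any open mention.
            close_mention()
            current_tokens = []
            current_label = None

    close_mention()
    return spans


# Hypothetical business-document sentence with illustrative categories.
tokens = ["Invoice", "issued", "by", "Acme", "Corp", "on", "12", "March"]
tags = ["O", "O", "O", "B-ORG", "I-ORG", "O", "B-DATE", "I-DATE"]
print(bio_to_spans(tokens, tags))
```

In this sketch a sequence tagger would produce the tags list, and the grouping step recovers the entity mentions together with their categories.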
