Abstract

Recent advances in deep transformer models have achieved state-of-the-art results on several natural language processing (NLP) tasks, whereas named entity recognition (NER) has traditionally benefited from long short-term memory (LSTM) networks. In this work, we present T2NER, a Transformers-based Transfer Learning framework for Named Entity Recognition implemented in PyTorch. The framework is built upon the Transformers library as its core modeling engine and supports several transfer learning scenarios, from sequential transfer to domain adaptation, multi-task learning, and semi-supervised learning. It aims to bridge the gap between algorithmic advances in these areas and the state-of-the-art in transformer models, providing a unified platform that is readily extensible and can be used both for transfer learning research in NER and for real-world applications. The framework is available at: https://github.com/suamin/t2ner.
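
Concretely, using the Transformers library as the modeling engine means casting NER as token classification on top of a pre-trained transformer. The following is a minimal sketch of that backbone written directly against the Transformers library; it is not T2NER's actual API, and the checkpoint name and tag set are placeholders chosen for illustration.

import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

# Placeholder checkpoint and tag set; a framework like T2NER would configure
# these per experiment. The classification head is randomly initialized here
# and only becomes useful after fine-tuning on annotated NER data.
model_name = "bert-base-cased"
labels = ["O", "B-PER", "I-PER", "B-ORG", "I-ORG", "B-LOC", "I-LOC"]

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForTokenClassification.from_pretrained(
    model_name, num_labels=len(labels)
)

# Pre-tokenized input, as is typical for NER corpora (one tag per word).
words = ["John", "lives", "in", "Berlin"]
encoding = tokenizer(words, is_split_into_words=True, return_tensors="pt")

with torch.no_grad():
    logits = model(**encoding).logits  # shape: (1, sequence_length, num_labels)

# Map each subword prediction back to its source word via the tokenizer's
# word alignment; special tokens ([CLS], [SEP]) have no word id and are skipped.
pred_ids = logits.argmax(dim=-1)[0].tolist()
for idx, word_id in enumerate(encoding.word_ids()):
    if word_id is not None:
        print(words[word_id], labels[pred_ids[idx]])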

Highlights

  • Named entity recognition (NER) is an important task in information extraction, benefiting downstream applications such as entity linking (Cucerzan, 2007), relation extraction (Culotta and Sorensen, 2004), and question answering (Krishnamurthy and Mitchell, 2015).

  • We present an adaptable and user-friendly development framework to support the growing research in transfer learning with deep transformer models for named entity recognition (NER), including underexplored areas such as semi-supervised learning.

  • In this work we presented a transformer-based framework for transfer learning research in named entity recognition (NER).

Introduction

Named entity recognition (NER) is an important task in information extraction, benefiting downstream applications such as entity linking (Cucerzan, 2007), relation extraction (Culotta and Sorensen, 2004), and question answering (Krishnamurthy and Mitchell, 2015). NER models have shown relatively high variance even when trained on the same data (Reimers and Gurevych, 2017). They generalize poorly when tested on data from different domains and languages, and even more so when that data contains unseen entity mentions (Augenstein et al., 2017; Agarwal et al., 2020; Wang et al., 2020). Recent successes in transfer learning have mainly come from pre-trained language models (Devlin et al., 2019; Radford et al., 2019) with contextualized word embeddings based on deep transformer models (Vaswani et al., 2017). These models achieve state-of-the-art results on several NLP tasks such as named entity recognition, document classification, and question answering.
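
Transferring such a pre-trained model to NER amounts to adding a token-classification head and minimizing a per-token cross-entropy loss, with the word-level gold tags aligned to the model's subword tokens. The sketch below shows one such fine-tuning step with the Transformers library as an illustration; the checkpoint, the tag ids, and the convention of masking special tokens and non-initial subwords with -100 follow common practice and the library's defaults, not T2NER's specific implementation.

import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

model_name = "bert-base-cased"  # placeholder checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForTokenClassification.from_pretrained(model_name, num_labels=5)
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)

# One annotated sentence: a word-level tag id per word (hypothetical scheme).
words = ["Angela", "Merkel", "visited", "Paris"]
word_tags = [1, 2, 0, 3]  # e.g. B-PER, I-PER, O, B-LOC

encoding = tokenizer(words, is_split_into_words=True, return_tensors="pt")

# Align word-level tags with subword tokens; -100 is ignored by the loss.
labels = []
previous = None
for word_id in encoding.word_ids():
    if word_id is None or word_id == previous:
        labels.append(-100)  # special tokens and non-initial subword pieces
    else:
        labels.append(word_tags[word_id])
    previous = word_id
encoding["labels"] = torch.tensor([labels])

# Standard fine-tuning step: with labels provided, the model returns the loss.
loss = model(**encoding).loss
loss.backward()
optimizer.step()
optimizer.zero_grad()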

Design Principles
Data Sources
Data Readers
Models
Criterions
Auxiliary Tasks
Trainers
Conclusion and Future Work