Abstract

Background: Biomedical named entity recognition (BioNER) is a fundamental medical information extraction task that extracts medical entities with special meaning from medical texts. In recent years, deep learning has become the main research direction of BioNER because of its excellent data-driven context encoding ability. However, in the BioNER task, deep learning suffers from poor generalization and instability.

Results: We propose hierarchical shared transfer learning, which combines multi-task learning and fine-tuning and realizes multi-level information fusion between the underlying entity features and the upper data features. We selected 14 datasets containing 4 types of entities to train and evaluate the model. Compared with the single-task XLNet-CRF model, the F1-scores on the six gold-standard datasets BC5CDR-chemical, BC5CDR-disease, BC2GM, BC4CHEMD, NCBI-disease and LINNAEUS changed by 0.57, 0.90, 0.42, 0.77, 0.98 and −2.16, respectively. BC5CDR-chemical, BC5CDR-disease and BC4CHEMD achieved state-of-the-art results. The reasons why LINNAEUS's multi-task results are lower than its single-task results are discussed at the dataset level.

Conclusion: Compared with using multi-task learning or fine-tuning alone, the model recognizes medical entities more accurately and has higher generalization and stability.
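One common way to realize this kind of hierarchical sharing is a Transformer encoder shared across all datasets, with one task-specific CRF tagging head per dataset; the shared layers are trained jointly and then fine-tuned per task. The sketch below is a minimal illustration of that layout in PyTorch, assuming Hugging Face `transformers` for the XLNet encoder and the `pytorch-crf` package for the CRF layer; the class name, head layout and sizes are illustrative, not the authors' exact configuration.

```python
# Minimal sketch of a shared XLNet encoder with per-dataset CRF heads.
# Assumptions (not from the paper): PyTorch, Hugging Face `transformers`,
# `pytorch-crf`; right-padded inputs so mask[:, 0] is all ones.
import torch
import torch.nn as nn
from transformers import XLNetModel
from torchcrf import CRF


class SharedXLNetCRF(nn.Module):
    """Shared XLNet encoder feeding one CRF tagging head per dataset/task."""

    def __init__(self, task_label_counts, model_name="xlnet-base-cased"):
        super().__init__()
        # Shared (lower) layers: one XLNet encoder used by every task.
        self.encoder = XLNetModel.from_pretrained(model_name)
        hidden = self.encoder.config.hidden_size
        # Task-specific (upper) layers: emission projection + CRF per dataset.
        self.emitters = nn.ModuleDict(
            {task: nn.Linear(hidden, n) for task, n in task_label_counts.items()}
        )
        self.crfs = nn.ModuleDict(
            {task: CRF(n, batch_first=True) for task, n in task_label_counts.items()}
        )

    def forward(self, task, input_ids, attention_mask, labels=None):
        hidden = self.encoder(
            input_ids=input_ids, attention_mask=attention_mask
        ).last_hidden_state
        emissions = self.emitters[task](hidden)
        mask = attention_mask.bool()
        if labels is not None:
            # Negative log-likelihood of the gold tag sequence under the CRF.
            return -self.crfs[task](emissions, labels, mask=mask)
        # Viterbi decoding at inference time: best tag path per sentence.
        return self.crfs[task].decode(emissions, mask=mask)
```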

Highlights

  • Biomedical named entity recognition (BioNER) is a fundamental medical information extraction task that extracts medical entities with special meaning from medical texts

  • Mehmood et al. [5] proposed multi-task learning based on Convolutional Neural Networks (CNN) and Long Short-Term Memory networks (LSTM) to improve the generalization of the model, but the results were unstable and difficult to push beyond single-task learning based on Transformer models

  • Inspired by the work of Mehmood et al., we propose the multi-task learning (MTL)-LS model


Summary

Introduction

Biomedical named entity recognition (BioNER) is a basic task in biomedical information extraction that extracts entities of interest, such as diseases, drugs and genes/proteins, from complex, unstructured medical texts [2]. With the efforts of many researchers, more and more deep learning networks have been applied to BioNER, ranging from Convolutional Neural Networks (CNN) [3] and Long Short-Term Memory networks (LSTM) [4] to Transformer-based BERT language models. Mehmood et al. [5] proposed multi-task learning based on CNN and LSTM to improve the generalization of the model, but the results were unstable and difficult to push beyond single-task learning based on Transformer models. We propose hierarchical shared transfer learning, which combines multi-task learning with single-task learning, allowing the model to achieve high accuracy while improving its generalization and stability.
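The combination of multi-task and single-task learning described above can be read as a two-stage training regime: joint training over all datasets through the shared encoder, followed by fine-tuning on each dataset alone. The sketch below illustrates one such loop under the same assumptions as the model sketch above; the dataloader format, optimizer choice and hyperparameters are placeholders, not the paper's settings.

```python
# Hedged sketch of a two-stage regime: multi-task training, then per-task
# fine-tuning. `model` is a SharedXLNetCRF instance from the earlier sketch;
# `task_loaders` maps task names to PyTorch DataLoaders yielding dict batches.
from torch.optim import AdamW


def train_multitask(model, task_loaders, epochs=1, lr=2e-5):
    """Stage 1: cycle over all task dataloaders, updating the shared encoder."""
    opt = AdamW(model.parameters(), lr=lr)
    for _ in range(epochs):
        for task, loader in task_loaders.items():
            for batch in loader:
                loss = model(task, batch["input_ids"],
                             batch["attention_mask"], batch["labels"])
                opt.zero_grad()
                loss.backward()
                opt.step()


def finetune_single_task(model, task, loader, epochs=1, lr=2e-5):
    """Stage 2: continue training on one dataset only, starting from stage-1 weights."""
    opt = AdamW(model.parameters(), lr=lr)
    for _ in range(epochs):
        for batch in loader:
            loss = model(task, batch["input_ids"],
                         batch["attention_mask"], batch["labels"])
            opt.zero_grad()
            loss.backward()
            opt.step()
```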

