Knowledge-enhanced biomedical named entity recognition and normalization: application to proteins and genes

Huiwei Zhou, Zhe Liu, Chengkun Lang, Bizun Lei, Zhuang Liu, Shixian Ning

doi:10.1186/s12859-020-3375-3

Open Access

https://doi.org/10.1186/s12859-020-3375-3

Journal: BMC Bioinformatics	Publication Date: Jan 30, 2020
Citations: 19	License type: open-access

Affiliation: Dalian University of Technology

Abstract

Background: Automated biomedical named entity recognition and normalization serves as the basis for many downstream applications in information management. However, this task is challenging due to name variations and entity ambiguity. A biomedical entity may have multiple variants and a variant could denote several different entity identifiers.

Results: To remedy the above issues, we present a novel knowledge-enhanced system for protein/gene named entity recognition (PNER) and normalization (PNEN). On one hand, a large amount of entity name knowledge extracted from biomedical knowledge bases is used to recognize more entity variants. On the other hand, structural knowledge of entities is extracted and encoded as identifier (ID) embeddings, which are then used for better entity normalization. Moreover, deep contextualized word representations generated by pre-trained language models are also incorporated into our knowledge-enhanced system for modeling multi-sense information of entities. Experimental results on the BioCreative VI Bio-ID corpus show that our proposed knowledge-enhanced system achieves 0.871 F1-score for PNER and 0.445 F1-score for PNEN, respectively, leading to a new state-of-the-art performance.

Conclusions: We propose a knowledge-enhanced system that combines both entity knowledge and deep contextualized word representations. Comparison results show that entity knowledge is beneficial to the PNER and PNEN task and can be well combined with contextualized information in our system for further improvement.

Highlights

Automated biomedical named entity recognition and normalization serves as the basis for many downstream applications in information management
We propose a novel knowledge-enhanced system that could employ rich entity knowledge and deep contextual word representations for protein/gene named entity recognition (PNER) and normalization (PNEN)
Our experiments are conducted on the corpus published by BioCreative VI Bio-ID Track 1 [3], which is drawn from annotated figure panel captions from SourceData [24] and converted into BioC format along with the corresponding full-text articles

Summary

Introduction

Automated biomedical named entity recognition and normalization serves as the basis for many downstream applications in information management. This task is challenging due to name variations and entity ambiguity. With the rapid development of computer technology and biotechnology, the volume of biomedical literature is growing rapidly. New methods and tools need to be developed to support more effective and consistent extraction of biomedical entities and their IDs. For this purpose, BioCreative VI Track 1 proposed a challenging task (called Bio-ID Assignment), which focused on entity tagging and ID assignment [3]. The first subtask aimed at automatically recognizing biomedical entities and their types in text; the second subtask was to associate entity mentions in text with their corresponding common IDs in knowledge bases.
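The two difficulties named above, name variation and entity ambiguity, can be illustrated with a small dictionary lookup. The following is a minimal sketch and not the authors' system: the function names, the normalization rule, and the `TOY:` identifiers and entity names are all hypothetical and chosen only for illustration. It shows how several surface variants can collapse to one key, and how one key can still map to more than one candidate ID.

```python
from collections import defaultdict


def normalize_name(name: str) -> str:
    """Lowercase and drop non-alphanumeric characters (assumed rule)."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def build_lexicon(entries):
    """Map each normalized name variant to the set of IDs that use it.

    entries: iterable of (identifier, [name variants]) pairs.
    """
    lexicon = defaultdict(set)
    for identifier, names in entries:
        for name in names:
            lexicon[normalize_name(name)].add(identifier)
    return lexicon


def candidate_ids(mention: str, lexicon) -> list:
    """Return all candidate IDs for a mention, sorted; empty if unknown."""
    return sorted(lexicon.get(normalize_name(mention), set()))


# Hypothetical toy knowledge base: these are not real database identifiers.
TOY_ENTRIES = [
    ("TOY:0001", ["Protein Alpha", "PA-1", "pa1"]),  # several variants, one ID
    ("TOY:0002", ["Protein Beta", "PA1"]),           # shares the variant "PA1"
]
LEXICON = build_lexicon(TOY_ENTRIES)

if __name__ == "__main__":
    print(candidate_ids("PA 1", LEXICON))          # ambiguous: two candidates
    print(candidate_ids("protein beta", LEXICON))  # unambiguous: one candidate
```

A lookup like this only produces candidates. Choosing among them needs additional evidence, which is the role the abstract assigns to ID embeddings and contextualized word representations.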

Similar Papers

Joint Learning for Biomedical NER and Entity Normalization: Encoding Schemes, Counterfactual Examples, and Zero-Shot Evaluation.
Jiho Noh ... Ramakanth Kavuluru
ACM-BCB: ACM Conference on Bioinformatics, Computational Biology and Biomedicine | VOL. 2021
01 Aug 2021

A Hybrid Approach for French Medical Entity Recognition and Normalization
Allaouzi Imane ...
01 Jan 2018

BERN2: an advanced neural biomedical named entity recognition and normalization tool.
Mujeen Sung ... Donghyeon Kim
Bioinformatics | VOL. 38
02 Sep 2022

BioALBERT: A Simple and Effective Pre-trained Language Model for Biomedical Named Entity Recognition
Usman Naseem ... Vinay Reddy
18 Jul 2021
