Biomedical Mention Disambiguation using a Deep Learning Approach

Chih-Hsuan Wei, Robert Leaman, Zhiyong Lu, Kyubum Lee

doi:10.1145/3307339.3342162

Abstract

Automatically locating named entities in natural language text, a task known as named entity recognition, is important in the biomedical domain. However, many named entity mentions are ambiguous between several bioconcept types, so text spans are annotated with more than one type when multiple entity types are recognized simultaneously. The straightforward solution is a rule-based approach that applies a priority order based on the precision of each entity tagger (from highest to lowest). While this method is simple and useful, imprecise disambiguation remains a significant source of error. We address this issue by generating a partially labeled corpus of ambiguous concept mentions. We first collect named entity mentions from multiple human-curated databases (e.g., CTDbase, gene2pubmed), then correlate them with the text-mined spans from PubTator to provide the context in which each mention appears. Our corpus contains more than 3 million concept mentions that are ambiguous between multiple concept types in PubTator (approximately 3% of all mentions). We approach this task as a classification problem and develop a deep learning-based method that uses the semantics of the span being classified and the surrounding words to identify the most likely bioconcept type. More specifically, we develop a convolutional neural network (CNN) and a long short-term memory (LSTM) network to handle the semantic and syntactic features, respectively, then concatenate their outputs in a fully connected layer for final classification. The priority-ordering rule-based approach achieved F1-scores of 71.29% (micro-averaged) and 41.19% (macro-averaged), while the new disambiguation method achieved F1-scores of 91.94% (micro-averaged) and 85.42% (macro-averaged), a substantial increase.
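To make the baseline and the two reported averages concrete, the following is a minimal Python sketch of a priority-order resolver and of micro- versus macro-averaged F1 for single-label predictions. It is an illustration under stated assumptions, not the authors' implementation: the type names and their order in `PRIORITY` are hypothetical (the abstract does not give the actual order), and the function names `resolve_by_priority` and `micro_macro_f1` are invented for this example.

```python
from collections import defaultdict

# Hypothetical priority order, highest-precision tagger first.
# The abstract does not state the actual order that was used.
PRIORITY = ["Gene", "Chemical", "Disease", "Species"]


def resolve_by_priority(candidate_types, priority=PRIORITY):
    """Return the candidate type that appears earliest in the priority list."""
    rank = {t: i for i, t in enumerate(priority)}
    # Types missing from the list rank last.
    return min(candidate_types, key=lambda t: rank.get(t, len(priority)))


def micro_macro_f1(gold, pred):
    """Compute micro- and macro-averaged F1 for single-label predictions."""
    tp, fp, fn = defaultdict(int), defaultdict(int), defaultdict(int)
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted type gains a false positive
            fn[g] += 1  # gold type gains a false negative
    labels = set(gold) | set(pred)

    def f1(t, f_p, f_n):
        denom = 2 * t + f_p + f_n
        return 2 * t / denom if denom else 0.0

    # Micro: pool counts over all types, then compute one F1.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Macro: compute F1 per type, then take the unweighted mean.
    macro = sum(f1(tp[l], fp[l], fn[l]) for l in labels) / len(labels)
    return micro, macro


# Toy ambiguous mentions: (candidate types, gold type). Invented data.
mentions = [
    ({"Gene", "Chemical"}, "Gene"),
    ({"Gene", "Chemical"}, "Chemical"),
    ({"Disease", "Gene"}, "Gene"),
    ({"Chemical", "Disease"}, "Disease"),
]
gold = [g for _, g in mentions]
pred = [resolve_by_priority(c) for c, _ in mentions]
micro, macro = micro_macro_f1(gold, pred)
```

On this toy data the resolver is only ever right for the highest-priority type, so the macro average, which weights every type equally, falls well below the micro average. A gap of this kind is consistent with the difference between the baseline's micro- and macro-averaged scores reported above, although the sketch does not reproduce those figures.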
