Using name-internal and contextual features to classify biological terms

Manabu Torii, Sachin Kamboj, K Vijay-Shanker

doi:10.1016/j.jbi.2004.08.007

https://doi.org/10.1016/j.jbi.2004.08.007

Journal: Journal of Biomedical Informatics
Publication Date: Sep 25, 2004
Citations: 44
License type: elsevier-specific: oa user license

Affiliation: University of Delaware

Abstract

There has been considerable recent work on recognizing named entities in biomedical text. In this paper, we investigate the named entity classification task, an integral part of the named entity extraction task. We focus on the different sources of information that can be used for classification and note the extent to which each is effective. To classify a name, we consider features that appear within the name as well as in nearby phrases. We also develop a new strategy based on the context of occurrence and show that it improves the performance of the classification system. We show how our work relates to previous work on named entity classification in the biological domain as well as in generic domains. The experiments were conducted on the GENIA corpus Ver. 3.0, developed at the University of Tokyo. We achieve an F-value of 86 in 10-fold cross-validation on this corpus.
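The distinction between features inside a name and features from nearby words can be illustrated with a minimal sketch. The abstract does not specify the actual feature set, so the function name `extract_features`, the feature labels, the suffix lengths, and the window size below are all assumptions chosen for illustration, not the authors' implementation.

```python
from typing import Dict, List


def extract_features(tokens: List[str], start: int, end: int,
                     window: int = 2) -> Dict[str, int]:
    """Binary features for the name tokens[start:end] (hypothetical labels)."""
    name = tokens[start:end]
    features: Dict[str, int] = {}

    # Name-internal: every word in the name, plus the last word on its own.
    for word in name:
        features["in_name=" + word.lower()] = 1
    features["last_word=" + name[-1].lower()] = 1

    # Name-internal: short character suffixes of the last word (assumed lengths).
    for n in (3, 4):
        if len(name[-1]) >= n:
            features["suffix=" + name[-1][-n:].lower()] = 1

    # Contextual: words up to `window` positions to the left and right of the name.
    for offset in range(1, window + 1):
        left = start - offset
        right = end + offset - 1
        if left >= 0:
            features["left_%d=%s" % (offset, tokens[left].lower())] = 1
        if right < len(tokens):
            features["right_%d=%s" % (offset, tokens[right].lower())] = 1
    return features


# Example sentence invented for illustration; the name is "IL-2 receptor".
tokens = "Activation of the IL-2 receptor requires NF-kappa B".split()
features = extract_features(tokens, 3, 5)
```

A feature dictionary of this kind could be passed to any standard classifier; which learner and which features the paper actually uses is left to the full text.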

Full Text