Word sense disambiguation for event trigger word detection in biomedicine

David Martinez, Timothy Baldwin

doi:10.1186/1471-2105-12-s2-s4

Open Access

Journal: BMC Bioinformatics
Publication Date: Mar 29, 2011
Citations: 36
License type: cc-by

Affiliation: Data61, University of Melbourne

Abstract

This paper describes a method for detecting event trigger words in biomedical text based on a word sense disambiguation (WSD) approach. We first investigate the applicability of existing WSD techniques to trigger word disambiguation in the BioNLP 2009 shared task data, and find that we are able to outperform a traditional CRF-based approach for certain word types. On the basis of this finding, we combine the WSD approach with the CRF, and obtain significant improvements over the standalone CRF, gaining particularly in recall.

Highlights

In recent years, the biomedical text processing field has created many annotated resources to further the development of automatic text analysis methods through standardised evaluation.
This paper describes a method for detecting event trigger words in biomedical text based on a word sense disambiguation (WSD) approach.
As a baseline we present the Majority Class (MC) classifier, which assigns the most frequent class seen in the training data to all the test instances.
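
The Majority Class baseline mentioned above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names and the toy labels (`Gene_expression`, `Binding`, `No_event`) are assumptions made for the example, and the highlight does not say whether the majority class is computed per word or over the whole training set, so the sketch simply takes whatever list of training labels it is given.

```python
from collections import Counter


def train_majority_class(train_labels):
    """Return the most frequent label in the training data."""
    if not train_labels:
        raise ValueError("training data must not be empty")
    # most_common(1) returns a list with one (label, count) pair
    return Counter(train_labels).most_common(1)[0][0]


def predict_majority_class(majority_label, test_instances):
    """Assign the same label to every test instance, ignoring its content."""
    return [majority_label for _ in test_instances]


# Toy labels, invented for illustration only.
train_labels = ["Gene_expression", "No_event", "No_event", "Binding", "No_event"]
majority = train_majority_class(train_labels)
predictions = predict_majority_class(majority, ["context a", "context b", "context c"])
```

Because the baseline never looks at the test context, any classifier that uses context features should be expected to match or beat it.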

Summary

Introduction

The biomedical text processing field has created many annotated resources to further the development of automatic text analysis methods through standardised evaluation.

[...] the top ranked system, but the precision was much lower. Their performance for trigger word detection is not known.

The most successful approaches are supervised systems that build a separate model for each word type and part of speech (POS), learning only from contexts that include that word. The motivation behind this approach is the “one sense per collocation” heuristic [15], which observes that the meaning of a given word in a particular collocation tends to be invariant across all token occurrences.
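
The idea of one model per word type and POS can be sketched as follows. This is an illustrative sketch under stated assumptions, not the system described in the paper: the class name `WordExpertModels`, the bag-of-words naive Bayes classifier with add-one smoothing, and the toy training instances are all invented for the example. The only point taken from the text is that each (word, POS) pair gets its own model, trained solely on contexts containing that word.

```python
import math
from collections import Counter, defaultdict


class WordExpertModels:
    """One small bag-of-words naive Bayes classifier per (word, POS) key."""

    def __init__(self):
        # key -> label -> number of training instances
        self.label_counts = defaultdict(Counter)
        # key -> label -> counts of context tokens
        self.token_counts = defaultdict(lambda: defaultdict(Counter))
        # key -> set of context tokens seen for that key
        self.vocab = defaultdict(set)

    def train(self, instances):
        """instances: iterable of (word, pos, context_tokens, label)."""
        for word, pos, context, label in instances:
            key = (word, pos)
            self.label_counts[key][label] += 1
            self.token_counts[key][label].update(context)
            self.vocab[key].update(context)

    def predict(self, word, pos, context):
        """Return the best label, or None if no model exists for (word, pos)."""
        key = (word, pos)
        if key not in self.label_counts:
            return None
        labels = self.label_counts[key]
        total = sum(labels.values())
        vocab_size = len(self.vocab[key])
        best_label, best_score = None, -math.inf
        for label, count in labels.items():
            tokens = self.token_counts[key][label]
            # add-one smoothing, with one extra slot for unseen tokens
            denom = sum(tokens.values()) + vocab_size + 1
            score = math.log(count / total)
            for tok in context:
                score += math.log((tokens[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data, invented for illustration only.
models = WordExpertModels()
models.train([
    ("expression", "NN", ["gene", "of", "IL-2"], "Gene_expression"),
    ("expression", "NN", ["mRNA", "of", "gene"], "Gene_expression"),
    ("expression", "NN", ["facial", "of", "patients"], "No_event"),
    ("binding", "NN", ["to", "DNA", "site"], "Binding"),
])
label = models.predict("expression", "NN", ["gene", "IL-2"])
```

Keying the models on (word, POS) means that evidence from contexts of "binding" never influences decisions about "expression", which is where the "one sense per collocation" heuristic applies. The cost is that unseen word/POS pairs have no model at all, which the sketch signals by returning `None`.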

Results

Discussion

Conclusion