Studying the correlation between different word sense disambiguation methods and summarization effectiveness in biomedical texts

Laura Plaza, Alberto Díaz, Alan R Aronson, Antonio J Jimeno-Yepes

doi:10.1186/1471-2105-12-355

Abstract

Background

Word sense disambiguation (WSD) attempts to solve lexical ambiguities by identifying the correct meaning of a word based on its context. WSD has been demonstrated to be an important step in knowledge-based approaches to automatic summarization. However, the correlation between the accuracy of the WSD methods and the summarization performance has never been studied.

Results

We present three existing knowledge-based WSD approaches and a graph-based summarizer. Both the WSD approaches and the summarizer employ the Unified Medical Language System (UMLS) Metathesaurus as the knowledge source. We first evaluate WSD directly, by comparing the predictions of the WSD methods to two reference sets: the NLM WSD dataset and the MSH WSD collection. We next apply the different WSD methods as part of the summarizer, to map documents onto concepts in the UMLS Metathesaurus, and evaluate the summaries that are generated. The results obtained by the different methods in both evaluations are studied and compared.

Conclusions

It has been found that the use of WSD techniques has a positive impact on the results of our graph-based summarizer, and that, when both the WSD and summarization tasks are assessed over large and homogeneous evaluation collections, there exists a correlation between the overall results of the WSD and summarization tasks. Furthermore, the best WSD algorithm in the first task tends to be also the best one in the second. However, we also found that the improvement achieved by the summarizer is not directly correlated with the WSD performance. The most likely reason is that the errors in disambiguation are not equally important but depend on the relative salience of the different concepts in the document to be summarized.

Highlights

Word sense disambiguation (WSD) attempts to solve lexical ambiguities by identifying the correct meaning of a word based on its context
Among the unsupervised WSD methods we find journal descriptor indexing (JDI) [22], disambiguation based on concept profiles [23], disambiguation based on context examples collected automatically [24] and graph-based approaches [25]
It must be noted that the Journal Descriptor Indexing (JDI) algorithm performs well with the NLM WSD subset, where all candidate senses of the ambiguous terms are assigned different semantic types, so that JDI is able to distinguish between possible senses

Summary

Introduction

Word sense disambiguation (WSD) is an open problem of natural language processing (NLP) aimed at resolving lexical ambiguities by identifying the correct meaning of a word based on its context. A word is ambiguous when it has more than one sense (e.g. the word “cold”, when used as a noun, may refer either to a respiratory disorder or to the absence of heat). It is the context in which the word is used that determines its correct meaning. WSD has been demonstrated to be an important step in knowledge-based approaches to automatic summarization [10]. As stated by Shooshan et al. [14], the UMLS Metathesaurus contains a significant amount of ambiguity, and selecting the wrong mapping may bias the selection of salient information towards sentences containing the wrong concepts, while discarding sentences containing the right ones.
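To make the “cold” example concrete, the following is a minimal sketch of context-based disambiguation using a simplified gloss-overlap (Lesk-style) heuristic. It is not one of the WSD methods evaluated in the paper, and it does not query the UMLS Metathesaurus: the sense labels, the glosses in `SENSES`, the stopword list and the function names are all assumptions made up for illustration.

```python
import re

# Hypothetical sense inventory: labels and glosses are illustrative only
# and are NOT taken from the UMLS Metathesaurus.
SENSES = {
    "cold": {
        "common_cold": "respiratory disorder viral infection with cough sore throat and runny nose",
        "cold_temperature": "absence of heat low temperature weather freezing",
    }
}

# Small illustrative stopword list (an assumption, not a standard resource).
STOPWORDS = {"a", "an", "and", "the", "of", "with", "to", "in", "for", "was", "is", "had", "at"}


def tokenize(text):
    """Lowercase the text and return its set of non-stopword alphabetic tokens."""
    return {t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS}


def disambiguate(word, context):
    """Pick the sense whose gloss shares the most tokens with the context.

    Returns the best sense label and the overlap score of every candidate.
    Ties are resolved in favour of the sense listed first.
    """
    context_tokens = tokenize(context) - {word}
    scores = {
        sense: len(context_tokens & tokenize(gloss))
        for sense, gloss in SENSES[word].items()
    }
    best = max(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    print(disambiguate("cold", "The patient had a cold with cough and a sore throat"))
    print(disambiguate("cold", "Samples were stored at low temperature in cold rooms"))
```

In this toy setting the first sentence shares "cough", "sore" and "throat" with the respiratory gloss, while the second shares "low" and "temperature" with the absence-of-heat gloss, so the surrounding words alone select different senses of the same noun.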



Journal: BMC Bioinformatics
Publication Date: Aug 26, 2011
Citations: 52
License type: CC BY 2.0


Similar Papers

The impact of learning Unified Medical Language System knowledge embeddings in relation extraction from biomedical texts
Maxwell A Weinzierl ... Sanda M Harabagiu
Journal of the American Medical Informatics Association | VOL. 27
01 Oct 2020

Graph-based Word Sense Disambiguation of biomedical documents
Eneko Agirre ... Mark Stevenson
Bioinformatics | VOL. 26
07 Oct 2010

Exploiting MeSH indexing in MEDLINE to generate a data set for word sense disambiguation
Antonio J Jimeno-Yepes ... Bridget T McInnes
BMC Bioinformatics | VOL. 12
02 Jun 2011

Word Sense Disambiguation
Pushpak Bhattacharyya ... Mitesh Khapra
01 Jan 2013
