Abstract

Word sense disambiguation (WSD), the task of resolving the intended meaning of an ambiguous word in context, is considered an AI-complete problem. Language has a complex structure and is highly ambiguous, with deep-rooted relations among its components, specifically words, sentences and paragraphs. Human beings nevertheless comprehend and resolve the intended meanings of ambiguous words easily. For machines, ambiguity makes it difficult to build a highly accurate machine translation or information retrieval system. A number of algorithms have been devised to resolve ambiguity, but their success rate is very limited. Context probably plays a decisive role in human judgment when deciphering the meaning of polysemous words. A significant number of psychological models have been proposed to emulate the way human beings understand the meaning of words, sentences or text depending on the context. The pertinent question that researchers want to address is how human beings represent meanings in memory and whether that representation can be simulated with a computational model. Latent Semantic Analysis (LSA) is a mathematical technique that represents meanings as vectors in a space that closely approximates the human semantic space. By comparing vectors in the LSA-generated semantic space, the closest neighbours of a word vector can be derived, which indirectly provide a lot of information about the word. However, LSA does not provide a complete theory of meaning, which is why psychological process models are combined with LSA to make the theory of meaning concrete. The predication algorithm combined with LSA, proposed by Kintsch (2001), was sufficient to capture various word senses and was successful in homonym disambiguation. A word, and a verb in particular, may have multiple senses; for example, the verb “run” has 42 senses in WordNet. Finding the correct sense of a verb is therefore a daunting task, and work on resolving verb ambiguity with psycholinguistic models is very limited. The proposed method exploits the high-dimensional LSA vector space built from training samples, applying the predication algorithm to derive the most appropriate semantic neighbours of the target polysemous verb. Finally, the vectors of the test samples are compared with those semantic neighbours from the training samples to classify the senses of polysemous words accurately.
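
The LSA and predication steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the six-document toy corpus, the function names (`cosine`, `neighbours`, `predication`) and the parameters `dims`, `m` and `k` are assumptions made for illustration, and the predication step is a simplified reading of the idea (take the nearest neighbours of the verb, keep those most related to the argument noun, and add their vectors to the verb and noun vectors).

```python
import numpy as np

# Hypothetical toy corpus: three topics that all share the verb "run".
docs = [
    "horse run race gallop",
    "horse gallop race fast",
    "run program computer code",
    "program computer code software",
    "run company business manage",
    "company business manage office",
]

vocab = sorted({w for d in docs for w in d.split()})
index = {w: i for i, w in enumerate(vocab)}

# Term-by-document count matrix.
X = np.zeros((len(vocab), len(docs)))
for j, d in enumerate(docs):
    for w in d.split():
        X[index[w], j] += 1.0

# LSA: truncated SVD keeps the `dims` strongest latent dimensions.
dims = 3  # assumption; real LSA spaces use a few hundred dimensions
U, s, _ = np.linalg.svd(X, full_matrices=False)
T = U[:, :dims] * s[:dims]  # one row vector per word

def cosine(a, b):
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def neighbours(vec, n, exclude=()):
    """Return the n words whose vectors are closest to vec by cosine."""
    scored = [(w, cosine(vec, T[index[w]])) for w in vocab if w not in exclude]
    return sorted(scored, key=lambda pair: -pair[1])[:n]

def predication(predicate, argument, m=8, k=2):
    """Simplified predication: contextualise `predicate` by `argument`."""
    p, a = T[index[predicate]], T[index[argument]]
    # Step 1: the m nearest neighbours of the predicate.
    candidates = [w for w, _ in neighbours(p, m, exclude={predicate, argument})]
    # Step 2: keep the k candidates most related to the argument.
    chosen = sorted(candidates, key=lambda w: -cosine(a, T[index[w]]))[:k]
    # Step 3: combine predicate, argument and the chosen neighbours.
    vec = p + a + np.sum([T[index[w]] for w in chosen], axis=0)
    return vec, chosen

vec, chosen = predication("run", "horse")
print(chosen)
print(neighbours(vec, 4, exclude={"run", "horse"}))
```

In this toy space the vector for “run” on its own sits about equally close to the horse, program and company topics; after predication with “horse” the combined vector moves toward the horse-related words. A real experiment would need a large training corpus and tuned values of `dims`, `m` and `k`.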

Highlights

  • Although research on Word Sense Disambiguation (WSD) has been carried out since 1940 [1], the problem is still not fully resolved

  • Human beings are well equipped to understand the meaning of ambiguous words, but a machine requires a mechanism that helps it find the correct meaning of an ambiguous word [2]

  • It can be concluded that these are the most probable surrounding terms in a context where the ambiguous verb “run” is used with the noun “horse”


Summary

Introduction

Although research on Word Sense Disambiguation (WSD) has been carried out since 1940 [1], the problem is still not fully resolved. Consider the ambiguous noun “plane”. In “The plane flies like a bird in the sky”, the surrounding terms fly, bird and sky help to recognize that “plane” is an aeroplane, whereas in “the plane is made of paper”, the term paper identifies “plane” as a geometric plane. If these two examples are given to a computer as input text for machine translation, it is difficult to determine which sense of “plane” should be used for translation. Similarly, the term “piggy bank” is related to coin or money, so these terms help to find the exact sense of the ambiguous word “bank”.
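
The role of surrounding terms in the “plane” example can be shown with a small sketch. It is a plain cue-word overlap count, not the LSA-based method of the paper, and the cue sets in `SENSE_CUES` and the function name `guess_sense` are hypothetical, chosen only to mirror the two example sentences.

```python
# Hypothetical cue words per sense; a real system would learn these from data.
SENSE_CUES = {
    "aeroplane": {"fly", "flies", "bird", "sky", "pilot", "airport"},
    "geometric plane": {"paper", "flat", "surface", "line", "point"},
}

def guess_sense(sentence, target, cues=SENSE_CUES):
    """Pick the sense whose cue words overlap most with the context."""
    tokens = {t.strip('.,"').lower() for t in sentence.split()}
    tokens.discard(target)  # the ambiguous word itself is not evidence
    scores = {sense: len(tokens & words) for sense, words in cues.items()}
    best = max(scores, key=scores.get)
    # With no overlapping cue at all, the sense stays undecided.
    return best if scores[best] > 0 else None

print(guess_sense("The plane flies like a bird in the sky", "plane"))
print(guess_sense("the plane is made of paper", "plane"))
```

A sentence with no matching cue word returns `None`, which is the situation a machine translation system faces when the context gives no usable evidence.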

