Abstract

To address data sparseness and the knowledge acquisition bottleneck in translation disambiguation and word sense disambiguation (WSD), this paper introduces an unsupervised method based on the n-gram language model and web mining. We assume that there is a latent relationship between word senses and the n-gram language model. Based on this assumption, a mapping is established between the English translations of a Chinese word and the DEF entries of HowNet, and a word set is acquired from it. The probabilities of the n-grams formed with the words in this set are then estimated from the query results of a search engine, and disambiguation is performed using these probabilities. The method is evaluated on the gold-standard dataset of the Multilingual Chinese-English Lexical Sample Task. Experimental results show that the model achieves state-of-the-art results (Pmar=55.9%) and outperforms the best system in SemEval-2007 by 12.8%.
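
The scoring step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the bigram restriction, the add-one style smoothing, the averaging over each sense's word set, the function names, and the toy hit counts are all assumptions made for illustration. A real system would replace `toy_hits` with counts returned by a search engine.

```python
import math
from typing import Callable, Dict, List

# Hypothetical hit counts standing in for search-engine query results.
TOY_COUNTS: Dict[str, int] = {
    "deposit": 1000,
    "deposit bank": 120,
    "deposit savings": 80,
    "deposit shore": 3,
    "deposit embankment": 1,
}


def toy_hits(query: str) -> int:
    """Return the (made-up) number of pages matching the query."""
    return TOY_COUNTS.get(query, 0)


def log_bigram_prob(context_word: str, candidate: str,
                    hits: Callable[[str], int]) -> float:
    """Estimate log P(candidate | context_word) from hit counts.

    The +1 / +2 smoothing is an assumption that keeps unseen
    bigrams from producing log(0).
    """
    joint = hits(f"{context_word} {candidate}")
    prior = hits(context_word)
    return math.log((joint + 1) / (prior + 2))


def disambiguate(context: List[str],
                 sense_words: Dict[str, List[str]],
                 hits: Callable[[str], int]) -> str:
    """Pick the sense whose (non-empty) word set best fits the context."""
    def sense_score(words: List[str]) -> float:
        # Sum log-probabilities over context words, then average over
        # the words that represent this sense.
        per_word = [sum(log_bigram_prob(c, w, hits) for c in context)
                    for w in words]
        return sum(per_word) / len(per_word)

    return max(sense_words, key=lambda s: sense_score(sense_words[s]))


# Hypothetical word sets for two senses of an ambiguous target word.
senses = {
    "finance": ["bank", "savings"],
    "riverside": ["shore", "embankment"],
}
predicted = disambiguate(["deposit"], senses, toy_hits)
```

With these toy counts the word set for the financial sense co-occurs with the context word far more often, so it receives the higher score.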
