A Comparative Study to Determine the Effective Window Size of Turkish Word Sense Disambiguation Systems

Bahar İlgen, A Cüneyd Tantuğ, Eşref Adalı

doi:10.1007/978-3-319-01604-7_17

Abstract

This paper presents the effect of different windowing schemes on word sense disambiguation accuracy. The Turkish Lexical Sample Dataset was used in the experiments. We took the samples of ambiguous verbs and nouns from the dataset and used bag-of-words properties as context information. The experiments were repeated for different window sizes with several machine learning algorithms. We follow a 2/3 splitting strategy (2/3 for training, 1/3 for testing) and determine the most frequently used words in the training part. After removing stop words, we repeated the experiments using the 100, 75, 50 and 25 most frequent content words of the training data. Our findings show that using the 75 most frequent words as features improves accuracy for Turkish verbs. Similar results were obtained for Turkish nouns when using the 100 most frequent words of the training set. Given these feature settings, the selected algorithms were tested on window sizes of 30, 15, 10 and 5. Our findings show that the Naïve Bayes and Functional Tree methods yielded better accuracy, and that a window size of ±5 gives the best average results for both the noun and the verb groups. The best results of the two groups are 65.8 and 56 percentage points above the most frequent sense baseline for the verb and noun groups, respectively.
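The pipeline described in the abstract (a ±k context window, stop-word removal, the top-N most frequent content words as bag-of-words features, a 2/3 split, and a most frequent sense baseline) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the stop-word list, the function names, the binary (presence/absence) feature encoding, the choice to count frequencies inside the windows, and the unshuffled split are all hypothetical choices made for this sketch.

```python
from collections import Counter

# Hypothetical stop-word list; the list used in the paper is not given in the abstract.
STOP_WORDS = {"ve", "bir", "bu", "da", "de"}


def context_window(tokens, target_index, k):
    """Return the tokens within +/- k positions of the target, excluding the target."""
    left = tokens[max(0, target_index - k):target_index]
    right = tokens[target_index + 1:target_index + 1 + k]
    return left + right


def split_two_thirds(samples):
    """Split samples into the first 2/3 (training) and the last 1/3 (testing)."""
    cut = (2 * len(samples)) // 3
    return samples[:cut], samples[cut:]


def top_content_words(train_samples, k, n):
    """Pick the n most frequent non-stop words seen in the training windows.

    Each sample is a (tokens, target_index, sense) triple. Counting inside
    the window (rather than over the whole context) is an assumption.
    """
    counts = Counter()
    for tokens, target_index, _sense in train_samples:
        for word in context_window(tokens, target_index, k):
            if word not in STOP_WORDS:
                counts[word] += 1
    return [word for word, _count in counts.most_common(n)]


def bag_of_words(tokens, target_index, k, vocabulary):
    """Binary feature vector: 1 if the vocabulary word occurs in the window, else 0."""
    window = set(context_window(tokens, target_index, k))
    return [1 if word in window else 0 for word in vocabulary]


def most_frequent_sense_accuracy(train_samples, test_samples):
    """Baseline: always predict the sense that is most common in the training part."""
    majority = Counter(sense for _, _, sense in train_samples).most_common(1)[0][0]
    hits = sum(1 for _, _, sense in test_samples if sense == majority)
    return hits / len(test_samples)
```

In an experiment of the kind the abstract describes, the vectors from `bag_of_words` would be passed to a classifier such as Naïve Bayes, once for each combination of window size k in {30, 15, 10, 5} and feature count n in {100, 75, 50, 25}, and the resulting accuracy compared against `most_frequent_sense_accuracy`.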
