APPLYING SIMILARITY MEASURES FOR AUTOMATIC LEMMATIZATION: A CASE STUDY FOR MODERN GREEK AND ENGLISH

Dimitrios P Lyras, Kyriakos N Sgarbas, Nikolaos D Fakotakis

doi:10.1142/s021821300800428x

Abstract

This paper addresses the problem of automatically inducing the normalized form (lemma) of regular and mildly irregular words with no direct supervision, using language-independent algorithms. More specifically, two string distance metrics (the Levenshtein Edit Distance algorithm and the Dice Coefficient similarity measure) were employed for the automatic word lemmatization task by combining two alignment models, one based on string similarity and one based on the most frequent inflectional suffixes. The performance of the proposed model has been evaluated quantitatively and qualitatively. Experiments were performed for Modern Greek and English, and the results, which fall within the state of the art, have shown that the proposed model is robust (for a variety of languages) and computationally efficient. The proposed model may be useful as a pre-processing tool for various language engineering and text mining applications, such as spell-checkers, electronic dictionaries, and morphological analyzers.
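As a rough illustration of the two measures named in the abstract, the following Python sketch computes the Levenshtein edit distance and a Dice coefficient over character bigrams, then picks the closest entry from a small candidate lemma list. This is not the authors' model: the function names, the use of bigrams for the Dice coefficient, the tie-breaking rule, and the toy lemma list are all assumptions made for the example, and the suffix-based alignment mentioned in the abstract is not covered.

```python
from collections import Counter


def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit-cost insertion, deletion and substitution."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def bigrams(s: str) -> Counter:
    """Multiset of adjacent character pairs (an assumed unit for Dice)."""
    return Counter(s[i:i + 2] for i in range(len(s) - 1))


def dice(a: str, b: str) -> float:
    """Dice coefficient: 2 * |shared bigrams| / (|bigrams(a)| + |bigrams(b)|)."""
    x, y = bigrams(a), bigrams(b)
    total = sum(x.values()) + sum(y.values())
    if total == 0:  # both strings shorter than two characters
        return 1.0 if a == b else 0.0
    overlap = sum((x & y).values())  # multiset intersection
    return 2 * overlap / total


def nearest_lemma(word: str, lemmas: list[str]) -> str:
    """Hypothetical selector: smallest edit distance, ties broken by higher Dice."""
    return min(lemmas, key=lambda lem: (levenshtein(word, lem), -dice(word, lem)))


if __name__ == "__main__":
    candidates = ["walk", "talk", "wall"]  # toy lemma list, illustration only
    print(nearest_lemma("walked", candidates))
```

Because both measures operate on raw character strings, the same functions apply unchanged to Greek or English word forms; only the candidate lemma list would differ.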

Journal: International Journal on Artificial Intelligence Tools
Publication Date: Oct 1, 2008
Citations: 11

