Latin word stemming using Wiktionary

Richard Khoury, Francesca Sapsford

doi:10.1093/llc/fqv008


Journal: Digital Scholarship in the Humanities	Publication Date: Mar 30, 2015
Citations: 2

Affiliation: Lakehead University



Abstract

This article demonstrates how to automatically build a Latin word stemmer that transforms words into their grammatical roots. Using the Wiktionary database as source data makes it possible to build such a tool covering several hundred thousand words. Our experiments demonstrate that the resulting stemmer correctly finds the root of 78% of the words of Martial’s Epigrams, and that it can be combined with other linguistic tools, such as the Latin WordNet, to greatly enhance their language coverage. While our research focuses on Latin, the same methodology could be used to build stemmers and other linguistic tools for many other ancient languages represented in Wiktionary, such as Ancient Greek or Old Armenian.
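
As a rough illustration of the idea in the abstract, a dictionary-based stemmer can be sketched as a lookup table from inflected forms to roots. This is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not describe how entries are extracted from Wiktionary, so `SAMPLE_PAIRS`, `build_stemmer`, `stem`, and `coverage` are hypothetical names, and the sample data is a hand-written stand-in for that extraction step.

```python
from collections import defaultdict

# Hypothetical (inflected form, root) pairs standing in for entries
# extracted from Wiktionary; the real extraction step is not shown.
SAMPLE_PAIRS = [
    ("amo", "amo"),
    ("amas", "amo"),
    ("amat", "amo"),
    ("puella", "puella"),
    ("puellam", "puella"),
    ("puellae", "puella"),
]


def build_stemmer(pairs):
    """Map each lowercased inflected form to the set of its possible roots."""
    table = defaultdict(set)
    for form, root in pairs:
        table[form.lower()].add(root.lower())
    return dict(table)


def stem(word, table):
    """Return the sorted candidate roots for a word, or [] if it is unknown."""
    return sorted(table.get(word.lower(), ()))


def coverage(words, table):
    """Fraction of the given words that appear in the lookup table."""
    if not words:
        return 0.0
    found = sum(1 for w in words if w.lower() in table)
    return found / len(words)


table = build_stemmer(SAMPLE_PAIRS)
print(stem("Puellam", table))
print(coverage(["amat", "puellam", "rosa", "amas"], table))
```

Storing a set of roots per form allows for ambiguous forms that belong to more than one word. Note that `coverage` only measures how many words are found in the table, which is a weaker measure than the share of correctly identified roots reported in the abstract.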

Full Text