Abstract

This article demonstrates how to automatically build a Latin word stemmer that transforms words into their grammatical roots. Using the Wiktionary database as source data makes it possible to build such a tool covering several hundred thousand words. Our experiments show that the resulting stemmer correctly finds the root of 78% of the words in Martial’s Epigrams, and that it can be combined with other linguistic tools, such as the Latin WordNet, to greatly enhance their language coverage. While our research focuses on Latin, the same methodology could be used to build stemmers and other linguistic tools for many other ancient languages represented in Wiktionary, such as Ancient Greek or Old Armenian.
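
As a rough illustration of the general idea, a dictionary-based stemmer can be sketched as a lookup table from inflected forms to roots. The sketch below is a minimal example under stated assumptions, not the article's implementation: the `SAMPLE_PAIRS` data is hand-written and merely stands in for entries extracted from Wiktionary, and the names `normalize`, `build_stemmer`, `stem`, and `coverage` are hypothetical. Note that `coverage` only measures how many tokens are found in the table, which is a weaker measure than the correctness figure reported above.

```python
import unicodedata

# Hypothetical (inflected form, root) pairs, standing in for entries
# extracted from Wiktionary. This is illustrative data only.
SAMPLE_PAIRS = [
    ("amō", "amo"),
    ("amās", "amo"),
    ("amat", "amo"),
    ("rosa", "rosa"),
    ("rosam", "rosa"),
    ("rosārum", "rosa"),
]


def normalize(word):
    """Lowercase a word and strip combining marks such as macrons."""
    decomposed = unicodedata.normalize("NFD", word.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def build_stemmer(pairs):
    """Build a lookup table from normalized inflected forms to roots."""
    table = {}
    for form, root in pairs:
        # Keep the first root seen for an ambiguous form (a simplification).
        table.setdefault(normalize(form), root)
    return table


def stem(table, word):
    """Return the root for a word, or None if the form is unknown."""
    return table.get(normalize(word))


def coverage(table, tokens):
    """Fraction of tokens for which the table contains some root."""
    if not tokens:
        return 0.0
    found = sum(1 for token in tokens if stem(table, token) is not None)
    return found / len(tokens)


table = build_stemmer(SAMPLE_PAIRS)
print(stem(table, "Rosarum"))
print(coverage(table, ["amat", "rosam", "arma", "virumque"]))
```

A plain lookup table keeps the sketch simple; a real stemmer would also need a policy for forms that map to more than one root, which this example sidesteps by keeping the first root it sees.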
