Abstract

We present IceMorph, a semi-supervised morphosyntactic analyzer of Old Icelandic. In addition to machine-read corpora and dictionaries, it applies a small set of declension prototypes to map corpus words to dictionary entries. A web-based GUI allows expert users to modify and augment data through an online process. A machine learning module incorporates prototype data, edit-distance metrics, and expert feedback to continuously update part-of-speech and morphosyntactic classification. An advantage of the analyzer is its ability to achieve competitive classification accuracy with a minimum of training data.

Highlights

  • IceMorph [1] is a semi-supervised part-of-speech (POS) and morphosyntactic (MS) tagger for Old Icelandic

  • Old Icelandic is a difficult language to tag for morphosyntactic features given its inflectional and morphonological complexity

  • IceMorph is designed to achieve competitive classification accuracy using a minimum of cleanly tagged training data, and to allow for continuous online retraining


Summary

Introduction

IceMorph [1] is a semi-supervised part-of-speech (POS) and morphosyntactic (MS) tagger for Old Icelandic. The IceMorph system consists of a number of interacting modules: a machine parser for Old Icelandic dictionaries written in Perl, a prototype-based inflection generator coded in Haskell and modeled on similar tools used in Functional Morphology [11,12,22], an edit-distance classifier, a website to collect feedback from human experts, and a context-based machine learning algorithm for grammatical disambiguation. We hypothesize that this multi-pronged approach can offer better outcomes than any one of the approaches alone for the vexing problem of morphological analysis in Old Icelandic. Although this may seem an obvious solution to the problem of POS and MS tagging in a language that has a complex morphology, and for which there is a paucity of clean training data and a noisy target corpus, we have not encountered similar multi-pronged approaches to this problem for Old Icelandic.
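To make the role of the edit-distance classifier concrete, the following is a minimal Python sketch of how a noisy corpus spelling might be matched against forms produced by an inflection generator. It is an illustration under stated assumptions, not IceMorph's actual implementation: the names `levenshtein`, `closest_forms`, and `GENERATED_FORMS` are hypothetical, the tag strings are ad hoc, and the tiny paradigm for hestr ("horse") stands in for the much larger output of the prototype-based generator.

```python
from typing import List, Tuple

# Hypothetical generator output: (inflected form, dictionary lemma, MS tag).
# A toy paradigm for the strong masculine noun "hestr"; tags are ad hoc labels.
GENERATED_FORMS: List[Tuple[str, str, str]] = [
    ("hestr", "hestr", "nom.sg"),
    ("hest", "hestr", "acc.sg"),
    ("hesti", "hestr", "dat.sg"),
    ("hests", "hestr", "gen.sg"),
    ("hestar", "hestr", "nom.pl"),
    ("hestum", "hestr", "dat.pl"),
]


def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit-cost insertion, deletion, and substitution."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def closest_forms(word: str,
                  forms: List[Tuple[str, str, str]]) -> List[Tuple[int, str, str, str]]:
    """Return every generated form tied for the smallest distance to `word`."""
    scored = [(levenshtein(word, form), form, lemma, tag)
              for form, lemma, tag in forms]
    best = min(score for score, _, _, _ in scored)
    return [entry for entry in scored if entry[0] == best]


if __name__ == "__main__":
    # "hestom" is an assumed variant spelling of the dative plural "hestum".
    for distance, form, lemma, tag in closest_forms("hestom", GENERATED_FORMS):
        print(distance, form, lemma, tag)
```

Returning all tied candidates, rather than a single winner, reflects the division of labor described above: edit distance alone narrows the candidate set, and a context-based disambiguation step or expert feedback would be needed to choose among forms that remain equally close.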
