Abstract

Scholarship on under-resourced languages brings with it a variety of challenges that make it difficult to access and evaluate the full spectrum of source materials. For Coptic in particular, large-scale analyses and any kind of quantitative work become difficult due to the fragmentation of manuscripts, the highly fusional nature of an incorporational morphology, and the complications of dealing with influences from Hellenistic-era Greek, among other concerns. Many of these challenges, however, can be addressed using Digital Humanities tools and standards. In this paper, we outline some of the latest developments in Coptic Scriptorium, a DH project dedicated to bringing Coptic resources online in uniform, machine-readable, and openly available formats. Collaborative web-based tools create online 'virtual departments' in which scholars dispersed sparsely across the globe can collaborate, and natural language processing tools counterbalance the scarcity of trained editors by enabling machine processing of Coptic text to produce searchable, annotated corpora.

Highlights

  • Small but data-rich fields of research bring with them a variety of challenges which make access to the full spectrum of source materials and their evaluation difficult

  • Coptic is the last phase of the ancient Egyptian language family, a language that came into use in the Roman period of Egypt’s history and derives from the more ancient language of the hieroglyphs

  • Prior to the launch of Coptic Scriptorium, three major digital resources for literary Coptic texts existed, each making important advances in the field but none providing a collaborative environment for digitization, annotation, and open access publication

Summary

INTRODUCTION

Small but data-rich fields of research bring with them a variety of challenges which make access to the full spectrum of source materials and their evaluation difficult. Due to the colonial history of Egypt, many Coptic texts are unpublished, fragmentary, or dismembered, preserved fragment by fragment in different libraries around the globe. Many of these challenges can be addressed using Digital Humanities tools and methods. Coptic Scriptorium, located at copticscriptorium.org, is an interdisciplinary, collaborative digital project dedicated to bringing Coptic cultural heritage resources online in machine-readable and openly available formats [http; Schroeder & Zeldes, et al, 2013-].

4. Producing open, linkable data for a growing digital ecosystem in Coptic studies and the larger context of digital humanities resources for the ancient world.

This paper focuses on areas 1 & 2, demonstrating how a small field for an under-resourced language can leverage diverse, interdisciplinary methods to produce open corpora for research and cultural heritage preservation.

Journal of Data Mining and Digital Humanities ISSN 2416-5999, an open-access journal http://jdmdh.episciences.org

THE NEED FOR A MULTIDISCIPLINARY DIGITAL COPTIC RESEARCH ENVIRONMENT
COLLABORATIVE ANNOTATION TOOLS
NATURAL LANGUAGE PROCESSING
CONCLUSIONS