Abstract

Purpose – The purpose of this paper is to describe the creation and exploitation of a historical corpus, as a contribution to the preservation and availability of cultural heritage documents.

Design/methodology/approach – First, the digitization process and the efforts to promote the availability and awareness of the books and manuscripts held in a historical library in Greece are presented. Then, the processing and exploitation of the digitized corpus using natural language processing techniques are discussed.

Findings – In the course of the project, methods were developed that take into account the state of the documents and the particularities of the Greek language.

Practical implications – In its present state, the corpus facilitates the work of theologians, historians, philologists, paleographers, etc. and, at the same time, protects the original documents from further damage.

Originality/value – The results of this undertaking can offer useful insights into the creation of corpora of cultural heritage documents and into methods for processing and exploiting digitized documents that take into account the language in which the documents are written.

