Abstract

The digitalisation of cultural heritage in Estonia has been under way in recent years, and extensive mass digitalisation of printed books and handwritten documents will follow in the very near future. This situation, and the potential it opens up, raises questions about how the literary heritage will be used. In this paper, we consider the benefits for digital literary research that arise from representing a literary text collection as an annotated language resource. We discuss a pilot project to create a text corpus based on private letters exchanged between two Estonian avant-garde writers at the beginning of the 20th century. We describe the advantages and possibilities of KORP, the corpus query system we have chosen for representing and searching literary heritage DH corpora as a language resource. The challenges that applying Natural Language Processing and Text and Data Mining poses for the preparation and representation of texts are discussed, along with the benefits for research.
