Abstract

The research question this paper aims at answering is the following: In an ontology-driven annotation system, can the information extracted from external resources (namely, Wikidata) provide users with useful suggestions in the characterization of entities used for the annotation of documents from historical archives? The context of the research is the PRiSMHA project, in which the main goal is the development of a proof-of-concept prototype ontology-driven system for semantic metadata generation. The assumption behind this effort is that an effective access to historical archives needs a rich semantic knowledge, relying on a domain ontology, that describes the content of archival resources. In the paper, we present a new feature of the annotation system: when characterizing a new entity (e.g., a person), some properties describing it are automatically pre-filled in, and more complex semantic representations (e.g., events the entity is involved in) are suggested; both kinds of suggestions are based on information retrieved from Wikidata. In the paper, we describe the automatic algorithm devised to support the definition of the mappings between the Wikidata semantic model and the PRiSMHA ontology, as well as the process used to extract information from Wikidata and to generate suggestions based on the defined mappings. Finally, we discuss the results of a qualitative evaluation of the suggestions, which provides a positive answer to the initial research question and indicates possible improvements.

Highlights

  • Digital Cultural Heritage has become a key concept for all kinds of cultural institutions, whether they are museums, libraries, archives, or cultural centers

  • We evaluated the double support with users, and the results showed that Named Entity Recognition (NER) and entity linking to Linked Open Data (LOD) support users in the annotation activity in an effective way; in particular, users appreciated the pre-filling of category and label fields in the form for creating a new entity

  • We provide some details about how HERO and HERO-900 model the semantic representation of an entity, focusing on those aspects that are relevant to the suggestion mechanism that we will describe in Sections 3 and 4

Read more

Summary

Introduction

Digital Cultural Heritage has become a key concept for all kinds of cultural institutions, whether they are museums, libraries, archives, or cultural centers. It is a pillar of EU research programs The term “digital” itself has a large and sometimes fuzzy meaning, ranging from the digitization of paper documents to virtual reality applications, to online communication and marketing activities. We will focus on cultural institutions that host historical archives including documents such as newspaper articles, pictures, typewritten leaflets, manuscripts, and posters.

Objectives
Results
Discussion
Conclusion
Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call