Bringing Semantics into Historical Archives with Computer-aided Rich Metadata Generation

Davide Colla, Diego Magro, Marco Leontino, Annamaria Goy, Claudia Picardi

doi:10.1145/3484398

Abstract

This article relies on the idea that a semantically rich metadata layer is required to provide effective, intelligent, and engaging access to historical archives. However, building such a semantic layer is a well-known bottleneck that can be overcome only by a hybrid strategy that integrates user-generated content and automatic techniques. The PRiSMHA project contributes in this direction with the design and development of a prototype ontology-driven platform that supports users in generating semantic metadata. In particular, the main contribution of this article is to show how automatic information extraction techniques (namely, Named Entity Recognition and Temporal Expression Recognition) and information retrieved from external datasets in the Linked Open Data (LOD) cloud can support users in identifying and characterizing new entities with which to annotate documents.

Full Text