Abstract

Advancements in cultural informatics have significantly influenced the way we perceive, analyze, communicate and understand culture. New data sources, such as social media, digitized cultural content, and Internet of Things (IoT) devices, have allowed us to enrich and customize the cultural experience, but at the same time have created an avalanche of new data that needs to be stored and appropriately managed in order to be of value. Although data management plays a central role in driving forward the cultural heritage domain, the solutions applied so far are fragmented and physically distributed, require specialized IT knowledge to deploy, and demand significant IT experience to operate even for trivial tasks. In this work, we present Hydria, an online data lake that allows users without any IT background to harvest, store, organize, analyze and share heterogeneous, multi-faceted cultural heritage data. Hydria provides a zero-administration, zero-cost, integrated framework that enables researchers, museum curators and other stakeholders within the cultural heritage domain to easily (i) deploy data acquisition services (such as social media scrapers, focused web crawlers, dataset imports, and questionnaire forms), (ii) design and manage versatile, customizable data stores, (iii) share whole datasets or horizontal/vertical data shards with other stakeholders, (iv) search, filter and analyze data via an expressive yet simple-to-use graphical query engine and visualization tools, and (v) perform user management and access control operations on the stored data. To the best of our knowledge, this is the first solution in the literature that focuses on collecting, managing, analyzing, and sharing diverse, multi-faceted data in the cultural heritage domain and targets users without an IT background.

Highlights

  • In the last few years Cultural Informatics (CI) has surfaced as a new promising domain that constitutes the socio-technological approach to understand, represent, communicate and re-invent cultures and cultural institutions [1]

  • PATCH is the system most conceptually and functionally similar to the Hydria data lake; however, PATCH was designed for the needs of a specific project and applied to a particular research study, while our work is an online, free, zero-administration data lake that offers both fundamental and advanced user and data/knowledge management functionality in the cultural heritage domain, can be customized to the requirements of any cultural heritage project, and targets all users, without requiring any IT background or skills

  • Hydria enables the direct incorporation of heterogeneous data that has been recorded in dispersed formats, while specialized processing engines ingest data without compromising the data structure, making it available for tasks such as visualization, mining, analytics and reporting


Summary

Introduction

In the last few years Cultural Informatics (CI) has surfaced as a new promising domain that constitutes the socio-technological approach to understand, represent, communicate and re-invent cultures and cultural institutions [1]. Reconfiguring an existing solution for reuse in another setup, or setting one up from scratch for the specific cultural data management problem, requires (i) time-consuming meetings between scientists of different disciplines trying to understand each other’s needs and goals, and (ii) resource-consuming IT infrastructure that calls for outsourcing to IT specialists and regular maintenance/upgrades to keep up with technological requirements [42]. Due to these issues, a great number of stakeholders (such as small museums or humanities research groups) that lack the resources for infrastructure and/or computing expertise still rely on outdated approaches like (i) storing their data in spreadsheets or raw files, (ii) sharing their data with colleagues through email, cloud uploads of zip files, or even by snail mailing electronic copies on removable media, and (iii) analyzing the data via sub-standard tools and trial software.

Related Work
Social Data Management in the Cultural Heritage Domain
Information Systems for Cultural Heritage
Information Systems for Museums
System Architecture
The Data Acquisition Module
The Data Harvesting Submodule
The Structured Data Input Submodule
The Data Management Module
The Data Analysis Module
The User Management Module
Implementation Aspects
The TripMentor Case Study
Data Harvesting
Facebook Spiders
TripAdvisor Spiders
Reusing Data Ponds and Data Pond Templates
Visualizing Information
Hydria for Curators
Hydria for Researchers
Hydria for Data Scientists
Hydria for End Users
Conclusions and Outlook