Providing Digital Infrastructure for Audio-Visual Linguistic Research Data with Diverse Usage Scenarios: Lessons Learnt

Hanna Hedeland

doi:10.3390/publications8020033

Open Access

Journal: Publications
Publication Date: Jun 11, 2020
Citations: 1
License type: CC BY 4.0

Affiliation: Universität Hamburg

Abstract

This article describes the development of the digital infrastructure at a research data centre for audio-visual linguistic research data, the Hamburg Centre for Language Corpora (HZSK) at the University of Hamburg in Germany, over the past ten years. The typical resource hosted in the HZSK Repository, the core component of the infrastructure, is a collection of recordings with time-aligned transcripts and additional contextual data, a spoken language corpus. Since the centre has a thematic focus on multilingualism and linguistic diversity and provides its service to researchers within linguistics and other disciplines, the development of the infrastructure was driven by diverse usage scenarios and user needs on the one hand, and by the common technical requirements for certified service centres of the CLARIN infrastructure on the other. Beyond the technical details, the article also aims to be a contribution to the discussion on responsibilities and services within emerging digital research data infrastructures and the fundamental issues in sustainability of research software engineering, concluding that in order to truly cater to user needs across the research data lifecycle, we still need to bridge the gap between discipline-specific research methods in the process of digitalisation and generic digital research data management approaches.

Highlights

Over the last few decades, the development of digital practices in the humanities and social sciences has accelerated and become more widespread, partly along with the digitalisation of society in general and partly as a result of targeted funding.
[...] Centre on Multilingualism at the University of Hamburg, the Hamburg Centre for Language Corpora (HZSK) was founded with the aim to cater for the legacy of curated research data and software from the Special Research Centre and for the newly founded centre to become a part of emerging digital research infrastructures.
Further along the research data lifecycle, we find the users who reuse the data provided by the depositors via the HZSK Repository, including both students writing term papers and experienced researchers performing secondary or complementary analyses with existing data.

Summary

Introduction

Over the last few decades, the development of digital practices in the humanities and social sciences has accelerated and become more widespread, partly along with the digitalisation of society in general and partly as a result of targeted funding. At the Hamburg Centre for Language Corpora at the University of Hamburg, digital infrastructure and related services have been developed with a focus on spoken data in research on multilingualism, linguistic diversity and language documentation, i.e., often the kind of audio-visual annotated linguistic research data referred to as spoken or oral (language) corpora. These complex collection resources comprise various data types, entities and relations, which pose a challenge for data modelling and handling. Contributing to the current discussion on the reuse and citation of research data and the replicability of research in general, this article describes evolving methods for the curation, publication and dissemination of complex resource types with these aspects in mind.

Background and Basic Concepts

Resource Types

Usage Scenarios and Dissemination

The Infrastructure at the HZSK

Findable Data through Diverse Metadata and Fine-grained PIDs

Accessible Data through a Comprehensive AAI Solution

Interoperable Data through Standards and Open Formats

Reusable Data for Various Usage Scenarios

Developing Strategies for Data Curation and Publication

Static and Dynamic Digital Language Resources

Efficient and Transparent Data Curation Workflows

Distributing Highly Specific Data via Generic Repositories—FAIR Enough?

Discussion
