Abstract

This paper proposes a new collaborative and inclusive model for Knowledge Organization Systems (KOS) for sustaining cultural heritage and language diversity. It is based on contributions of end-users as well as scientific and scholarly communities from across borders, languages, nations, continents, and disciplines. It consists of collecting knowledge about all worldwide translations of one original work and sharing that data through a digital and interactive global knowledge map. Collected translations are processed in order to build multilingual parallel corpora for a large number of under-resourced languages as well as to highlight the transnational circulation of knowledge. Building such corpora is vital in preserving and expanding linguistic and traditional diversity. Our first experiment was conducted on the world-famous and well-traveled American novel Adventures of Huckleberry Finn by Mark Twain. This paper reports on 10 sentence-aligned parallel corpora pairing English with Basque (a European under-resourced language), Bulgarian, Dutch, Finnish, German, Hungarian, Polish, Portuguese, Russian, and Ukrainian, processed from 30 collected translations.

Highlights

  • The impact of the digital revolution on the preservation, organization, and sharing of human knowledge encoded by languages constitutes an extraordinarily rich phenomenon, characterized by both productive opportunities and obstacles and threats. In the first place, digital technology has created tremendous opportunities in terms of accessing knowledge. More people, and especially persons belonging to minority communities, can enjoy knowledge more quickly and cheaply

  • In an increasingly globalized context, multilingualism has become a major preoccupation for the field of Library and Information Science (LIS) and in particular for Knowledge Organization Systems which have to be as fair as possible [12,13,14] to ensure and sustain knowledge diversity

  • Unlike existing knowledge sharing models used by most digital libraries and collections, we propose a new interactive model allowing end-users and volunteer scholars to contribute and to share their knowledge about an original work through an interactive and online global knowledge map (Figure 3)

Summary

Introduction

The impact of the digital revolution on the preservation, organization, and sharing of human knowledge encoded by languages constitutes an extraordinarily rich phenomenon, characterized by both productive opportunities and obstacles and threats. 56,000 free written and audio eBooks and especially older works for which copyright has expired in more than 50 under-resourced languages. Those ongoing projects have made and continue to make significant progress in the preservation of knowledge and language diversity. More than a century ago, Paul Otlet, the pioneer of Documentation Studies (known today as Library and Information Science or LIS), envisioned a universal compilation of knowledge and the technology to make it globally available. He wrote numerous essays on how to collect and organize the world’s knowledge, culminating in two books [1,10]. In an increasingly globalized context, multilingualism has become a major preoccupation for the field of LIS and in particular for Knowledge Organization Systems, which have to be as fair as possible [12,13,14] to ensure and sustain knowledge diversity.

Related Work
New Paradigm for a Sustainable Knowledge Organization Model
Why Focus on Translations?
Static Versus Interactive Knowledge Sharing Process
Experiments and Results
Data Curation
A Crowdsourcing Approach for Text Collection and Transcription
Data Alignment for Building Parallel Corpora
The Rosetta Dashboard for Fine-grained Knowledge Circulation Analysis
Conclusions
