Abstract
Glottocodes constitute the backbone identification system for the language, dialect and family inventory Glottolog (https://glottolog.org). In this paper, we summarize the motivation and history behind the system of glottocodes and describe the principles and practices of data curation, technical infrastructure and update/version-tracking systematics. Since our understanding of the target domain – the dialects, languages and language families of the entire world – is continually evolving, changes and updates are relatively common. The resulting data is assessed in terms of the FAIR (Findable, Accessible, Interoperable, Reusable) Guiding Principles for scientific data management and stewardship. As such, the glottocode system responds to an important challenge in the realm of Linguistic Linked Data with numerous NLP applications.
Highlights
Glottocodes constitute the backbone identification system for the language, dialect and family inventory Glottolog
The resulting data is assessed in terms of the FAIR (Findable, Accessible, Interoperable, Reusable) Guiding Principles for scientific data management and stewardship. As such, the glottocode system responds to an important challenge in the realm of Linguistic Linked Data with numerous NLP applications
A glottocode consists of four alphanumeric characters and four decimal digits, for example abcd1234 or b10b1234
Summary
Glottocodes constitute the backbone identification system for the language, dialect and family inventory Glottolog (https://glottolog.org, currently in edition 4.4, [14]). A glottocode consists of four alphanumeric characters (i.e., lowercase letters or decimal digits) followed by four decimal digits, for example abcd1234 or b10b1234. Glottocodes are complementary to the three-letter ISO 639-3 language identification codes (see https://iso639-3.sil.org/), which concern languages only. There are 25,900 glottocodes (8,533 language-level, 4,571 family-level and 12,796 dialect-level).
Hammarström / Glottocodes: Identifiers linking families, languages and dialects
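The glottocode format described in the summary can be checked mechanically. The following is a minimal sketch in Python, assuming only the stated shape (four lowercase letters or decimal digits followed by four decimal digits). The function name `is_wellformed_glottocode` is hypothetical and not part of any Glottolog tooling, and the check is purely syntactic: it does not confirm that a code is actually assigned in Glottolog.

```python
import re

# Shape taken from the summary: four characters that are lowercase letters
# or decimal digits, followed by four decimal digits.
# [0-9] is used instead of \d so that only ASCII digits are accepted.
GLOTTOCODE_RE = re.compile(r"[a-z0-9]{4}[0-9]{4}")


def is_wellformed_glottocode(code: str) -> bool:
    """Return True if `code` has the glottocode shape (syntax check only).

    Hypothetical helper for illustration; it does not look the code up
    in the Glottolog inventory.
    """
    return GLOTTOCODE_RE.fullmatch(code) is not None


if __name__ == "__main__":
    # The first two candidates are the examples given in the text;
    # the others are deliberately malformed.
    for candidate in ["abcd1234", "b10b1234", "ABCD1234", "abc1234", "abcd123x"]:
        print(candidate, is_wellformed_glottocode(candidate))
```

Using `fullmatch` rather than `match` rejects strings with trailing characters, so a three-letter ISO 639-3 code or a longer identifier would not pass as a glottocode.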