Abstract

Crowdsourcing linguistic phenomena with smartphone applications is relatively new. In linguistics, apps have predominantly been developed to create pronunciation dictionaries, to train acoustic models, and to archive endangered languages. This paper presents the first account of how apps can be used to collect data suitable for documenting language change: we created an app, Dialäkt Äpp (DÄ), which predicts users’ dialects. For 16 linguistic variables, users select a dialectal variant from a drop-down menu. DÄ then geographically locates the user’s dialect by suggesting a list of communes where dialect variants most similar to their choices are used. Underlying this prediction are 16 maps from the historical Linguistic Atlas of German-speaking Switzerland, which documents the linguistic situation around 1950. Where users disagree with the prediction, they can indicate what they consider to be their dialect’s location. With this information, the 16 variables can be assessed for language change. Thanks to the playfulness of its functionality, DÄ has reached many users; our linguistic analyses are based on data from nearly 60,000 speakers. Results reveal a relative stability for phonetic variables, while lexical and morphological variables seem more prone to change. Crowdsourcing large amounts of dialect data with smartphone apps has the potential to complement existing data collection techniques and to provide evidence that traditional methods cannot, with normal resources, hope to gather. Nonetheless, it is important to emphasize a range of methodological caveats, including sparse knowledge of users’ linguistic backgrounds (users indicate only their age and sex) and users’ self-declaration of their dialect. These are discussed and evaluated in detail here. Findings remain intriguing nevertheless: as a means of quality control, we report that traditional dialectological methods have revealed trends similar to those found by the app. This underlines the validity of the crowdsourcing method. We are presently extending the DÄ architecture to other languages.
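
The prediction step described above (suggesting the communes whose atlas variants are most similar to the user's choices) can be illustrated with a minimal sketch. It assumes that similarity is a simple count of matching variants per commune; the abstract does not specify the actual scoring used in DÄ. The commune names, variable names, and variant labels below are invented placeholders, not atlas data, and `rank_communes` is a hypothetical helper.

```python
# Hypothetical atlas excerpt: commune -> {variable: variant recorded there}.
# All names and labels are placeholders, not data from the actual atlas.
ATLAS = {
    "commune_A": {"var1": "a", "var2": "b", "var3": "c"},
    "commune_B": {"var1": "a", "var2": "x", "var3": "c"},
    "commune_C": {"var1": "z", "var2": "x", "var3": "y"},
}


def rank_communes(choices, atlas, top_n=5):
    """Rank communes by how many of the user's chosen variants they share.

    Assumption: similarity is the plain number of matching variants.
    Ties are broken alphabetically by commune name.
    """
    scores = []
    for commune, variants in atlas.items():
        matches = sum(
            1 for variable, variant in choices.items()
            if variants.get(variable) == variant
        )
        scores.append((commune, matches))
    # Highest match count first, then commune name for a stable order.
    scores.sort(key=lambda pair: (-pair[1], pair[0]))
    return scores[:top_n]


user_choices = {"var1": "a", "var2": "b", "var3": "c"}
suggestions = rank_communes(user_choices, ATLAS, top_n=2)
print(suggestions)
```

With 16 variables, the same loop would simply run over 16 entries per commune; a weighted or distance-based score could replace the match count without changing the overall structure.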

Highlights

  • Crowdsourcing, “the practice of obtaining needed [. . .] content by soliciting contributions from a large group of people and especially from the online community [. . .],” powerfully capitalizes on the fact that none of us is as smart as all of us [1]

  • Our results further reveal a possible scaling of variables in language change: phonetic variables seem to be less affected than lexical ones, a finding attested elsewhere [50]

  • We report that changes have taken place on all investigated linguistic levels: phonetic, lexical, and morphological
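
One way to quantify change per variable, sketched here under stated assumptions, is to compare each user's chosen variant with the variant the historical atlas records for the commune the user declares as their dialect's location, and to report the share of mismatches per variable. This is an illustration only; the page does not describe the authors' actual procedure, and all names and values below are invented placeholders.

```python
from collections import defaultdict

# Hypothetical historical record: commune -> {variable: variant}.
HISTORICAL = {
    "commune_A": {"phon1": "a", "lex1": "b"},
    "commune_B": {"phon1": "c", "lex1": "d"},
}

# Hypothetical user responses: (self-declared commune, chosen variants).
RESPONSES = [
    ("commune_A", {"phon1": "a", "lex1": "b"}),
    ("commune_A", {"phon1": "a", "lex1": "x"}),
    ("commune_B", {"phon1": "c", "lex1": "x"}),
    ("commune_B", {"phon1": "q", "lex1": "x"}),
]


def change_rate_by_variable(responses, historical):
    """Return, per variable, the share of responses that differ from the atlas.

    Assumption: a mismatch between a user's variant and the historical
    variant at the self-declared commune is counted as change.
    """
    mismatches = defaultdict(int)
    totals = defaultdict(int)
    for commune, choices in responses:
        recorded = historical.get(commune)
        if recorded is None:
            continue  # commune not covered by the historical data
        for variable, variant in choices.items():
            if variable not in recorded:
                continue
            totals[variable] += 1
            if recorded[variable] != variant:
                mismatches[variable] += 1
    return {variable: mismatches[variable] / totals[variable] for variable in totals}


rates = change_rate_by_variable(RESPONSES, HISTORICAL)
print(rates)
```

Because users self-declare their dialect's location, a mismatch may also reflect a mistaken declaration rather than change, which is one of the caveats the abstract raises.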

Summary

Introduction

Crowdsourcing, “the practice of obtaining needed [. . .] content by soliciting contributions from a large group of people and especially from the online community [. . .],” powerfully capitalizes on the fact that none of us is as smart as all of us [1]. One of the first accounts of collecting dialect data in a crowdsourcing fashion was the German dialect survey conducted by Georg Wenker. Wenker began documenting dialects in the late 19th century by distributing some 50,000 questionnaires with 40 test sentences to schoolmasters across Germany, achieving a 90% response rate. The survey and responses were written in Standard German orthography as well as localized transcriptions and were collated, stored, and prepared for display on large paper maps [3]. A century and a half later, paper is being replaced by online surveys and smartphone applications (apps) as a very powerful and flexible medium for crowdsourcing language data.
