The sociolinguistic foundations of language modeling

Jack Grieve, Sara Bartl, Matteo Fuoli, Jason Grafmiller, Weihang Huang, Alejandro Jawerbaum, Akira Murakami, Marcus Perlman, Dana Roemling, Bodo Winter

doi:10.3389/frai.2024.1472411

Open Access

Journal: Frontiers in Artificial Intelligence
Publication Date: Jan 13, 2025
License type: CC BY 4.0

Abstract

In this article, we introduce a sociolinguistic perspective on language modeling. We claim that language models in general are inherently modeling varieties of language, and we consider how this insight can inform the development and deployment of language models. We begin by presenting a technical definition of the concept of a variety of language as developed in sociolinguistics. We then discuss how this perspective could help us better understand five basic challenges in language modeling: social bias, domain adaptation, alignment, language change, and scale. We argue that, to maximize the performance and societal value of language models, it is important to carefully compile training corpora that accurately represent the specific varieties of language being modeled, drawing on theories, methods, and descriptions from the field of sociolinguistics.

Similar Papers

Products, developers, and milestones: how should I build my N-Gram language model
Juliana Saraiva ... Thomas Zimmermann
30 Aug 2015

Language Model Adaptation Using Dirichlet Class Language Model Based on Part-of-Speech
...
21 Mar 2014

Integrating meta-information into recurrent neural network language models
Yangyang Shi ... Kris Demuynck
Speech Communication, Vol. 73
24 Jun 2015

A Unified Framework for Feature-based Domain Adaptation of Neural Network Language Models
Michael Hentschel ... Tomoharu Iwata
01 May 2019
