The rise of artificial intelligence has prompted increased scrutiny of systemic biases in automatic speech recognition technologies. One focal topic of discussion has been the degraded performance for speakers of African American and Southern U.S. English. This study aims to contribute to the research on bias in voice-AI by investigating speech recognition performance for Appalachian English, an often-stigmatized variety in American society. Participants were recruited from Southern Appalachia (Eastern Tennessee), with a non-Southern Appalachian (Central Pennsylvania) sample included as a reference group. The participants read aloud the Goldilocks fairytale and the Rainbow Passage, and the recordings were processed using Dartmouth Linguistic Automation (DARLA). We conducted two sets of analyses on the vowel phonemes. The first analysis assessed DARLA’s effectiveness in recognizing vowels. The system returned higher phoneme error rates for Southern Appalachian speech compared to the non-Southern dataset. Next, we conducted a detailed error analysis on the misrecognized input-output phoneme pairs. The results suggested dialect bias in the system, with 50.2% of the errors in the Southern dataset attributed to participation in the Southern Vowel Shift. These findings underscore the importance of integrating sociolectal variation into the acoustic model to mitigate dialect bias for currently underserved users.
Read full abstract