Lexical characteristics of speech stimuli can significantly impact intelligibility. However, lexical characteristics of the widely used Speech Intelligibility Test (SIT) are unknown. We aimed to (a) define variation in neighborhood density, word frequency, grammatical word class, and type-token ratio across a large corpus of SIT sentences and tests and (b) determine the relationship of lexical characteristics to speech intelligibility in speakers with multiple sclerosis (MS), Parkinson's disease (PD), and neurologically healthy controls. Using an extant database of 92 speakers (32 controls, 30 speakers with MS, and 30 speakers with PD), percent-correct intelligibility scores were obtained for the SIT. Neighborhood density, word frequency, word class, and type-token ratio were calculated and summed for each of the 11 sentences of each SIT test. The distribution of each characteristic across SIT sentences and tests was examined. Linear mixed-effects models were fitted to assess the relationship between intelligibility and the lexical characteristics. Lexical characteristics varied widely across this large corpus of SIT sentences and tests. Modeling revealed a relationship between intelligibility and the lexical characteristics, with word frequency and word class significantly contributing to the model. Three primary findings emerged: (a) There was considerable variability in lexical characteristics both within and across the large corpus of SIT tests; (b) there was not a robust association between intelligibility and the lexical characteristics; and (c) findings from a study demonstrating an effect of neighborhood density and word frequency on intelligibility were replicated. Clinical and research implications of the findings are discussed, and three exemplar SIT tests systematically controlling for neighborhood density and word frequency are provided.
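For readers unfamiliar with type-token ratio, the measure is conventionally the number of distinct word forms (types) divided by the total number of words (tokens) in a sample. The sketch below illustrates that standard definition only. The abstract does not describe how the authors tokenized or normalized the SIT sentences, so the function name `type_token_ratio`, the lowercasing step, the regular expression, and the example sentence are all assumptions made for illustration.

```python
import re


def type_token_ratio(sentence: str) -> float:
    """Return distinct word forms (types) divided by total words (tokens).

    Assumption: tokens are lowercased alphabetic runs, with internal
    apostrophes allowed. The study's actual tokenization is not stated.
    """
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)*", sentence.lower())
    if not tokens:
        # An empty sample has no defined ratio; return 0.0 by convention here.
        return 0.0
    return len(set(tokens)) / len(tokens)


# Hypothetical sentence, not taken from the SIT.
example = "The dog chased the cat and the bird"
ratio = type_token_ratio(example)  # 6 types over 8 tokens
```

A ratio near 1 means few repeated words in the sentence, while lower values mean more repetition.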