2020 Congress of the Schizophrenia International Research Society

Background
Schizophrenia (SCZ) has been associated with distinctive voice patterns since its first definitions. These patterns are often associated with core negative symptoms and with social impairment, and may thus represent markers of the disorder. A recent meta-analysis identified weak atypicalities in pitch variability and stronger atypicalities in duration measures (speech percentage, pause duration and speech rate). However, heterogeneity across studies was large, most studies were underpowered (small samples and no repeated measures), and replications across studies were almost nonexistent. In addition, there is a lack of cross-linguistic studies comparing voice and linguistic patterns in SCZ across languages to assess whether the patterns are distinctive of SCZ in general or specific to linguistic and/or cultural groups. In the present study, we aim to advance the understanding of voice patterns in SCZ by collecting and analyzing a cross-linguistic corpus of repeated voice measures. Such a corpus enables us to systematically assess the replicability of previous meta-analytic results while better accounting for between- and within-participant variability, as well as cross-linguistic differences.

Methods
We collected a cross-linguistic Danish (DK), Chinese (CH) and Japanese (JP) dataset involving 163 participants with SCZ (105 DK, 51 CH, 7 JP) and 173 matched healthy controls (HC) (117 DK, 43 CH, 13 JP), for a total of 3851 audio recordings. Data were collected using the Animated Triangle Task. Voice recordings were preprocessed using established algorithms (Covarep, Praat) to extract the following features, chosen so that results could be compared with the effect sizes (ES) of the previous meta-analysis (MA): 1) duration measures (speech rate, duration of utterance, number of pauses, pause duration), and 2) pitch and intensity (mean and variability).
To investigate differences between SCZ and HC, we ran multilevel regression models with the acoustic feature as outcome, diagnosis (SCZ, HC) and language (DK, JP, CH) as predictors, and varying effects by participant and corpus. Predictors were scaled to allow comparison with the meta-analytic ES.

Results
We were able to replicate previous findings only partially. Of the meta-analytic findings: 1) lower pitch variability was replicated for JP only (β = -1.25, SE = 0.37, p < .001); 2) lower speech rate was replicated for DK only (β = -0.23, SE = 0.08, p < .01); 3) increased pause duration was replicated for DK (β = 0.29, SE = 0.08, p < .001) and JP (β = 0.59, SE = 0.30, p < .05); 4) lack of evidence for an atypical number of pauses was replicated for DK, JP and CH; 5) lack of evidence for atypical duration of utterance was replicated for CH and JP (DK showed longer duration: β = 0.01, SE = 0.01, p < .01); 6) lower proportion of spoken time was not replicated; 7) lack of evidence for atypical pitch mean was replicated for DK, whereas pitch mean was higher in CH (β = 0.37, SE = 0.18, p < .05) and lower in JP (β = -1.46, SE = 0.41, p < .001).

Discussion
We found only partial replication of previous meta-analytic findings for reduced pitch variability, increased pause duration and lower speech rate, with ES generally smaller than in the previous meta-analysis. By contrast, we were not able to replicate previous findings of a lower proportion of spoken time. ES estimates varied considerably across languages, and replications held only for specific languages (pitch variability for JP, speech rate for DK, and pause duration for DK and JP). This indicates the important role that linguistic factors may play in giving rise to vocal patterns in SCZ. Voice patterns seem not to be distinctive of SCZ in general, but bound to linguistic/cultural differences. Future studies should investigate more closely how different acoustic and linguistic features interact in giving rise to atypical voice patterns in SCZ.
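
The comparison described in the Methods (a scaled acoustic feature contrasted between SCZ and HC within each language, with repeated recordings per participant) can be illustrated with a deliberately simplified sketch. It is not the multilevel regression used in the study: instead of fitting varying effects, it z-scores the feature, averages repeated recordings within each participant, and takes the SCZ minus HC difference per language. The function name, the record layout, and all numbers below are hypothetical and invented for illustration; they are not study data.

```python
from collections import defaultdict
from statistics import mean, stdev


def standardized_group_difference(recordings, feature):
    """Per-language SCZ minus HC difference on a z-scored feature.

    Simplified stand-in for a multilevel model: repeated recordings are
    averaged within participant instead of being modeled with varying effects.
    """
    # 1. z-score the feature across all recordings (assumed scaling step)
    values = [r[feature] for r in recordings]
    m, s = mean(values), stdev(values)
    z = [(v - m) / s for v in values]

    # 2. collect the scaled values of each participant's repeated recordings
    per_participant = defaultdict(list)
    group_of = {}
    for r, zv in zip(recordings, z):
        per_participant[r["participant"]].append(zv)
        group_of[r["participant"]] = (r["language"], r["diagnosis"])

    # 3. one mean per participant, grouped by (language, diagnosis)
    groups = defaultdict(list)
    for pid, zs in per_participant.items():
        groups[group_of[pid]].append(mean(zs))

    # 4. SCZ minus HC within each language that has both groups
    diffs = {}
    for lang in sorted({language for language, _ in groups}):
        if (lang, "SCZ") in groups and (lang, "HC") in groups:
            diffs[lang] = mean(groups[(lang, "SCZ")]) - mean(groups[(lang, "HC")])
    return diffs


# Invented toy data: two recordings per participant, pause duration in seconds.
recordings = [
    {"participant": "dk1", "diagnosis": "SCZ", "language": "DK", "pause_duration": 0.9},
    {"participant": "dk1", "diagnosis": "SCZ", "language": "DK", "pause_duration": 1.1},
    {"participant": "dk2", "diagnosis": "HC", "language": "DK", "pause_duration": 0.5},
    {"participant": "dk2", "diagnosis": "HC", "language": "DK", "pause_duration": 0.7},
    {"participant": "ch1", "diagnosis": "SCZ", "language": "CH", "pause_duration": 0.6},
    {"participant": "ch1", "diagnosis": "SCZ", "language": "CH", "pause_duration": 0.8},
    {"participant": "ch2", "diagnosis": "HC", "language": "CH", "pause_duration": 0.6},
    {"participant": "ch2", "diagnosis": "HC", "language": "CH", "pause_duration": 0.8},
]

diffs = standardized_group_difference(recordings, "pause_duration")
print(diffs)
```

In this toy example the DK difference is positive (longer pauses in the SCZ participant) while the CH difference is zero, mirroring in miniature how an effect can appear in one language and not in another. A full analysis would instead fit a multilevel model so that participant-level and corpus-level variability enter the uncertainty of the estimates.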