Abstract
There is substantial interest in deploying voice biomarkers for the detection of alcohol intoxication, yet it remains unknown how other mental and physical states (e.g., emotion, stress) influence voice biomarkers in intoxicated speech. We compared measurements of voice quality (jitter, shimmer, noise-to-harmonics ratio) across two datasets of alcohol-intoxicated speech: (1) the Alcohol Language Corpus, which contains laboratory-elicited speech from 167 individuals in both sober and intoxicated conditions, and (2) a custom dataset of police (control, n = 14) and suspect (intoxicated, n = 32) interactions during traffic stops, where intoxication was verified via breath analysis. Measurements were extracted from all stressed vowel tokens and compared across conditions using two-sample t-tests within sex-specific groupings of each dataset. For both males and females, jitter was significantly lower during intoxication in laboratory-elicited speech, but significantly higher for intoxicated individuals in police–suspect interactions. These results suggest that voice biomarkers for alcohol intoxication are easily confounded by other emotional and physical states (e.g., stress during police interaction) and thus present a particular challenge for speaker-independent detection systems, for which baseline voice quality measurements across different emotional states are unknown. [Research funded by Tenvos Incorporated for the development of commercial speaker state-detection algorithms.]
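The comparison described above (per-token voice quality measures, tested between conditions within each sex) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the abstract does not say which jitter definition or which t-test variant was used, so the sketch assumes the common "local" jitter (mean absolute difference between consecutive glottal periods divided by the mean period) and the Welch form of the two-sample t-test. The function names, the token record layout, and all numeric values are hypothetical placeholders and do not reflect the study's data or findings.

```python
from math import sqrt
from statistics import mean, variance


def local_jitter(periods):
    """Local jitter: mean absolute difference between consecutive
    glottal periods, divided by the mean period (assumed definition)."""
    diffs = [abs(a - b) for a, b in zip(periods, periods[1:])]
    return mean(diffs) / mean(periods)


def welch_t(a, b):
    """Welch two-sample t statistic and approximate degrees of freedom
    (assumed variant; the abstract only says 'two-sample t-tests')."""
    va = variance(a) / len(a)  # squared standard error of group a
    vb = variance(b) / len(b)  # squared standard error of group b
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


def compare_by_sex(tokens, measure="jitter"):
    """Run the intoxicated-vs-sober comparison separately for each sex.

    tokens: list of dicts with keys 'sex', 'condition', and `measure`
    (a hypothetical record layout, one dict per stressed vowel token).
    """
    results = {}
    for sex in sorted({tok["sex"] for tok in tokens}):
        sober = [tok[measure] for tok in tokens
                 if tok["sex"] == sex and tok["condition"] == "sober"]
        intox = [tok[measure] for tok in tokens
                 if tok["sex"] == sex and tok["condition"] == "intoxicated"]
        # Positive t means the measure is higher in the intoxicated group.
        results[sex] = welch_t(intox, sober)
    return results


# Arbitrary placeholder values, chosen only to make the sketch runnable.
toy_values = {
    ("male", "sober"): [0.010, 0.012, 0.011],
    ("male", "intoxicated"): [0.015, 0.016, 0.014],
    ("female", "sober"): [0.009, 0.010, 0.011],
    ("female", "intoxicated"): [0.013, 0.012, 0.014],
}
toy_tokens = [
    {"sex": sex, "condition": condition, "jitter": value}
    for (sex, condition), values in toy_values.items()
    for value in values
]

results = compare_by_sex(toy_tokens)
for sex, (t, df) in results.items():
    print(f"{sex}: t = {t:.2f}, df = {df:.1f}")
```

The sign of the t statistic is the quantity of interest here: the abstract reports that the direction of the jitter difference reverses between the laboratory and traffic-stop datasets, which a detector trained on only one of them would not anticipate.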