Speech Duration Research Articles

Previous research has found that the human voice can provide reliable information for gender identification with a high level of accuracy. In social psychology, perceived masculinity and femininity (masculinity and femininity rated by humans) have often been considered an important feature when investigating the influence of vocal features on social behaviours. While previous studies have characterized the acoustic features that contribute to perceivers’ judgements of speakers’ masculinity or femininity, there is limited research on developing a machine masculinity/femininity scoring model and on characterizing the independent acoustic factors that contribute to perceivers’ masculinity and femininity judgements. In this work, we first propose a machine scoring model of perceived masculinity/femininity based on the Extreme Random Forest and then characterize the independent and meaningful acoustic factors that contribute to perceivers’ judgements by using a correlation-matrix-based hierarchical clustering method. Our results show that the machine ratings of masculinity and femininity correlated strongly with the human ratings when we used an optimal speech duration of 7 s, with a correlation coefficient of up to .63 for females and .77 for males. Nine independent clusters of acoustic measures were generated from our modelling of femininity judgements for female voices, and eight clusters were found for masculinity judgements for male voices. The results revealed that, for both genders, the F0 mean is the most important acoustic measure affecting the judgement of acoustic-related masculinity and femininity. The F3 mean, F4 mean and VTL estimators were found to be highly inter-correlated and appeared in the same cluster, forming the second most significant factor influencing the assessment of acoustic-related masculinity and femininity. Next, F1 mean, F2 mean and F0 standard deviation are independent factors of similar importance. The voice perturbation measures, including HNR, jitter and shimmer, are of lesser importance in influencing masculinity/femininity judgements.
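As a rough illustration of how inter-correlated measures such as F3 mean, F4 mean and VTL can end up in one cluster, the following minimal sketch groups features by average-linkage agglomerative clustering on the distance 1 - |r|. The distance, the linkage rule, the 0.5 cut-off, the function names and the toy values are all assumptions made for this example; the abstract does not specify the authors' exact procedure.

```python
from itertools import combinations
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def cluster_features(features, threshold=0.5):
    """Average-linkage agglomerative clustering on the distance 1 - |r|.

    `features` maps a measure name to its per-speaker values. Merging
    stops once the closest pair of clusters is farther apart than
    `threshold` (an illustrative cut-off, not a value from the paper).
    """
    names = list(features)
    # Pairwise distances derived from the correlation matrix.
    dist = {
        frozenset((a, b)): 1.0 - abs(pearson(features[a], features[b]))
        for a, b in combinations(names, 2)
    }
    clusters = [[n] for n in names]

    def linkage(c1, c2):
        # Average distance over all cross-cluster feature pairs.
        return mean(dist[frozenset((a, b))] for a in c1 for b in c2)

    while len(clusters) > 1:
        (i, j), d = min(
            (((i, j), linkage(clusters[i], clusters[j]))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if d > threshold:
            break
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return [sorted(c) for c in clusters]


# Made-up values for six hypothetical speakers (not data from the study).
toy = {
    "F0 mean": [3, 1, 2, 2, 1, 3],
    "F3 mean": [1, 2, 3, 4, 5, 6],
    "F4 mean": [2, 4, 6, 8, 10, 13],
    "VTL": [6, 5, 4, 3, 2, 1],
}
print(cluster_features(toy))
```

On this toy input the three vocal-tract-related measures fall into one cluster and F0 mean stays on its own. Taking the absolute value of r means a strongly negative correlation (here, VTL against the formants) counts as redundancy too, which is a design choice of the sketch.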

We compared digital speech and language features of patients with amnestic Alzheimer disease (aAD) or logopenic variant primary progressive aphasia (lvPPA) in a biologically confirmed cohort and related these features to neuropsychiatric test scores and CSF analytes. We included patients with aAD or lvPPA with CSF (phosphorylated tau [p-tau]/β-amyloid [Aβ] ≥0.09, and total tau/Aβ ≥0.34) or autopsy confirmation of AD pathology and age-matched healthy controls (HC) recruited at the Frontotemporal Degeneration Center of the University of Pennsylvania for a cross-sectional study. We extracted speech and language variables with automated lexical and acoustic pipelines from participants' oral picture descriptions. We compared the groups and correlated distinct features with clinical ratings and CSF p-tau levels. We examined patients with aAD (n = 44; age 62 ± 8 years; 24 women; Mini-Mental State Examination [MMSE] score 21.1 ± 4.8) or lvPPA (n = 21; age 64.1 ± 8.2 years; 11 women; MMSE score 23.0 ± 4.2) and HC (n = 28; age 65.9 ± 5.9 years; 15 women; MMSE score 29 ± 1). Patients with lvPPA produced fewer verbs (10.5 ± 2.3; p = 0.001) and adjectives (2.7 ± 1.3; p = 0.019) and more fillers (7.4 ± 3.9; p = 0.022), with lower lexical diversity (0.84 ± 0.1; p = 0.05) and higher pause rate (54.2 ± 19.2; p = 0.015), than individuals with aAD (verbs 12.5 ± 2; adjectives 3.8 ± 2; fillers 4.9 ± 4.5; lexical diversity 0.87 ± 0.1; pause rate 45.3 ± 12.8). Both groups showed some shared language impairments compared with HC. Word frequency (MMSE score: β = -1.6, p = 0.009; Boston Naming Test [BNT] score: β = -4.36, p < 0.001), adverbs (MMSE score: β = -1.9, p = 0.003; BNT score: β = -2.41, p = 0.041), pause rate (MMSE score: β = -1.21, p = 0.041; BNT score: β = -2.09, p = 0.041), and word length (MMSE score: β = 1.75, p = 0.001; BNT score: β = 2.94, p = 0.003) were significantly correlated with both MMSE and BNT scores, but other measures were not correlated with MMSE and/or BNT score. Prepositions (r = -0.36, p = 0.019), nouns (r = -0.31, p = 0.047), speech segment duration (r = -0.33, p = 0.032), word frequency (r = 0.33, p = 0.036), and pause rate (r = 0.34, p = 0.026) were correlated with patients' CSF p-tau levels. Our measures captured language and speech differences between the 2 phenotypes that traditional language-based clinical assessments failed to identify. This work demonstrates the potential of natural speech in reflecting underlying variants with AD pathology.
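To make two of the reported measures more concrete, here is a minimal sketch of how lexical diversity and pause rate might be computed from a transcript and a list of speech intervals. The abstract does not define either measure, so the type-token ratio, the pauses-per-minute unit, the 0.25 s pause threshold, the function names and the sample data are all assumptions for illustration, not the authors' pipeline.

```python
def type_token_ratio(tokens):
    """Share of distinct word forms among all tokens (case-insensitive)."""
    words = [t.lower() for t in tokens]
    return len(set(words)) / len(words) if words else 0.0


def pause_rate(segments, min_pause=0.25):
    """Pauses per minute of recording.

    `segments` is a time-ordered list of (start, end) speech intervals in
    seconds; a gap of at least `min_pause` seconds between consecutive
    intervals counts as one pause. Both the threshold and the per-minute
    unit are assumptions made for this sketch.
    """
    if not segments:
        return 0.0
    pauses = sum(
        1
        for (_, prev_end), (next_start, _) in zip(segments, segments[1:])
        if next_start - prev_end >= min_pause
    )
    total_minutes = (segments[-1][1] - segments[0][0]) / 60.0
    return pauses / total_minutes if total_minutes > 0 else 0.0


# Hypothetical transcript and speech intervals (not data from the study).
tokens = "the boy is reaching for the cookie jar".split()
segments = [(0.0, 2.0), (2.5, 5.0), (5.1, 9.0), (10.0, 12.0)]

print(type_token_ratio(tokens))
print(pause_rate(segments))
```

A plain type-token ratio is sensitive to sample length, so comparisons across speakers generally need a length-controlled variant; the simple form is used here only to keep the sketch short.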

Articles published on Speech Duration

Acoustic characterization and machine prediction of perceived masculinity and femininity in adults

Functional timing or rhythmical timing, or both? A corpus study of English and Mandarin duration.

Developmental consistency in the use of subphonemic information during real-time sentence processing

Mandarin lexical tone duration: Impact of speech style, word length, syllable position and prosodic position

Spoken language identification on 4 Indonesian local languages using deep learning

Association between lateral wall electrode array insertion parameters and audiological outcomes in bilateral cochlear implantation

Understanding and characterizing speaker roles within naturalistic task-based communications: The fearless steps APOLLO-11 corpus

Efficient Personalized Speech Enhancement Through Self-Supervised Learning

Extraction of indexical and linguistic information as a function of duration in the older population

Getting the message in ‘Sound’ across at conference interpreting: a case study on rendering prosodic emphasis

Do Voice-Based Judgments of Socially Relevant Speaker Traits Differ Across Speech Types?

Complex speech-language therapy interventions for stroke-related aphasia: the RELEASE study incorporating a systematic review and individual participant data network meta-analysis

Rhetoric and prosody of judicial discourse in mass media

The Sabancı University Dynamic Face Database (SUDFace): Development and validation of an audiovisual stimulus set of recited and free speeches with neutral facial expressions.

Spaniards articulate faster than Mexicans

Maternal speech to singleton and first-born dizygotic twin infants: a four-month longitudinal and naturalistic study

Lexical and Acoustic Speech Features Relating to Alzheimer Disease Pathology.

On the variation of fricative airflow dynamics with vocal tract geometry and speech loudness

Winning the second half: The perceived and actual impact of the coach's half-time speech on basketball players’ performance
