Vocal Tract Area Research Articles

Timbre is a central quality of singing, yet it remains a complex notion that is poorly understood in psychoacoustic studies. Previous studies note that no single acoustic variable or combination of variables consistently predicts timbre dimensions. Timbre varies on a continuum from darkest to lightest. These extremes are associated with laryngeal and vocal tract adjustments related to smaller and larger vocal tract area and with variations in vocal fold vibratory characteristics. Perceptually, timbre assessment is influenced by spectral characteristics and formant frequency adjustments, though these dimensions are not independently perceived. Perceptual studies repeatedly demonstrate difficulties in correlating variations in timbre stimuli with specific measures. A recent study demonstrated how the acoustic predictive salience of voice category and voice weight across pitches contributes to timbre assessments, and concluded that timbre may be related to as-of-yet unknown factor(s). The purpose of this study was to test four different models for assessing timbre: one model focused on specific anatomy, one on listener intuition, one utilising auditory anchors, and one using expert raters in a deconstructed timbre model with 5 specific dimensions.

Methods: Four independent panels were conducted with separate cohorts of professional singing teachers. 41 assessors took part in the anatomically focused panel, 54 in the intuition-based panel, 30 in the anchored panel, and 12 in the expert listener panel. Stimuli taken from live performances of well-known singers were used for all panels, representing all genders, genres, and styles across a large pitch range. All stimuli are available as supplementary materials. Fleiss’ kappa values, descriptive statistics, and significance tests are reported for all panel assessments.

Results: Panels 1 through 4 varied in overall accuracy and agreement. The intuition-based model showed 45% average accuracy overall (SD ±4%), κ=0.289 (p<0.001), compared with 71% average accuracy overall (SD ±3%), κ=0.368 (p<0.001), for the anatomically focused panel. The auditory-anchored model showed 75% average accuracy overall (SD ±8%), κ=0.54 (p<0.001), compared with 83% average accuracy overall and agreement of κ=0.63 (p<0.001) for panel 4. Results revealed that the highest accuracy and reliability were achieved in a deconstructed timbre model and that providing anchoring improved reliability but with no further increase in accuracy.

Conclusion: Deconstructing timbre into specific parameters improved auditory-perceptual accuracy and overall agreement. Assessing timbre along with other perceptual dimensions improves accuracy and reliability. Panel assessors’ expert level of listening skills remains an important factor in obtaining reliable and accurate assessments of auditory stimuli for timbre dimensions. Anchoring improved reliability but with no further increase in accuracy. The study suggests that timbre assessment can be improved by approaching the percept through a prism of 5 specific dimensions, each related to specific physiology and auditory-perceptual subcategories. Further tests are needed with framework-naïve listeners, non-musically educated listeners, artificial intelligence comparisons, and synthetic stimuli to further test the reliability.
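The agreement figures in this abstract are Fleiss' kappa values. As a rough illustration of how such a value is obtained, the sketch below implements the standard Fleiss' kappa formula for a table of rating counts. The function name `fleiss_kappa` and the small example table are hypothetical and are not taken from the study; the study's own computation and data are not shown in the abstract.

```python
from typing import Sequence


def fleiss_kappa(counts: Sequence[Sequence[int]]) -> float:
    """Fleiss' kappa for a table where counts[i][j] is the number of
    raters who assigned item i to category j.

    Assumes every item was rated by the same number of raters (>= 2).
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2:
        raise ValueError("at least two raters per item are required")
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every item must have the same number of ratings")
    n_categories = len(counts[0])

    # Share of all ratings that fell into each category.
    p_j = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    # Observed agreement for each item: fraction of rater pairs that agree.
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_i) / n_items           # mean observed agreement
    p_e = sum(p * p for p in p_j)        # agreement expected by chance
    if p_e == 1.0:
        raise ValueError("kappa is undefined when all ratings share one category")
    return (p_bar - p_e) / (1.0 - p_e)


# Hypothetical example: 4 stimuli, 5 raters, 3 timbre categories.
example = [
    [5, 0, 0],
    [0, 4, 1],
    [1, 1, 3],
    [0, 0, 5],
]
kappa = fleiss_kappa(example)
```

A value of 1 indicates perfect agreement and a value near 0 indicates agreement no better than chance, which is the scale on which the panel values reported above (0.289 to 0.63) should be read.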

The pharyngeal fricative is a typical compensatory articulation disorder in cleft palate speech. It is produced by retracting the root of the tongue to the posterior pharyngeal wall to substitute for the fricatives and affricates normally produced in the oral cavity. People who use the pharyngeal fricative have difficulties in daily communication. Research on automatic pharyngeal fricative detection can aid diagnosis by speech-language pathologists and clinical doctors. This work proposes a vocal tract area spectrum (VTAS) to represent a vocal tract model using time-varying cascaded pipes. Four acoustic features based on the VTAS are proposed to evaluate the differences between pharyngeal fricatives and normal speech: the centroid and spread (CS), peak linear deviation (PLD), relative-normal entropy (RNE), and mean of the ratios’ statistics (MRS). The CS feature evaluates the overall shape of the vocal tract to detect whether there are abnormal gestures or movements of the articulators in speech production. The PLD and RNE features focus on the variation and complexity of each vocal tube's area during the whole pronunciation process. The MRS feature describes the continuity of the vocal tract. To evaluate the effectiveness of these four features, pharyngeal fricative detection experiments are conducted using a pharyngeal fricative dataset. This dataset contains 1246 speech samples spoken by 50 cleft palate patients and 50 normal speakers, covering all types of initial consonants in which the pharyngeal fricative usually occurs. The detection accuracy of the pharyngeal fricative using the CS, PLD, RNE, and MRS features individually ranges from 80.66% to 90.21%. When using the combined CS+PLD+RNE+MRS feature, an accuracy of 95.18% can be achieved on the pharyngeal fricative dataset.
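The abstract does not give formulas for the CS feature. As a loose illustration of what a "centroid and spread" summary of a tube-area profile could look like, the sketch below treats the normalised tube areas as weights over the tube index and computes their first moment and standard deviation. This is an assumption based on the usual definitions of centroid and spread, not the authors' definition; the function name and the sample area values are hypothetical.

```python
import math
from typing import Sequence, Tuple


def centroid_and_spread(areas: Sequence[float]) -> Tuple[float, float]:
    """Centroid and spread of one frame of tube areas.

    areas[k] is the cross-sectional area of tube section k (any unit).
    The areas are normalised to sum to 1 and used as weights over the
    section index k. Assumed definition, not taken from the paper.
    """
    total = sum(areas)
    if total <= 0:
        raise ValueError("areas must have a positive sum")
    weights = [a / total for a in areas]
    # Centroid: weighted mean of the section index.
    centroid = sum(k * w for k, w in enumerate(weights))
    # Spread: weighted standard deviation around the centroid.
    variance = sum(w * (k - centroid) ** 2 for k, w in enumerate(weights))
    return centroid, math.sqrt(variance)


# Hypothetical area profile for a single frame (8 tube sections).
frame = [1.2, 0.9, 0.4, 0.3, 0.8, 1.5, 2.0, 1.1]
c, s = centroid_and_spread(frame)
```

Under this reading, a constriction that shifts toward the pharyngeal end of the tract would move the centroid relative to normal speech, which is the kind of overall-shape difference the CS feature is described as capturing.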

Articles published on Vocal Tract Area

Towards Improved Auditory-Perceptual Assessment of Timbres: Comparing Accuracy and Reliability of Four Deconstructed Timbre Assessment Models

Sketches of chimpanzee (Pan troglodytes) hoo’s: vowels by any other name?

Bora's high vowels involve a two-way dental contrast, not a three-way backness contrast

Parametrization of the vocal tract area function using a subset selection approach (L).

An inverse problem to determine the shape of a human vocal tract

Automatic detection of pharyngeal fricatives in cleft palate speech using acoustic features based on the vocal tract area spectrum

Electrical Modeling of Two Tube Vocal Tract for Voice Pathology Detection

Detection & Classification of Voice Pathology using Electrical Circuit Parameters

Articulatory-acoustic relations in the production of alveolar and palatal lateral sounds in Brazilian Portuguese

Optimal prosodic feature extraction and classification in parametric excitation source information for Indian language identification using neural network based Q-learning algorithm

Discrimination of “hot potato voice” caused by upper airway obstruction utilizing a support vector machine

Automatic Detection Algorithm of Pharyngeal Fricatives in Cleft Palate Speech Based on LPIF and Feature Selection

3D dynamic MRI of the vocal tract during natural speech.

Construction and Evaluation of a Parametric One-Dimensional Vocal Tract Model

An age-dependent vocal tract model for males and females based on anatomic measurements.

Determining the shape of a human vocal tract from pressure measurements at the lips

Articulatory correlates of phonemic and coarticulatory nasalization

Development of Visual Support Way for Pronunciation Practice Using Reflection Coefficients of Burg's Method

An acoustically-driven vocal tract model for stop consonant production

Robust phoneme classification for automatic speech recognition using hybrid features and an amalgamated learning model
