Vocal pedagogues frequently use the terms "soprano" and "mezzo-soprano" to name main categories of singing voice timbre, and the terms "lyric" and "dramatic" to name sub-categories within them. A handful of studies have reported on the perceptual dissimilarity of main voice categories, but few, if any, have focused on perceptual distinctions within a voice category, such as dramatic versus lyric vocal timbre. Using stimuli collected from cisgender female singers of varying voice categories and voice weights at the pitches C4, G4, and F5, this study sought (1) to visualize experienced listeners' perception of vocal timbre dissimilarity within and between voice categories using the statistical technique of multidimensional scaling (MDS), (2) to identify salient acoustic predictors of voice category and voice weight, and (3) to determine whether the perception of vocal timbre depends on pitch. For the pitches C4, G4, and F5, experienced listeners (N=18) rated the dissimilarity of pairs of sung vowels produced by classically trained singers classified as follows: six mezzo-sopranos (three lighter and three heavier) and six sopranos (three lighter and three heavier). The resulting dissimilarity data were analyzed using MDS. Backward linear regression was used to test whether one or more of the following variables predicted the MDS dimensions: spectral centroid from 0 to 5 kHz, spectral centroid from 0 to 2 kHz, spectral centroid from 2 to 5 kHz, frequency vibrato rate, and frequency vibrato extent. Listeners also completed a categorization task in which they rated each individual stimulus on two dimensions: voice category and voice weight. Visual analysis of the MDS solutions suggested that both voice category and voice weight emerged as dimensions at pitches C4 and G4. Discriminant analysis, however, statistically confirmed both of these dimensions at G4 but only voice weight at C4.
At pitch F5, only voice weight emerged as a dimension, both visually and statistically. Acoustic predictors of the MDS dimensions varied considerably across pitches. At pitch C4, none of the acoustic variables predicted any MDS dimension. At pitch G4, the dimension associated with voice weight was predicted by spectral centroid from 0 to 2 kHz. At pitch F5, the dimension associated with voice weight was predicted by spectral centroid from 2 to 5 kHz and frequency vibrato rate. In the categorization task, ratings of voice category and voice weight were highly correlated at pitches C4 and G4 and when all pitches were presented together, but only weakly correlated at pitch F5. Although singing voice professionals commonly use voice category and sub-category distinctions to describe the overall timbre of voices, these distinctions may not consistently predict the perceptual difference between any given pair of vocal stimuli, particularly across pitch. Nonetheless, these dimensions do emerge in some fashion when listeners are presented with paired vocal stimuli. When asked to rate stimuli according to the specific labels of mezzo-soprano/soprano and dramatic/lyric, however, experienced listeners have great difficulty disentangling voice category from voice weight, whether they are presented with a single-note stimulus or with a 3-note stimulus consisting of the pitches C4, G4, and F5.