This paper addresses the issue of local disturbances in the fundamental frequency (F0) contour of speech caused by the articulation of voiced and unvoiced consonant phonemes. Depending on the intended use of the F0 contour, these disturbances are usually eliminated by a filtering, smoothing, or stylization procedure. Such procedures, which seek to preserve only the perceptually relevant F0 points, are generally applied coarsely at a global level, so they may fail to eliminate microintonation completely in some cases and may distort macrointonation in others. In this work we propose a local filtering algorithm based on a fine-grained analysis of the microprosodic morphologies. The performance of the algorithm is validated by a perceptual experiment. Assuming that the algorithm eliminates the disturbances partially or totally, we give a statistical description of the perturbation morphologies. Statistics were collected from a corpus of 741 sentences designed to study Argentine Spanish prosody. The corpus was recorded by four professional announcers, all native speakers from the city of Buenos Aires. The results show that perturbation morphologies are affected by consonant phoneme identity, global F0 contour shape, and speaker identity. As an application case, we use the proposed filtering algorithm as a pre-processing stage in our automatic prominent syllable detection system, which yields a statistically significant improvement in its performance.