Abstract

Much work has been done on improving performance by proposing new speech features and new perceptual measures. However, these studies focus on only one of the two aspects. In this paper, we study the relationship between them; that is, we examine which acoustic features, or combinations of features, are most consistent with the actual perception of Chinese initials. We propose a method that measures acoustic distance while keeping it monotonically related to the perceptual distance between Chinese initials. We first define the acoustic distance and the perceptual distance between different Chinese initials, and then select a suitable combination of acoustic features and two compatible distance metrics by performing clustering analysis on samples of all types of Chinese initials using MFCC and PLP features. Using data provided by the General Hospital of the People's Liberation Army, we then calculate the acoustic and perceptual distances. Finally, we compute Spearman's rho between the two types of distance for each of the two calculation methods. The experimental results show a relatively strong monotonic relationship between the two types of distance for the selected acoustic features.
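As an illustration of the final step, the following minimal sketch computes Spearman's rho between a list of acoustic distances and a list of perceptual distances, one entry per pair of initials. The function names and the numeric values are hypothetical and chosen only for illustration; they are not the data or the implementation used in the paper. Rho is computed here as the Pearson correlation of the ranks, with tied values given their average rank.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks of `values`, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with the one at position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the ranks of x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x = sum(rx) / len(rx)
    mean_y = sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in rx))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in ry))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("rho is undefined when one input is constant")
    return cov / (sd_x * sd_y)


# Hypothetical distances for five pairs of initials (illustrative values only).
acoustic_distance = [0.8, 1.5, 2.1, 3.0, 2.6]
perceptual_distance = [0.10, 0.35, 0.30, 0.90, 0.70]

rho = spearman_rho(acoustic_distance, perceptual_distance)
print(f"Spearman's rho = {rho:.2f}")
```

Because rho depends only on rank order, a value close to 1 indicates that the acoustic distance increases almost monotonically with the perceptual distance, even if the two are not linearly related.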
