Abstract
The spectral envelope of speech can be represented efficiently by the log magnitude spectrum on a nonlinear frequency scale close to the mel scale (the mel‐log spectrum). The mel cepstrum, defined as the Fourier coefficients of the mel‐log spectrum, is also considered a suitable parameter for representing the spectral envelope. So far, however, no satisfactory synthesis filter that approximates the mel‐log spectrum has been reported. This paper presents a method of constructing the mel‐log spectrum approximation (MLSA) filter, which has a relatively simple structure and low coefficient sensitivity, together with a design example of an MLSA filter for speech synthesis. The transfer function of the MLSA filter is a Padé approximation of the exponential of the transfer function of another filter (the so‐called basic filter). Because the transfer function of the basic filter is a polynomial whose variable is the transfer function of a first‐order all‐pass filter, realizing the filter requires removing the delay‐free path from the feedback loop. With the construction method shown in this paper, the delay‐free path can easily be removed from the feedback loop of the MLSA filter. The resulting MLSA filter has a relatively simple structure and low coefficient sensitivity, and the quantization characteristics of its coefficients are also satisfactory.
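As a rough illustration of the central idea, the exponential can be replaced by a rational (Padé) function that stays accurate while its argument is small. The sketch below is not the filter design from the paper. It assumes the diagonal form of the Padé approximant (numerator and denominator of equal order), and the order 4 and the names `pade_coefficients` and `pade_exp` are chosen here only for illustration.

```python
import cmath
from math import comb, exp, factorial


def pade_coefficients(order):
    """Numerator coefficients A_0..A_order of the diagonal Pade approximant of exp(w).

    The denominator uses the same coefficients with w replaced by -w.
    """
    return [
        comb(order, l) / (comb(2 * order, l) * factorial(l))
        for l in range(order + 1)
    ]


def pade_exp(w, order=4):
    """Approximate exp(w) by P(w) / P(-w), where P is the Pade numerator polynomial."""
    coeffs = pade_coefficients(order)
    numerator = sum(a * w**l for l, a in enumerate(coeffs))
    denominator = sum(a * (-w) ** l for l, a in enumerate(coeffs))
    return numerator / denominator


if __name__ == "__main__":
    # Order 4 is an assumption made for this sketch, not a value from the abstract.
    print("coefficients:", pade_coefficients(4))
    # Compare against the exact exponential for a real and an imaginary argument.
    print("real error:", abs(pade_exp(0.5) - exp(0.5)))
    print("complex error:", abs(pade_exp(0.3j) - cmath.exp(0.3j)))
```

For order 4 the coefficients come out as 1, 1/2, 3/28, 1/84 and 1/1680. In a filter built on this idea the argument `w` would be the response of the basic filter, so the approximation holds only while that response stays small in magnitude. How the paper arranges the structure to keep it small and to remove the delay‐free path is not covered by this sketch.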