Abstract

This paper proposes a method for extracting the spectral envelope using a spectral evaluation criterion with nonuniform weighting on the log magnitude scale. In the traditional cepstral method, the spectral envelope is obtained by linear smoothing on the log scale. Consequently, the resulting log magnitude spectrum is affected by the low-level portions of the spectral fine structure, which is one of the factors that has degraded synthesized speech quality and speech recognition rates. To avoid this, the proposed method weights the peaks of the spectral fine structure more heavily than the valleys in the error criterion. The resulting spectral envelope therefore fits the peaks, which for a voiced sound are the sampling points of the true spectral envelope. The method does not require pitch extraction, and the spectral envelope can be obtained stably for both voiced and unvoiced sounds, making it suitable for automatic speech analysis. The effectiveness of the method is demonstrated with examples, and its relation to other methods, especially the improved cepstral method, is discussed.
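
The idea of weighting peaks more heavily than valleys can be illustrated with a small sketch. The abstract does not give the paper's actual criterion or solver, so everything below is an assumption made for illustration: the envelope is modeled as a low-order cosine series (cepstrum-like), and the peak emphasis is implemented as an iteratively reweighted least-squares fit in which bins lying above the current envelope get weight 1 and bins below it get a small weight. The names `cosine_basis`, `fit_envelope`, `order`, `valley_weight`, and `num_iter` are hypothetical.

```python
import numpy as np


def cosine_basis(num_bins, order):
    """Cosine series basis on a frequency grid from 0 to pi (assumed model)."""
    omega = np.pi * np.arange(num_bins) / (num_bins - 1)
    return np.cos(np.outer(omega, np.arange(order + 1)))


def fit_envelope(log_mag, order=12, valley_weight=0.01, num_iter=30):
    """Fit a smooth envelope to a log magnitude spectrum.

    Illustrative asymmetric least squares, not the paper's algorithm:
    bins above the current envelope (peaks) get weight 1.0, bins below
    it (valleys) get `valley_weight`. With valley_weight=1.0 this
    reduces to an ordinary unweighted least-squares fit.
    """
    log_mag = np.asarray(log_mag, dtype=float)
    basis = cosine_basis(len(log_mag), order)
    weights = np.ones(len(log_mag))  # first pass is a plain fit
    for _ in range(num_iter):
        sw = np.sqrt(weights)[:, None]
        # Weighted least squares via row scaling by sqrt(weight).
        coeffs, *_ = np.linalg.lstsq(basis * sw, log_mag * sw[:, 0], rcond=None)
        envelope = basis @ coeffs
        # Re-derive the weights from the sign of the residual.
        weights = np.where(log_mag > envelope, 1.0, valley_weight)
    return envelope, coeffs
```

On a synthetic spectrum whose harmonic peaks touch a known smooth envelope, calling `fit_envelope(log_mag, valley_weight=0.01)` should track the peaks more closely than `fit_envelope(log_mag, valley_weight=1.0)`, which settles near the average of peaks and valleys. No pitch estimate is used anywhere in the sketch; the weights depend only on the sign of the residual.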
