APPLICATION OF FREQUENCY MASKING IN MFCC PARAMETERIZATION OF SPEECH AGAINST BACKGROUND NOISE

K K Tomchuk

doi:10.15217/issn1684-8853.2016.3.8

Abstract

Introduction: Mel-frequency cepstral coefficients (MFCCs) are widely used for speech signal parameterization, but their effectiveness drops significantly when the signal contains a noise component. This work proposes and studies a modification of the traditional MFCC calculation algorithm that introduces additional signal transformations based on the mechanisms of speech production and perception. Results: We propose a psychoacoustic model that accounts for the frequency-masking effect in human auditory perception. In addition, considering how formant regions are formed in the voice spectrum, we propose to act on the spectral samples corresponding to harmonics of the fundamental frequency. The modified algorithm was tested in an isolated-word recognition system adapted to parameterize the speech signal with MFCCs only. The tests showed a positive effect from using the proposed additional speech signal transformations in the parameterization algorithm. Practical relevance: The proposed approach to calculating MFCCs for a speech signal segment can improve the effectiveness of MFCCs in a variety of speech applications.
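
For orientation, the traditional MFCC calculation that the abstract takes as its baseline can be sketched as follows. This is a minimal illustration of the conventional pipeline only (window, power spectrum, mel filterbank, logarithm, DCT), not the author's modified algorithm: the abstract gives no details of the masking model or the harmonic processing, so neither is shown. The function names, the Hamming window, the 26-filter bank, the 13 output coefficients, and the 2595/700 mel formula are assumptions chosen because they are common defaults.

```python
import numpy as np

def hz_to_mel(f):
    # A commonly used mel-scale approximation (assumed here, not taken from the paper).
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    # Triangular filters whose edges are equally spaced on the mel scale.
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    hz_points = mel_to_hz(mel_points)
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    bank = np.zeros((n_filters, freqs.size))
    for i in range(n_filters):
        left, center, right = hz_points[i:i + 3]
        rising = (freqs - left) / (center - left)
        falling = (right - freqs) / (right - center)
        bank[i] = np.maximum(0.0, np.minimum(rising, falling))
    return bank

def mfcc_frame(frame, sample_rate, n_filters=26, n_coeffs=13):
    # Hypothetical helper: baseline MFCCs for a single speech frame.
    n_fft = len(frame)
    windowed = frame * np.hamming(n_fft)
    power = np.abs(np.fft.rfft(windowed)) ** 2 / n_fft
    energies = mel_filterbank(n_filters, n_fft, sample_rate) @ power
    log_energies = np.log(energies + 1e-10)  # small offset avoids log(0)
    # Unnormalized DCT-II of the log filterbank energies.
    n = np.arange(n_filters)
    k = np.arange(n_coeffs)[:, None]
    basis = np.cos(np.pi * k * (n + 0.5) / n_filters)
    return basis @ log_energies

# Usage: one 25 ms frame of a 440 Hz tone sampled at 16 kHz.
sample_rate = 16000
t = np.arange(400) / sample_rate
coeffs = mfcc_frame(np.sin(2 * np.pi * 440 * t), sample_rate)
```

In a sketch like this, additional spectral transformations of the kind the abstract describes would presumably operate on `power` before the filterbank is applied, but where exactly they enter is an assumption here.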

Full Text