Feature extraction is an important step in Automatic Speech Recognition. It consists of determining the components of the audio signal that are useful for identifying linguistic content, while discarding background noise and other irrelevant information. Its main objective is to identify discriminative and robust features in the acoustic data. The derived feature vector should have low dimensionality, long-time stability, insensitivity to noise, and no correlation with other features. Meeting all of these requirements at once makes the choice of a robust feature extraction technique a significant challenge for Automatic Speech Recognition. Many studies have compared feature extraction techniques for speech recognition, but none has evaluated the criteria that should be considered when choosing one. The objective of this work is to answer some of the questions that arise when deciding which feature extraction technique to apply, through a multi-criteria comparison of different feature extraction techniques using the Weighted Scoring Method.
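As an illustration of how a Weighted Scoring Method comparison can be set up, the following minimal sketch computes a weighted average score per technique and ranks the techniques. The criterion names follow the properties listed above; the technique names (`technique_a`, `technique_b`), the weights, and the ratings are hypothetical placeholders chosen only to show the mechanics, not values from this work.

```python
from typing import Dict


def weighted_scores(
    weights: Dict[str, float],
    ratings: Dict[str, Dict[str, float]],
) -> Dict[str, float]:
    """Return the weighted average rating of each alternative.

    weights: criterion -> importance weight (any positive scale).
    ratings: alternative -> {criterion -> rating on a common scale}.
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return {
        name: sum(weights[c] * r[c] for c in weights) / total
        for name, r in ratings.items()
    }


# Hypothetical weights: noise insensitivity counts twice as much as the others.
weights = {
    "low_dimensionality": 1,
    "long_time_stability": 1,
    "noise_insensitivity": 2,
    "decorrelation": 1,
}

# Hypothetical ratings on a 1-5 scale (placeholders, not measured results).
ratings = {
    "technique_a": {
        "low_dimensionality": 4,
        "long_time_stability": 3,
        "noise_insensitivity": 2,
        "decorrelation": 5,
    },
    "technique_b": {
        "low_dimensionality": 3,
        "long_time_stability": 4,
        "noise_insensitivity": 4,
        "decorrelation": 3,
    },
}

scores = weighted_scores(weights, ratings)
# Alternatives ordered from highest to lowest weighted score.
ranking = sorted(scores, key=scores.get, reverse=True)
```

Dividing by the sum of the weights keeps the final score on the same scale as the ratings, so the weights do not need to be normalized beforehand.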