Automatic speaker recognition with enhanced swallow swarm optimization and ensemble classification model from speech signals

Kharibam Jilenkumari Devi, Khelchandra Thongam

doi:10.1007/s12652-019-01414-y

Abstract

An Automatic Speaker Recognition (ASR) system uses voice as a discriminating feature to authenticate and verify a person's identity. A robust speaker recognition system must deliver acceptable performance in environments with multiple operating parameters, so improving the performance of ASR is a significant and challenging problem. The key objective of this work is to introduce a new automatic speaker recognition system for speech signals. In the proposed work, preprocessing is performed with a least mean squares (LMS) adaptive filter. Features such as zero-crossing rate, energy, autocorrelation function, and Mel-frequency cepstral coefficients are then extracted from the denoised signals. Input samples are created from the extracted features, and their dimensionality is reduced using enhanced swallow swarm optimization. Finally, an ensemble classification model that combines long short-term memory, an improved convolutional neural network, and a support vector machine performs speaker recognition. The proposed method shows greater potential for classifying speech signals than the other models compared and achieves an accuracy of 95.69%. Accuracy, sensitivity, and specificity are used as the external quality metrics for comparison with existing methods.
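
The first two stages named in the abstract (LMS preprocessing and time-domain feature extraction) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the noise-canceller arrangement with a separate reference signal, the tap count and the step size `mu` are all assumptions made for illustration. MFCC extraction, swallow swarm feature selection and the ensemble classifier are omitted.

```python
def lms_filter(noisy, reference, taps=4, mu=0.01):
    """LMS adaptive noise canceller (assumed setup, not from the paper).

    Predicts the noise in `noisy` from `reference` and returns the
    error signal, which serves as the cleaned-signal estimate.
    """
    weights = [0.0] * taps
    cleaned = []
    for n in range(len(noisy)):
        # Most recent `taps` reference samples, zero-padded at the start.
        x = [reference[n - k] if n - k >= 0 else 0.0 for k in range(taps)]
        noise_estimate = sum(w * xi for w, xi in zip(weights, x))
        error = noisy[n] - noise_estimate
        # Standard LMS weight update: w <- w + 2 * mu * e * x
        weights = [w + 2 * mu * error * xi for w, xi in zip(weights, x)]
        cleaned.append(error)
    return cleaned


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def short_time_energy(frame):
    """Mean squared amplitude of the frame."""
    return sum(x * x for x in frame) / len(frame)


def autocorrelation(frame, lag):
    """Autocorrelation at `lag`, normalized so that lag 0 gives 1.0."""
    r0 = sum(x * x for x in frame)
    if r0 == 0:
        return 0.0
    r = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
    return r / r0
```

In practice each function would be applied to short overlapping frames of the denoised signal, and the per-frame values concatenated with the MFCCs to form the input samples the abstract describes.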

Full Text