Optimisation of training samples in recognition of overlapping speech and identification of speaker in a two speakers situation

C Lingam, S Shanthi Therese

doi:10.1504/ijaip.2020.10028516

Abstract

Recognition of overlapping speech remains a challenging problem in automatic speech recognition (ASR). In this paper, we propose a technique for overlapping speech recognition that integrates the firefly optimisation technique. Overlapped segments are analysed for the dominant frequencies present in the mixture. We have created an audio splitting function; the split audio signals are converted into mel cepstral coefficients, and the intensity variations of the signals are indicated by their cepstrum. Phoneme density updated cepstrum (PDUC) features are extracted from both spectrum feature analysis and mel frequency cepstral coefficients (MFCC). The firefly optimisation technique is then used to cluster the features and select the most relevant ones. Datasets from the speech separation challenge (SSC) corpus are used to evaluate the results. The results indicate that 20% to 30% of the training samples are sufficient to achieve a recognition accuracy above 90%.
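
The abstract does not specify how the firefly technique is applied to the PDUC features, so the following is only a minimal sketch of the generic firefly update rule (dimmer fireflies move toward brighter ones, with attractiveness decaying with distance). The function name `firefly_minimise`, its parameters, and the toy `sphere` cost are assumptions made for illustration; they are not the authors' implementation, and in a feature selection setting the cost would be replaced by a measure of how well a candidate feature subset performs.

```python
import math
import random


def firefly_minimise(cost, dim, n_fireflies=15, n_iters=60,
                     alpha=0.2, beta0=1.0, gamma=1.0,
                     bounds=(-2.0, 2.0), seed=0):
    """Generic firefly search (hypothetical helper); lower cost = brighter firefly."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Random initial positions inside the search bounds.
    swarm = [[rng.uniform(lo, hi) for _ in range(dim)]
             for _ in range(n_fireflies)]
    costs = [cost(x) for x in swarm]
    best_cost = min(costs)
    best_x = list(swarm[costs.index(best_cost)])

    for _ in range(n_iters):
        for i in range(n_fireflies):
            for j in range(n_fireflies):
                if costs[j] < costs[i]:  # j is brighter, so i moves toward j
                    r2 = sum((a - b) ** 2 for a, b in zip(swarm[i], swarm[j]))
                    beta = beta0 * math.exp(-gamma * r2)  # attractiveness decays with distance
                    swarm[i] = [
                        # Attraction step plus a small random step, clipped to the bounds.
                        min(hi, max(lo, a + beta * (b - a) + alpha * (rng.random() - 0.5)))
                        for a, b in zip(swarm[i], swarm[j])
                    ]
                    costs[i] = cost(swarm[i])
                    if costs[i] < best_cost:
                        best_cost, best_x = costs[i], list(swarm[i])
        alpha *= 0.97  # gradually shrink the random step
    return best_x, best_cost


def sphere(x):
    """Toy cost with its minimum at the origin (assumption, stands in for a real criterion)."""
    return sum(v * v for v in x)


best_x, best_cost = firefly_minimise(sphere, dim=3)
```

The best position found is tracked separately from the swarm, so the returned cost can never be worse than the best initial firefly. The seed is fixed only to make the sketch reproducible.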

Full Text