Abstract

Problem statement: Research on Arabic speaker recognition has relied on local databases that are unavailable to the public. In this study we investigate Arabic speaker recognition using a publicly available database, namely Babylon Levantine, available from the Linguistic Data Consortium (LDC). Approach: Among the different methods for speaker recognition we focus on Hidden Markov Models (HMM). We studied the effect of both the parameters of the HMM models and the size of the speech features on the recognition rate. Results: To carry out this study, we divided the database into small and medium size datasets. For each subset, we measured the effect of the system parameters on the recognition rate. The parameters we varied were the number of HMM states, the number of Gaussian mixtures per state, and the number of speech feature coefficients. The results show that, in general, the recognition rate increases with the number of mixtures until it reaches a saturation level that depends on the data size and the number of HMM states. Conclusion/Recommendations: The effect of the number of states depends on the data size. For small data, a low number of states gives a higher recognition rate. For larger data, the number of states has a very small effect at a low number of mixtures and a negligible effect at a high number of mixtures.
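
The abstract describes a per-speaker HMM setup in which the number of HMM states, the number of Gaussian mixtures per state, and the number of speech feature coefficients are varied. The sketch below illustrates such a setup, assuming MFCC features and the hmmlearn/librosa libraries, neither of which is named in the paper; the constants and function names are illustrative only, not the authors' implementation.

```python
# Minimal sketch of a GMM-HMM speaker-identification experiment:
# one model per speaker, with the states/mixtures/feature-size
# parameters that the study varies. All values are illustrative.
import numpy as np
import librosa
from hmmlearn.hmm import GMMHMM

N_STATES = 3   # number of HMM states (varied in the study)
N_MIX = 8      # Gaussian mixtures per state (varied in the study)
N_MFCC = 13    # number of speech feature coefficients (varied in the study)


def mfcc_features(wav_path):
    """Return an (n_frames, N_MFCC) feature matrix for one utterance."""
    y, sr = librosa.load(wav_path, sr=None)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=N_MFCC)
    return mfcc.T  # hmmlearn expects (samples, features)


def train_speaker_model(wav_paths):
    """Fit one GMM-HMM on all training utterances of a single speaker."""
    feats = [mfcc_features(p) for p in wav_paths]
    X = np.vstack(feats)
    lengths = [f.shape[0] for f in feats]
    model = GMMHMM(n_components=N_STATES, n_mix=N_MIX,
                   covariance_type="diag", n_iter=20)
    model.fit(X, lengths)
    return model


def identify(models, wav_path):
    """Pick the speaker whose model gives the highest log-likelihood."""
    X = mfcc_features(wav_path)
    return max(models, key=lambda spk: models[spk].score(X))
```

Used as `models = {spk: train_speaker_model(paths) for spk, paths in train_data.items()}` followed by `identify(models, test_wav)`, the recognition rate reported in the paper would correspond to the fraction of test utterances assigned to the correct speaker.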

Highlights

  • The literature on Arabic speaker recognition includes a good number of studies, though far fewer than for the English language

  • In this study we focus on the Babylon dataset (BBL), which is available from the Linguistic Data Consortium (LDC)

  • The authors report that the best speaker-identification result was obtained with a 2-state, single-mixture Hidden Markov Model (HMM)

Introduction

The literature on Arabic speaker recognition includes a good number of studies, though far fewer than for the English language. Among those studies, very few used well-known datasets; most relied on local datasets (unavailable for extended or further research by other groups) containing a few digits or primitive words, just enough to qualify as a local dataset. This makes it hard to compare the different systems and their results. In this study we focus on the Babylon dataset (BBL), which is available from the Linguistic Data Consortium (LDC), and investigate speaker recognition using Hidden Markov Models with Gaussian mixtures.
