Abstract

Speech recognition has become one of the most significant parts of human-computer interaction due to emergence of new technologies such as smartphone, smart watch and many modern technologies, therefore the need of an ASR for local languages is felt. The basic aim of this paper is to develop an isolated digits recognition for Pashto language, using deep CNN. The database of Pashto digits from 0 to 9 with 50 utterance for each digits is used. Twenty MFCC features extracted for each isolated digit and fed as input to CNN. The network has been used for the proposed system is deep up to 4 convolutional layers, followed by ReLU and max-pooling layers. The network has been trained on the 50% of data and the rest of the data was used for testing. The total average of 84.17% accuracy was achieved for testing which show 7.32% better performance as compared to existing similar works.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call