Abstract

Epochs play an important role in the training of the datasets. The number of epochs decides whether the data is overtrained or not. The results depend on the training of these datasets. Speech recognition is widely used application these days. Speech Recognition with deep learning has changed the perspective of the world to look at the technology. Our project is based on deep learning speech recognition system where there are audio files and text transcripts in the datasets. The audio files are trained with the recognition model, and text transcripts are trained with a language model. The dataset used for this project consists of audio clips recorded in three different environments, namely clean, white noise and continuous noise. It was proved that in a clean environment and more epochs, as the model gets trained more, gives the least Word Error Rate of 14.41%.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call