Abstract
In this paper, we present state-of-the-art models for Automatic Speech Recognition in Hebrew, obtained through self-supervised training. The motivation for self-supervised learning is that, although it is unlikely to match the accuracy of a fully supervised approach, it can still achieve strong results with a relatively small amount of data. This training scheme allows us to train a model on unlabeled data (or to use a pre-trained model, which is usually more accessible). Its goal in the initial unsupervised phase is to learn good representations from raw audio samples that are useful for speech recognition tasks, without any labeled data. The model can then be fine-tuned on a particular dataset for a specific purpose, meaning that our involvement occurs mainly in the last layers of the model. This kind of training has proved very powerful. We present a complete framework, from model to practice, with simulations and model training, and report impressive results on Hebrew.
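The pretrain-then-fine-tune recipe described above can be illustrated with a minimal sketch. The abstract does not name the architecture, so the following assumes a wav2vec 2.0-style model loaded through the HuggingFace transformers library; the checkpoint name, vocabulary size, and dummy batch are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch of fine-tuning a self-supervised speech model for Hebrew ASR.
# All names and sizes below are assumptions for illustration only.
import torch
from transformers import Wav2Vec2ForCTC

# Load a multilingual self-supervised checkpoint; only the small CTC head
# on top is newly initialized for the target language.
model = Wav2Vec2ForCTC.from_pretrained(
    "facebook/wav2vec2-large-xlsr-53",  # hypothetical pre-trained checkpoint
    vocab_size=32,                      # assumed Hebrew character vocabulary size
    ctc_loss_reduction="mean",
)

# Freeze the convolutional feature encoder learned in the unsupervised phase,
# so fine-tuning mainly updates the later transformer layers and the CTC head,
# mirroring the abstract's point that our involvement occurs in the last layers.
model.freeze_feature_encoder()

# One fine-tuning step on a dummy batch: 1 s of 16 kHz raw audio per example
# and stand-in character labels (index 0 is reserved for the CTC blank).
audio = torch.randn(2, 16_000)
labels = torch.randint(1, 32, (2, 20))
loss = model(input_values=audio, labels=labels).loss
loss.backward()  # in practice, wrap this in an optimizer loop over real data
```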