Abstract
Turkish speech recognition studies have been accelerated recently. With these efforts, not only available speech and text corpus which can be used in recognition experiments but also proposed new methods to improve accuracy has increased. Agglutinative nature of Turkish causes out of vocabulary (OOV) problem in Large Vocabulary Continuous Speech Recognition (LVCSR) tasks. In order to overcome OOV problem, usage of sub-word units has been proposed. In addition to LVCSR experiments, there have been some efforts to implement a speech recognizer in limited domains such as radiology. In this paper, we will present Turkish speech recognition software, which has been developed by utilizing recent studies. Both interface of software and recognition accuracies in two different test sets will be summarized. The performance of software has been evaluated using radiology and large vocabulary test sets. In order to solve OOV problem practically, we propose to adapt language models using frequent words or sentences. In recognition experiments, 90% and 44% word accuracies have been achieved in radiology and large vocabulary test sets respectively.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.