Abstract

Language identification is a major challenge in language engineering. It arises alongside tasks such as speech recognition, machine translation, cross-language information retrieval, and the creation of intelligent dialogue systems. This article introduces an intelligent language identification technology based on speech recognition and statistical methods of spectrogram analysis. It puts forward an approach to automatically identifying the language of a spoken sample uploaded to the system, in particular from video streaming services such as YouTube. The article focuses on the automatic identification of spoken language, taking into account the output of several speech recognition solutions, whose recognition of speech and conversion of it into text may be correct or incorrect. The resulting algorithm is demonstrated on the Ukrainian and Russian languages. The identification quality is almost 100% for utterances longer than 30 s, 98% for utterances of 30 s, and 89.6% for utterances of 5 s. In addition, system performance depends on the streaming speed, so the system operates in real time.
