Normalization of Ukrainian letters, numerals, and measures for natural language processing

Maksym Vakulenko

doi:10.1093/llc/fqac090

Abstract

There is still little linguistic data on the Ukrainian language in the context of natural language processing (NLP). At the same time, given the realistic prospects of Ukraine's European integration, its state language will probably soon become one of the languages of the European Community. This prospect draws growing interest in including the Ukrainian language in multilingual NLP scenarios. This article aims to provide some important data necessary for text-to-speech (TTS) conversion of Ukrainian. Relying on Ukrainian phonetics and grammar, we formulate the basic rules governing the normalization of Ukrainian letters, numerals, and measures, which help obtain Ukrainian speech data. The results are presented in tables and strings convenient for coding and further implementation in linguistic tools. These rules are necessary for the TTS tasks arising in various NLP applications associated with speech synthesis, machine translation, information retrieval, etc.

Journal: Digital Scholarship in the Humanities
Publication Date: Dec 29, 2022
Citations: 2
