The First Azeri (Azerbaijani) Language Next Word Predictor

Ali Pourmohammad, Mensur Gulami, Javid Mahmudov, Yusif Aliyev, Rovshan Akberov, Anar Sultani

doi:10.23977/isspj.2020.51001

Open Access

https://doi.org/10.23977/isspj.2020.51001

Abstract

Azeri (Azerbaijani) is one of more than 50 Turkic languages, and it has been little studied with modern signal processing algorithms. This paper tackles next word prediction for Azeri using Hidden Markov Models (HMMs) and Natural Language Processing (NLP) principles, implemented in the Python high-level programming language. The software comprises a small Azeri vocabulary database, various Python libraries, an HMM model, and a Web-based interface. The database was constructed by a predictor parser, implemented for the first time for the Azeri language. It comprises the most common Azeri words, which are used to generate HMM-based word pairs. The model was trained on 90% of the database; predicting the next 5 words on the test data gave 54% accuracy.

Highlights

Azeri (Azerbaijani) is one of more than 50 Turkic languages [1], and it has been little studied in terms of modern signal processing algorithms and modern language technology applications [2]. Despite a large body of research on other languages since the 1980s, Azeri remains little investigated; the existing studies applied Automatic Speech Recognition (ASR), Text-To-Speech (TTS), or Authorship Recognition (AR) algorithms to the language, as in the “Dilmanc” project [2,3,4,5,6]. This research addresses word prediction for the Azeri language for the first time.
This paper tackles HMM-based word prediction for Azeri using Natural Language Processing (NLP) principles, implemented in the Python high-level programming language.
The database was constructed by a predictor parser, implemented for the first time for the Azeri language.
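
The page does not show how the predictor parser works, so the following is only a minimal sketch of one way word pairs could be collected from raw Azeri text. The function names `tokenize` and `word_pairs` and the sample sentences are hypothetical and not taken from the paper. The one language-specific detail is casing: in Azeri, "I" lowercases to "ı" and "İ" to "i", which Python's default `str.lower()` does not do.

```python
import re
from collections import Counter

# Assumption: Azeri casing rules are applied before lowercasing,
# because str.lower() alone maps "I" to "i" and "İ" to "i" + combining dot.
_AZERI_LOWER = str.maketrans({"I": "ı", "İ": "i"})


def tokenize(text):
    """Lowercase with Azeri casing rules and return the letter-only tokens."""
    lowered = text.translate(_AZERI_LOWER).lower()
    # [^\W\d_]+ matches runs of Unicode letters (including ə, ö, ü, ğ, ı, ş, ç).
    return re.findall(r"[^\W\d_]+", lowered)


def word_pairs(sentences):
    """Count adjacent (previous word, next word) pairs within each sentence."""
    pairs = Counter()
    for sentence in sentences:
        tokens = tokenize(sentence)
        pairs.update(zip(tokens, tokens[1:]))
    return pairs


if __name__ == "__main__":
    # Illustrative sentences only; not data from the paper.
    counts = word_pairs(["Mən evə gedirəm.", "Mən evə gəldim."])
    for (prev_word, next_word), n in counts.most_common():
        print(prev_word, next_word, n)
```

Pairs are counted per sentence so that the last word of one sentence is not paired with the first word of the next.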

Summary

INTRODUCTION

Azeri (Azerbaijani) is one of more than 50 Turkic languages [1], and it has been little studied in terms of modern signal processing algorithms and modern language technology applications [2]. Despite a large body of research on other languages since the 1980s, Azeri remains little investigated; the existing studies applied Automatic Speech Recognition (ASR), Text-To-Speech (TTS), or Authorship Recognition (AR) algorithms to the language, as in the “Dilmanc” project [2,3,4,5,6]. This research addresses word prediction for the Azeri language. Reducing typing time in electronic communication through word prediction would be very helpful in day-to-day use. During the last decade, word prediction for typing in electronic communication has been one of the most discussed topics in the Natural Language Processing research domain [7]. HMMs are briefly reviewed and the training of the model is discussed. The collection of the database is then explained.
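
To make the task concrete, here is a hedged sketch of next word prediction from word pairs, with a train/test split and a top-5 accuracy measure similar in spirit to the evaluation reported in the abstract. It is a plain first-order Markov chain over observed words (a bigram frequency model), not a full HMM with hidden states, and it is not the authors' implementation, which this page does not show. All function names, the seed, and the toy pairs are assumptions made for illustration.

```python
import random
from collections import Counter, defaultdict


def train(pairs):
    """Build, for each word, a Counter of the words observed right after it."""
    followers = defaultdict(Counter)
    for prev_word, next_word in pairs:
        followers[prev_word][next_word] += 1
    return followers


def predict(followers, word, k=5):
    """Return up to k most frequent followers of `word` (empty if unseen)."""
    if word not in followers:
        return []
    return [w for w, _ in followers[word].most_common(k)]


def top_k_accuracy(followers, test_pairs, k=5):
    """Fraction of test pairs whose true next word is among the top k guesses."""
    if not test_pairs:
        return 0.0
    hits = sum(nxt in predict(followers, prev, k) for prev, nxt in test_pairs)
    return hits / len(test_pairs)


def split(pairs, train_fraction=0.9, seed=0):
    """Shuffle the pairs reproducibly and split them into train and test parts."""
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


if __name__ == "__main__":
    # Toy word pairs for illustration only; not data from the paper.
    pairs = [("mən", "evə"), ("mən", "evə"), ("mən", "sənə"),
             ("evə", "gedirəm"), ("evə", "gəldim"), ("sənə", "dedim"),
             ("mən", "evə"), ("evə", "gedirəm"), ("mən", "sənə"),
             ("sənə", "dedim")]
    train_pairs, test_pairs = split(pairs)
    model = train(train_pairs)
    print(predict(model, "mən"))
    print(top_k_accuracy(model, test_pairs))
```

A word that never appears as the first element of a training pair yields no predictions and therefore counts as a miss, which is one reason a small vocabulary database limits the achievable accuracy.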

HMMS AND THE TRAINING ON THE MODEL

COLLECTION OF THE DATABASE

THE SOFTWARE

SOME EXPERIMENTAL RESULTS

CONCLUSIONS

Journal: Information Systems and Signal Processing Journal
Publication Date: Jan 1, 2020
License type: cc-by
