Abstract

A morphologically diverse language produces a very large inventory of word forms. Uyghur is an agglutinative language in which affixes attach before and after the stem to form a large number of words. Starting from an analysis of Uyghur morphology, the first part focuses on three features of the language: word formation and ambiguity, cohesion, and phonetic changes. The second part discusses the characteristics, applications, and research purposes of Uyghur morphological analysis. The third part describes in detail the methods used in domestic and international research on morphological analysis, together with their characteristics, advantages, and disadvantages. The fourth part introduces several Uyghur stemming methods and corresponding implementation cases, and shows how they reflect the characteristics of the language. The fifth part introduces a concatenated embedding, built from word embeddings and character-level embeddings, that is used to extract Uyghur stems with a BiLSTM-CRF model. First, the word embedding of each word is obtained from an unlabeled Uyghur corpus. Second, the character feature embedding is obtained and directly concatenated with the word embedding to give the concatenated embedding representation. Finally, the BiLSTM-CRF model is used to extract Uyghur stems, reaching an accuracy of 89.21%. The conclusion summarizes the work on stem extraction, looks ahead to future trends in Uyghur morphological analysis research, and discusses the direction of information processing for agglutinative languages.
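
The concatenation step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the dimensions, the lookup tables, and the function names `char_feature` and `concatenated_embedding` are hypothetical, and a simple average of character vectors stands in for whatever learned character encoder the paper uses. The resulting vector is the kind of per-word input a BiLSTM-CRF tagger would consume.

```python
from typing import Dict, List

# Toy dimensions, chosen only for illustration (assumptions, not from the paper).
WORD_DIM = 4
CHAR_DIM = 3


def char_feature(word: str, char_table: Dict[str, List[float]]) -> List[float]:
    """Average the character vectors of a word.

    This is a stand-in for a learned character-level encoder; unknown
    characters map to a zero vector.
    """
    vectors = [char_table.get(ch, [0.0] * CHAR_DIM) for ch in word]
    if not vectors:
        return [0.0] * CHAR_DIM
    # zip(*vectors) iterates over the columns, i.e. one embedding dimension at a time.
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def concatenated_embedding(
    word: str,
    word_table: Dict[str, List[float]],
    char_table: Dict[str, List[float]],
) -> List[float]:
    """Directly concatenate the word embedding with the character feature embedding."""
    # Out-of-vocabulary words fall back to a zero word vector (an assumption).
    word_vec = word_table.get(word, [0.0] * WORD_DIM)
    return word_vec + char_feature(word, char_table)


if __name__ == "__main__":
    # Hypothetical lookup tables; real ones would be trained on an unlabeled corpus.
    word_table = {"kitablar": [0.1, 0.2, 0.3, 0.4]}
    char_table = {"k": [1.0, 0.0, 0.0], "i": [0.0, 1.0, 0.0], "t": [0.0, 0.0, 1.0]}
    vec = concatenated_embedding("kitablar", word_table, char_table)
    print(len(vec))  # 7, i.e. WORD_DIM + CHAR_DIM
```

Because the character part is computed from the spelling alone, a word missing from the word table still receives a non-trivial representation, which is one plausible motivation for concatenating the two embeddings in a language with many affixed word forms.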

