Abstract

Over the past seven years, the Language Research Institute of Inner Mongolia University has constructed a 500,000-word Mongolian dependency treebank. This syntactic treebank provides a valuable data platform for language research and information processing. To make effective use of the treebank, we have designed and implemented a graphical syntactic information retrieval system based on it. As an application system, it offers search and statistical analysis at the word, phrase, syntactic fragment, and syntactic structure levels.

Highlights

  • The Language Research Institute of Inner Mongolia University constructed a 1-million-word modern Mongolian corpus over eight years, from 1984 to 1991, and has since expanded it twice into a 10-million-word corpus

  • From 2008 to 2011, with funding from the National Social Science Foundation and the National Natural Science Foundation, the Language Research Institute of Inner Mongolia University constructed a 500,000-word Mongolian dependency treebank (MDTB) by combining automatic parsing with manual proofreading. The treebank is based on middle school Mongolian textbooks extracted from the 1-million-word modern Mongolian corpus [10]

  • MDTB has an annotation set of 17 dependency relations under 5 categories [11]

Summary

INTRODUCTION

From 2008 to 2011, with funding from the National Social Science Foundation and the National Natural Science Foundation, the Language Research Institute of Inner Mongolia University constructed a 500,000-word Mongolian dependency treebank (MDTB) by combining automatic parsing with manual proofreading. The treebank is based on middle school Mongolian textbooks extracted from the 1-million-word modern Mongolian corpus [10]. MDTB uses two forms of annotation: bracket annotation and graphical annotation. The treebank contains 461,240 words in 31,722 sentences. Researchers can use it for statistical analysis and for extracting example sentences. The retrieval functions support queries and statistical analysis at the level of words, phrases, sentence constituents, syntactic fragments, and syntactic structures.
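To make the kinds of retrieval described above concrete, the following is a minimal sketch of word-level lookup, relation-level lookup, and relation statistics over a dependency treebank. It is not the system's actual implementation: the `Token` layout, the function names, the relation labels (`SUBJ`, `OBJ`, `ATTR`, `ROOT`), and the placeholder words are all assumptions made for illustration and are not taken from MDTB.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    index: int     # 1-based position in the sentence
    form: str      # surface word form (placeholder strings here)
    head: int      # index of the governing token, 0 for the root
    relation: str  # dependency label (hypothetical, not MDTB's tag set)


# Toy treebank: two invented sentences with placeholder words.
TREEBANK = [
    [Token(1, "w_a", 3, "SUBJ"), Token(2, "w_b", 3, "OBJ"), Token(3, "w_c", 0, "ROOT")],
    [Token(1, "w_d", 2, "ATTR"), Token(2, "w_a", 3, "SUBJ"), Token(3, "w_e", 0, "ROOT")],
]


def find_word(treebank, form):
    """Return (sentence_no, token) pairs whose surface form equals `form`."""
    return [(s, t) for s, sent in enumerate(treebank) for t in sent if t.form == form]


def find_relation(treebank, relation):
    """Return (sentence_no, dependent_form, head_form) for each arc with this label."""
    hits = []
    for s, sent in enumerate(treebank):
        by_index = {t.index: t for t in sent}
        for t in sent:
            if t.relation == relation:
                head_form = by_index[t.head].form if t.head else "<root>"
                hits.append((s, t.form, head_form))
    return hits


def relation_counts(treebank):
    """Count how often each dependency label occurs in the treebank."""
    return Counter(t.relation for sent in treebank for t in sent)
```

Retrieval of larger syntactic fragments or whole structures would need subtree matching on top of this arc-level lookup, which the sketch does not attempt.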

Display Module of the Syntactic Tree
Design of Mongolian Syntactic Retrieval Algorithm
A Syntactic Retrieval Example
Findings
CONCLUSION