Abstract

Encoding richer structure in each syntactic tree makes a parsed text corpus more useful. Existing work on Thai treebank construction was developed to address the lack of high-level syntactic resources, but it is not yet sufficient for Thai Natural Language Processing. Furthermore, existing Thai treebanks provide either a syntactic (constituency) structure or a dependency structure, but not both. This paper presents the construction of a hybrid structural Thai treebank that includes both syntactic and dependency structures, a tool for converting between constituency and dependency parse trees, and a web-based GUI for parse tree visualization. To build the hybrid treebank, hundreds of constituent trees are manually annotated with a predicate head for each phrase. Once the set of annotated constituent trees is obtained, the conversion procedure is performed by identifying each annotated head and its dependents. In our experiments, features of the hybrid treebank are extracted and illustrated. Finally, difficulties and issues in constructing the hybrid Thai treebank are discussed.
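The conversion step described above, from head-annotated constituent trees to dependencies, can be illustrated with a minimal Python sketch. This is not the paper's tool: the `Node` class, the `is_head` flag, the function names, and the English toy sentence are all hypothetical, and the sketch assumes that exactly one child of every phrase is marked as its head and that words in the sentence are unique.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    """Hypothetical constituent-tree node; `is_head` marks the annotated head child."""
    label: str
    children: List["Node"] = field(default_factory=list)
    word: Optional[str] = None   # set only on leaf (word-level) nodes
    is_head: bool = False        # True if this node is the head of its parent phrase


def lexical_head(node: Node) -> str:
    """Follow the annotated head child downward until a word is reached."""
    if node.word is not None:
        return node.word
    heads = [child for child in node.children if child.is_head]
    if len(heads) != 1:
        raise ValueError(f"phrase {node.label!r} needs exactly one annotated head")
    return lexical_head(heads[0])


def to_dependencies(root: Node) -> List[Tuple[str, str]]:
    """Return (dependent, head) arcs derived from the head annotations."""
    arcs = [(lexical_head(root), "ROOT")]

    def visit(node: Node) -> None:
        if node.word is not None:
            return
        head_word = lexical_head(node)
        for child in node.children:
            # Every non-head child depends on the lexical head of its phrase.
            if not child.is_head:
                arcs.append((lexical_head(child), head_word))
            visit(child)

    visit(root)
    return arcs


# Toy tree: (S (NP dogs) (VP* (V* chase) (NP cats))), where * marks the head child.
tree = Node("S", [
    Node("NP", [Node("N", word="dogs", is_head=True)]),
    Node("VP", [
        Node("V", word="chase", is_head=True),
        Node("NP", [Node("N", word="cats", is_head=True)]),
    ], is_head=True),
])

arcs = to_dependencies(tree)
print(arcs)
```

In this toy tree the verb is the lexical head of the sentence, so both nouns attach to it. A real converter would track token positions rather than word strings, so that repeated words stay distinct, and would also assign dependency labels.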
