Abstract
Natural language understanding (NLU) is a core technique for implementing natural user interfaces. In this study, we propose a neural network architecture that learns syntax vector representations by exploiting the correspondence between texts and their syntactic structures. To represent the syntactic structure of a sentence, we use three methods: dependency trees, phrase structure trees, and part-of-speech tagging. Building on a pretrained transformer, we propose text-to-vector and syntax-to-vector projection approaches. The texts and syntactic structures are projected onto a common vector space, and the distance between the two vectors is minimized according to the correspondence property to learn the syntax representation. We conducted extensive experiments to verify the effectiveness of the proposed methodology on Korean corpora (Weather, Navi, and Rest) and English corpora (ATIS, SNIPS, Simulated Dialogue-Movie, Simulated Dialogue-Restaurant, and NLU-Evaluation). The experiments show that our model is effective at capturing syntactic information and that the learned syntax vector representations are useful.
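As a rough illustration of the projection idea described above, the following sketch encodes a sentence and a linearized form of its syntactic structure with a pretrained transformer, projects both into a common space, and minimizes the distance between the corresponding pair. This is a minimal sketch, not the authors' implementation; the model name, projection dimension, mean pooling, and cosine-distance loss are all illustrative assumptions.

```python
# Minimal sketch (assumptions noted above, not the authors' code) of
# text-to-vector and syntax-to-vector projection into a common space.
import torch
import torch.nn as nn
from transformers import AutoTokenizer, AutoModel

MODEL_NAME = "bert-base-multilingual-cased"  # assumption: any pretrained encoder
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
encoder = AutoModel.from_pretrained(MODEL_NAME)

proj_text = nn.Linear(encoder.config.hidden_size, 256)    # text -> common space
proj_syntax = nn.Linear(encoder.config.hidden_size, 256)  # syntax -> common space

def embed(sentences):
    """Mean-pool the encoder's last hidden states into one vector per input."""
    batch = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
    hidden = encoder(**batch).last_hidden_state          # (B, T, H)
    mask = batch["attention_mask"].unsqueeze(-1)         # (B, T, 1)
    return (hidden * mask).sum(1) / mask.sum(1)          # (B, H)

texts = ["book a flight to Seoul"]
syntaxes = ["VERB DET NOUN ADP PROPN"]  # e.g., a linearized POS-tag sequence

z_text = proj_text(embed(texts))
z_syntax = proj_syntax(embed(syntaxes))

# Correspondence property: a matching text/syntax pair should lie close together.
loss = (1 - nn.functional.cosine_similarity(z_text, z_syntax)).mean()
loss.backward()  # train the projections (and optionally the encoder) to align the spaces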
Highlights
Natural language understanding (NLU) is a core technique for implementing dialogue interfaces and information extraction
To learn the syntax representation, the text and its corresponding syntactic structure are projected onto a common vector space, and the distance between the two vectors is minimized according to the correspondence property
We confirmed that the text vectors and the syntax vectors are effectively learned in the vector space
Summary
Natural language understanding (NLU) is a core technique for implementing dialogue interfaces and information extraction. The goal of NLU is to extract meaning from natural language and infer user intention. For a deeper and more accurate understanding of natural language, syntax analysis should be conducted alongside semantic analysis, because the meaning of a sentence can change significantly with a slight syntactic change. Deep-learning-based NLU techniques have recently shown tremendous success in learning embeddings that capture both the semantics and the syntax of a text. There have been many attempts to make BERT lighter by reducing the number of learning parameters, or to improve its pretraining, such as the Robustly Optimized BERT Pretraining Approach (RoBERTa) [2] and A Lite BERT for Self-supervised Learning of Language Representations (ALBERT) [3].
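The abstract lists dependency trees, phrase structure trees, and part-of-speech tagging as the syntactic representations. As a hedged sketch of how such inputs could be obtained and linearized before encoding (the paper's exact preprocessing is not specified here), the example below uses spaCy to produce a POS-tag sequence and flattened dependency triples; a phrase structure tree would require a separate constituency parser, and the linearization format shown is an assumption.

```python
# Minimal sketch (assumed preprocessing, not the authors' pipeline):
# derive linearized syntactic representations that a syntax encoder
# could consume, using spaCy's tagger and dependency parser.
import spacy

nlp = spacy.load("en_core_web_sm")  # assumption: any tagger/parser would do
doc = nlp("Book a table for two at the restaurant")

# Part-of-speech tag sequence, one tag per token.
pos_sequence = " ".join(tok.pos_ for tok in doc)

# Dependency tree flattened into (relation, head, dependent) triples.
dep_sequence = " ".join(f"({tok.dep_} {tok.head.text} {tok.text})" for tok in doc)

print(pos_sequence)  # e.g., "VERB DET NOUN ADP NUM ADP DET NOUN"
print(dep_sequence)  # e.g., "(ROOT Book Book) (det table a) (dobj Book table) ..."
```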