Abstract

We examine a methodology using neural language models (LMs) for analyzing the word order of language. This LM-based method has the potential to overcome the difficulties existing methods face, such as the propagation of preprocessor errors in count-based methods. In this study, we explore whether the LM-based method is valid for analyzing the word order. As a case study, this study focuses on Japanese due to its complex and flexible word order. To validate the LM-based method, we test (i) parallels between LMs and human word order preference, and (ii) consistency of the results obtained using the LM-based method with previous linguistic studies. Through our experiments, we tentatively conclude that LMs display sufficient word order knowledge for usage as an analysis tool. Finally, using the LM-based method, we demonstrate the relationship between the canonical word order and topicalization, which had yet to be analyzed by large-scale experiments.

Highlights

  • Speakers sometimes have a range of options for word order in conveying a similar meaning

  • We demonstrate the validity of the language models (LMs)-based method by showcasing two types of parallels: (i) word order preference of LMs showing parallels with that of humans, and (ii) the results obtained with the LM-based method and those with previous methods being consistent on various claims on canonical word order

  • The word orders supported by character-based LM (CLM) and subword-based LM (SLM) are highly correlated with workers, with the Pearson correlation coefficient of 0.89 and 0.90, respectively

Read more

Summary

Introduction

Speakers sometimes have a range of options for word order in conveying a similar meaning. A teacher gave a book to a student Even for such a particular alternation, several studies (Bresnan et al, 2007; Hovav and Levin, 2008; Colleman, 2009) investigated the factors determining this word order and found that the choice is not random. For analyzing such linguistic phenomena, linguists repeat the cycle of constructing hypotheses and testing their validity, usually through psychological experiments or count-based methods. These approaches sometimes face difficulties, such as scalability issues in psychological order!: 質に 影響を 与えた

Objectives
Methods
Results
Conclusion

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.