Abstract

This paper presents a case study of the use of the NINJAL Parsed Corpus of Modern Japanese (NPCMJ) for syntactic research. NPCMJ is the first phrase structure-based treebank for Japanese that is specifically designed for application in linguistic (in addition to NLP) research. After discussing some basic methodological issues pertaining to the use of treebanks for theoretical linguistics research, we introduce our case study on the status of the Coordinate Structure Constraint (CSC) in Japanese, showing that NPCMJ enables us to easily retrieve examples that support one of the key claims of Kubota and Lee (2015): that the CSC should be viewed as a pragmatic rather than a syntactic constraint. The corpus-based study we conducted moreover revealed a previously unnoticed tendency that is highly relevant for further clarifying the principles governing the empirical data in question. We conclude the paper by briefly discussing some further methodological issues, brought up by our case study, that pertain to the relationship between linguistic research and corpus development.

Highlights

  • This paper presents a case study of applying the NINJAL Parsed Corpus of Modern Japanese (NPCMJ; http://NPCMJ.ninjal.ac.jp/) for syntactic research [LiLT volume 18, issue 3]

  • We have presented a case study of using the NPCMJ corpus for theoretical linguistics research

  • In the first part of the paper, we discussed some methodological issues pertaining to the use of treebanks for theoretical research in order to situate the present case study in a larger context


Summary

Introduction

This paper presents a case study of applying the NINJAL Parsed Corpus of Modern Japanese (NPCMJ; http://NPCMJ.ninjal.ac.jp/) for syntactic research. The treebank search identified a tendency that is arguably difficult to find by alternative methods (such as introspective judgments and unannotated or lightly annotated linguistic data), but which is relevant for further clarifying the principles governing the empirical data in question. With these results, we hope to convince the reader that treebanks are highly effective tools for addressing questions that have direct theoretical relevance. We also include a discussion of basic methodological issues pertaining to the use of treebanks, since we believe that one can exploit the full power of treebanks only by having an accurate knowledge of what they are. For this reason, it is important for users of treebanks to understand at least some of the key issues involved in their construction.

