Abstract
This paper presents an in-depth analysis of the vocabulary used in the Corpus of Taiwanese Learner of Japanese (CTLJ). Our method first applies the Japanese morphological analyzer MeCab to segment the learners' original Japanese writings in CTLJ into morphemes, and then proceeds with a morpheme-level analysis of errors in grammar and usage, a process that has been carried out twice in the past three years. To highlight the words characteristic of Taiwanese learners' Japanese, CTLJ is compared with a corpus of current Japanese constructed by the author. The results indicate that the students' original Japanese essays in CTLJ contain more than 390 thousand morpheme tokens, corresponding to around 13 thousand morpheme types. Nouns amount to 7,400 types, or 57.2% of all morpheme types, and verbs to 3,100 types (24.2%). In addition, comparisons between CTLJ and the above-mentioned corpus of current Japanese help instructors grasp how learners actually use vocabulary and reveal which items are particularly prone to errors, thereby enabling them to provide apt and systematic instruction to the learners.
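The token, type, and part-of-speech figures reported above can be illustrated with a minimal sketch. It assumes the text has already been segmented into (lemma, part-of-speech) pairs, for example by a morphological analyzer such as MeCab; the function name `summarize` and the toy input are hypothetical and are not taken from CTLJ or from the author's actual procedure.

```python
from collections import Counter


def summarize(morphemes):
    """Count morpheme tokens and types, and each part of speech's share of types.

    `morphemes` is an iterable of (lemma, pos) pairs, assumed to come
    from a morphological analyzer run beforehand.
    """
    morphemes = list(morphemes)
    tokens = len(morphemes)           # every occurrence counts as one token
    types = set(morphemes)            # distinct (lemma, pos) pairs are the types
    pos_counts = Counter(pos for _, pos in types)
    shares = {pos: n / len(types) for pos, n in pos_counts.items()}
    return tokens, len(types), shares


# Hypothetical toy input for illustration only (not corpus data).
sample = [
    ("私", "名詞"), ("は", "助詞"), ("本", "名詞"), ("を", "助詞"),
    ("読む", "動詞"), ("本", "名詞"), ("を", "助詞"), ("買う", "動詞"),
]

tokens, n_types, shares = summarize(sample)
print(tokens, n_types, shares)
```

In this sketch the share of each part of speech is computed over types rather than tokens, matching the way the abstract reports nouns and verbs as percentages of morpheme types.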