Abstract

The present study investigated how Japanese dialects in the Chubu region are classified, using random forest (RF) models and the adaptive synthetic (ADASYN) sampling approach. We obtained written-format pronunciations of 22 words from 884 Japanese speakers (average age = 71.1) who had been residing in their birthplaces. We calculated the phonetic distance (ALINE distance) between the dialectal and standard Japanese pronunciations and ran RF models with 1,000 bootstrap samples. The results of the RF models (ROC = 0.953, F1 = 0.68) showed that speakers with a predicted probability greater than 50 percent (n = 415) were generally located in their prefectures of residence, suggesting that each prefecture has its own dialect. Speakers with a predicted probability less than 50 percent were located both in their prefectures of residence and in the vicinity of those prefectures, particularly in Aichi, Gifu, Shizuoka, Nagano, Gunma, and Niigata prefectures, suggesting that speakers in these areas share some characteristics of the prefectural dialects. This overlap potentially led to the poor predictions for these speakers. The spread of these prefectural dialects seems to be limited by the Japanese Alps: dialects spoken on the eastern and western sides of the mountain range are generally widespread only on their respective sides.
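
The split at 50 percent predicted probability can be illustrated with a minimal sketch. It assumes that "predicted probability" means the probability a classifier assigns to a speaker's top predicted prefecture, which the abstract does not state explicitly. The function name `split_by_confidence`, the speaker IDs, and all probability values are hypothetical and are not taken from the study's data or code.

```python
from typing import Dict, List, Tuple

Row = Tuple[str, str, float]  # (speaker id, top predicted prefecture, probability)


def split_by_confidence(
    probs: Dict[str, Dict[str, float]],
    threshold: float = 0.5,
) -> Tuple[List[Row], List[Row]]:
    """Partition speakers by the probability of their top predicted prefecture.

    Assumption: the threshold applies to the highest class probability,
    not to the probability of the speaker's actual prefecture.
    """
    confident: List[Row] = []
    ambiguous: List[Row] = []
    for speaker, dist in probs.items():
        # Top class = prefecture with the highest predicted probability.
        prefecture, p = max(dist.items(), key=lambda kv: kv[1])
        row = (speaker, prefecture, p)
        if p > threshold:
            confident.append(row)
        else:
            ambiguous.append(row)
    return confident, ambiguous


# Hypothetical class probabilities for three made-up speakers.
example = {
    "s1": {"Aichi": 0.82, "Gifu": 0.12, "Nagano": 0.06},
    "s2": {"Aichi": 0.41, "Gifu": 0.39, "Nagano": 0.20},
    "s3": {"Aichi": 0.05, "Gifu": 0.15, "Nagano": 0.80},
}
confident, ambiguous = split_by_confidence(example)
```

In this made-up example, speaker `s2` falls below the threshold because the probability mass is shared between two adjacent prefectures, which is the kind of case the abstract attributes to shared dialect characteristics.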

