Abstract

The current study examined whether computer annotations of prosody based on Brazil’s (1997) framework were comparable with human annotations. A series of statistical tests was performed for each prosodic feature: tone unit (two accuracy scores and Pearson’s correlation), prominent syllable (accuracy, F-measure, and Cohen’s kappa), tone choice (accuracy and Fleiss' kappa), and relative pitch (accuracy, Fleiss' kappa, and Pearson’s correlation). For each measure, we treated the inter-rater reliability scores among the three human coders as one population and the inter-rater reliability scores between the computer and each of the three humans as the other population. If a t-test showed a significant difference between these two populations, the computer and human annotations were considered not comparable; if the difference was not significant, they were considered comparable. The results indicated that the computer and human annotations were comparable for tone choice and not comparable for prominent syllable. For tone unit, two of the three t-tests provided evidence that the annotations were comparable and one did not. The relative pitch t-tests showed a significant disparity between the humans' estimates of relative pitch and the computer’s actual relative pitch calculation.
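The two-population comparison described above can be illustrated with a minimal sketch for a single measure, Cohen's kappa on prominent-syllable labels. Everything specific here is an assumption made for illustration: the label sequences are invented toy data, the names `humans`, `computer`, `cohen_kappa`, and `welch_t` are hypothetical, and Welch's unequal-variance t statistic is used only because the abstract does not say which t-test variant the study applied.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, variance


def cohen_kappa(a, b):
    """Cohen's kappa for two equal-length label sequences."""
    n = len(a)
    labels = set(a) | set(b)
    # Observed agreement: share of items on which both coders agree.
    p_o = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement from each coder's marginal label frequencies.
    p_e = sum((a.count(l) / n) * (b.count(l) / n) for l in labels)
    return 1.0 if p_e == 1 else (p_o - p_e) / (1 - p_e)


def welch_t(xs, ys):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    vx = variance(xs) / len(xs)
    vy = variance(ys) / len(ys)
    t = (mean(xs) - mean(ys)) / sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (len(xs) - 1) + vy ** 2 / (len(ys) - 1))
    return t, df


# Invented toy data: 1 = syllable marked prominent, 0 = not prominent.
humans = {
    "h1": [1, 0, 0, 1, 0, 1, 0, 0, 1, 0],
    "h2": [1, 0, 0, 1, 0, 1, 0, 1, 1, 0],
    "h3": [1, 0, 1, 1, 0, 1, 0, 0, 1, 0],
}
computer = [1, 1, 0, 0, 0, 1, 0, 0, 1, 1]

# Population 1: agreement within each pair of human coders.
human_human = [
    cohen_kappa(humans[a], humans[b]) for a, b in combinations(humans, 2)
]
# Population 2: agreement between the computer and each human coder.
computer_human = [cohen_kappa(computer, h) for h in humans.values()]

t_stat, df = welch_t(human_human, computer_human)
print(human_human, computer_human, t_stat, df)
```

To turn the statistic into a decision, `t_stat` would be compared against a critical value of the t distribution at `df` degrees of freedom for the chosen significance level; a statistics package can supply the corresponding p-value directly.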
