Abstract

The objectives of this study are (1) to evaluate tone production in Mandarin-speaking patients with post-stroke dysarthria (PSD) using an artificial neural network (ANN), (2) to compare the recognition performance of the ANN model with that of human listeners and a convolutional neural network (CNN) model, and (3) to explore the rehabilitation application of artificial intelligence recognition for lexical tone production disorder in patients with PSD. The subjects comprised two groups of native Mandarin-speaking adults: 31 patients with PSD and 42 normal-speaking adults (NA) in a similar age range as controls. Each subject was recorded producing a list of 7 Mandarin monosyllables with 4 tones (i.e., a total of 28 tokens). The fundamental frequency (F0) of each monosyllable was extracted using an autocorrelation algorithm. The ANN was trained with the F0 data of the tone tokens from the NA group to generate the final model. For tone production in the NA group, the recognition rates of the human listeners, the ANN model, and the CNN model were 87.78% ± 8.96% (mean ± SD), 89.11% ± 11.80%, and 65.91% ± 8.79%, respectively; for the PSD group, they were 70.28% ± 17.61%, 63.35% ± 17.40%, and 34.71% ± 6.92%, respectively. For the PSD group, there was a significant correlation between the performance of the ANN model and that of the human listeners (r = 0.826, P < 0.001). However, the performance of the CNN model was not correlated with that of the human listeners (r = -0.108, P = 0.562). Thus, the experiments show that the ANN is more objective and efficient and could replace human listeners in the assessment of lexical tone production disorder in Mandarin-speaking patients with PSD. Furthermore, using the ANN may reduce the heterogeneity of rehabilitation evaluation among different speech therapists and may give more accurate feedback on the progress of rehabilitation treatment.

Highlights

  • Mandarin Chinese, which is spoken by the largest population in the world, is a tonal language different from English or other alphabetic languages

  • The results showed that a majority of these children did not produce Mandarin tones very well, and there was evidence indicating that children with cochlear implants (CI) who spoke a tonal language also had remarkable deficits in tone perception [21]–[24]

  • The result of human listeners, whether judged by the artificial neural network (ANN) or by human ears, was visibly larger than that of the normal adults, but this was not the case when judged by the convolutional neural network (CNN)

Summary

Introduction

The associate editor coordinating the review of this manuscript and approving it for publication was Mohamed Elhoseny.

Mandarin Chinese, which is spoken by the largest population in the world, is a tonal language, different from English or other alphabetic languages. Mandarin tones convey lexical meaning through pitch variation patterns, which means that one syllable can have different meanings when it is spoken with different tones. There are four tones in Mandarin: Tone 1, 2, 3, and 4. Mandarin tone patterns are determined by the fundamental frequency (F0) variation of a syllable. Tone 1 has a flat, high F0 contour. The F0 contour of Tone 3 is V-shaped: it falls at the beginning and then rises, with a dip in the middle.
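
As a rough illustration of how an F0 contour of the kind described above can be obtained, the following Python sketch estimates F0 frame by frame from the peak of the autocorrelation. It is not the authors' implementation: the function names, the frame length, the hop size, the 75 to 500 Hz search range, and the synthetic test signal are all assumptions chosen for illustration. Practical pitch trackers usually add windowing, normalization, and a voicing decision.

```python
import math


def estimate_f0(frame, sample_rate, f0_min=75.0, f0_max=500.0):
    """Return the F0 (Hz) of one frame from its autocorrelation peak, or 0.0.

    The search range f0_min..f0_max is an assumed default, not a value
    taken from the study.
    """
    n = len(frame)
    mean = sum(frame) / n
    x = [s - mean for s in frame]  # remove any DC offset
    lag_min = max(1, int(sample_rate / f0_max))      # shortest period searched
    lag_max = min(int(sample_rate / f0_min), n - 1)  # longest period searched
    best_lag, best_value = 0, 0.0
    for lag in range(lag_min, lag_max + 1):
        # Unnormalized autocorrelation of the frame at this lag.
        value = sum(x[i] * x[i + lag] for i in range(n - lag))
        if value > best_value:
            best_lag, best_value = lag, value
    # No positive peak (for example, silence) is reported as 0.0.
    return sample_rate / best_lag if best_lag else 0.0


def f0_contour(signal, sample_rate, frame_len=400, hop=200):
    """Estimate F0 on overlapping frames to trace the contour over time."""
    return [
        estimate_f0(signal[start:start + frame_len], sample_rate)
        for start in range(0, len(signal) - frame_len + 1, hop)
    ]


def synth_tone(f_start, f_end, duration, sample_rate):
    """Hypothetical test signal: a sine whose frequency moves linearly."""
    n = round(duration * sample_rate)
    phase, samples = 0.0, []
    for i in range(n):
        freq = f_start + (f_end - f_start) * i / n
        samples.append(math.sin(phase))
        phase += 2 * math.pi * freq / sample_rate
    return samples


# A level contour (loosely like Tone 1) and a rising one, both synthetic.
flat = f0_contour(synth_tone(200.0, 200.0, 0.4, 8000), 8000)
rising = f0_contour(synth_tone(150.0, 250.0, 0.4, 8000), 8000)
```

On these synthetic signals the first contour stays level while the second ends higher than it starts. A contour like this, sampled over one syllable, is the kind of F0 data a tone classifier would take as input.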
