Hours Of Speech Data Research Articles

Mandarin in Taiwan is notably different from other variants of Mandarin in terms of lexical use and accents. However, from an investment perspective, it remains debated whether general-purpose Mandarin speech recognition (MSR) systems are sufficient for supporting human-computer interaction in Taiwan. To address this question, we established the Formosa (a historical name for Taiwan given by the Portuguese) Speech in the Wild (FSW) (Liao 2018) project to (1) collect large-scale Taiwanese Mandarin speech to boost the development of Taiwanese-specific MSR techniques, and (2) host a Formosa Speech Recognition (FSR) challenge (Liao 2018) to promote the corpus and to evaluate the performance of the available Taiwanese-specific MSR systems. The FSW project has focused on transcribing spontaneous Taiwanese Mandarin speech selected from real-life, multi-genre broadcast radio speech provided by Taiwan’s National Education Radio (2018). We plan to publicly release about 3000 hours of speech data at the end of 2019. FSR-2018 (Liao 2018) was the culmination of FSW’s events in 2018 and featured a Taiwanese broadcast Mandarin speech recognition evaluation campaign using the released corpora. The challenge was also an official activity (Liao 2018) of the 11th International Symposium on Chinese Spoken Language Processing (ISCSLP) [22]. At the end of 2018, the first 4 volumes of the FSW Corpus, NER-Trs-Vol1∼4, a total of 610.2 hours of speech data, were released to support two events: the Formosa Grand Challenge, Talk to AI (FGC) (Ministry of Science And Technology Taiwan 2018) (Dec. 2017 ∼ Mar. 2019) and the FSR-2018 challenge (Liao 2018) (June 2018 ∼ Nov. 2018), which had 147 and 27 participating teams, respectively. For FSR-2018, 30 recognition results on the final-test set were submitted by 16 teams. The evaluation results revealed that the best Taiwanese-specific MSR system achieved an 8.1% Chinese character error rate (CER). For reference, iFlyTek’s (ISCSLP 2018) and Google’s (2018) commercial MSR systems, which were not optimized for this task, achieved CERs of 18.8% and 20.6%, respectively. Taken together, we argue that a Taiwanese-specific MSR system is necessary for improving the performance of Taiwanese Mandarin speech-enabled human-computer interaction.
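The abstract reports results as character error rate (CER) but does not say how the challenge computed it. As a minimal sketch, assuming the common definition (character-level edit distance between reference and hypothesis, divided by the number of reference characters, with whitespace ignored), CER could be computed as follows. The function names `edit_distance` and `cer` and the sample sentences are illustrative assumptions, not part of the FSR-2018 scoring tools.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (unit cost per edit)."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character error rate: edit distance / number of reference characters.

    Assumption: whitespace is dropped before scoring, which is a common
    convention for Chinese text but may differ from the official scoring.
    """
    ref_chars = [c for c in reference if not c.isspace()]
    hyp_chars = [c for c in hypothesis if not c.isspace()]
    if not ref_chars:
        raise ValueError("reference must contain at least one character")
    return edit_distance(ref_chars, hyp_chars) / len(ref_chars)


if __name__ == "__main__":
    # Hypothetical transcripts: one substituted character out of six.
    print(f"{cer('今天天氣很好', '今天天汽很好'):.3f}")
```

Under this definition, CER can exceed 100% when the hypothesis contains many insertions, because the denominator counts only reference characters.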

Background: As agrammatism is, at least partly, an adaptive behaviour, we investigated how compensation strategies manifest themselves in agrammatic performance. Aims: Within a functional theoretical framework of language use, we conducted an in-depth exploration of across-task variability in agrammatic patients' oral production, submitting the morphosyntactic properties of their utterances to quantitative and qualitative analyses. Methods & Procedures: We designed an original data collection protocol comprising four tasks with increasing situational constraints (gradual manipulation of two external factors: use of instructions and presence or absence of pictures), in order to prompt the production of (i) spontaneous speech, (ii) narrative speech, (iii) descriptive speech and, finally, (iv) isolated sentences. We administered the tasks to six French-speaking agrammatic patients and nine normal controls, yielding the equivalent of 9 hours of speech data. We then conducted a multi-level and fine-grained analysis of the agrammatic and control corpora to assess oral production, covering both morphological aspects (open- vs closed-class word distribution, frequency of determiners, verb inflections) and syntactic aspects (sententials, non-sententials, well-formedness). Outcomes & Results: Results revealed across-task variability, suggesting that participants tended to adjust the morphosyntactic aspects of their speech according to task-dependent factors. Moreover, trade-offs were observed between morphosyntactic accuracy and oral fluency (i.e., speech rate), further pointing to the agrammatic patients' ability to gradually vary grammatical accuracy according to task constraints, rather than as a function of the limited online processing resources available to them. Results showed that agrammatic speakers used a variety of strategies to improve or reduce their grammatical accuracy according to task constraints. Conclusions: Agrammatic speakers rely excessively on ellipsis in spontaneous speech, and on corrective or monitoring strategies in elicited speech. Thus, adaptation strategies vary from one task to another, depending on the type of speech to be produced (connected vs disconnected) and monitoring factors (attention allocated to formal encoding). Finally, this study confirms the usefulness of functional and compensation-oriented therapies in aiding recovery from agrammatic aphasia.

Articles published on Hours Of Speech Data

Speech data collection system for KUI, a Low resourced tribal language

Investigating the feasibility of harvesting broadcast speech data to develop resources for South African languages

Formosa Speech in the Wild Corpus for Improving Taiwanese Mandarin Speech-Enabled Human-Computer Interaction

UTDallas-PLTL: Advancing multi-stream speech processing for interaction assessment in peer-led team learning

Across-task variability in agrammatic performance

Speech database development at MIT: TIMIT and beyond
