Compounding words in the syntax can produce phrasal phonology: Evidence from Japanese Aoyagi morphemes
Abstract Various proposals have been made to account for mismatches between syntax and prosody in natural languages. Prosodic prespecification (i.e., prosodic subcategorization) attributes such mismatches to morpheme-specific prosodic requirements (Bennett et al. 2018; Tyler 2019). On the other hand, Hsu (2019) and Revithiadou and Markopoulos (2021) argue that some patterns previously analyzed through subcategorization can instead be captured in Gradient Harmonic Grammar (Smolensky and Goldrick 2016) without a syntax-prosody mismatch. This paper contributes to the ongoing discussion about the syntax-prosody mismatch by addressing ‘Aoyagi prefixes’ in Japanese (e.g., gen ‘current’ in gen-daijin ‘current Minister’). While the ‘word-internal phrase boundary’ associated with Aoyagi morphemes has been attributed to prosodic subcategorization (Poser 1990), I argue that such subcategorization is unnecessary. The key evidence lies in the fact that all Aoyagi morphemes are accented. Vance (2008) and Ito and Mester (2013) independently observe that a prosodic phrase boundary emerges between the first and second elements only when the first element is an accented prosodic word in Japanese. Building on this correlation between accent and prosodic phrasing, I put forward an alternative analysis: first, I propose that Aoyagi morphemes are not prefixes but have the size of a syntactic word (X0), such that the entire Aoyagi construction should be analyzed as a syntactic compound [X0 X0 X0] (Booij 2010). Given this structure, their prosodic behavior follows from an XP-to-φ mapping system (Ito and Mester 2013), where constraints on accent placement play a crucial role in mapping syntactic heads to phonological phrases, overriding Match constraints.
- Research Article
76
- 10.1353/lan.2014.0105
- Dec 1, 2014
- Language
It is a well-known fact that across the world’s languages there is a fairly strong asymmetry in the affixation of grammatical material, in that suffixes considerably outnumber prefixes in typological databases. This article argues that prosody, specifically prosodic phrasing, plays an important part in bringing about this asymmetry. Prosodic word and phrase boundaries may occur after a clitic function word preceding its lexical host with sufficient frequency so as to impede the fusion required for affixhood. Conversely, prosodic boundaries rarely, if ever, occur between a lexical host and a clitic function word following it. Hence, prosody does not impede the fusion process between lexical hosts and postposed function words, which therefore become affixes more easily. Evidence for the asymmetry in prosodic phrasing is provided from two sources: disfluencies, and ditropic cliticization, that is, the fact that grammatical proclitics may be phonological enclitics (i.e. phrased with a preceding host), but grammatical enclitics are never phonological proclitics. Earlier explanations for the suffixing preference have neglected prosody almost completely and thus also missed the related asymmetry in ditropic cliticization. More importantly, the evidence from prosodic phrasing suggests a new venue for explaining the suffixing preference. The asymmetry in prosodic phrasing, which, according to the hypothesis proposed here, is a major factor underlying the suffixing preference, has a natural basis in the mechanics of turn-taking as well as in the mechanics of speech production.
- Research Article
- 10.13064/ksss.2013.5.1.099
- Mar 31, 2013
- Phonetics and Speech Sciences
Previous laboratory studies have shown that prosodic structures are encoded in the modulations of phonetic patterns of speech including suprasegmental as well as segmental features. In particular, effects of prosodic context on duration and intensity of syllables and words have been widely reported. Drawing on prosodically annotated large-scale speech data from the Buckeye corpus of conversational speech of American English, the current study attempted to examine whether and how prosodic prominence and phrase boundary of everyday conversational speech, as determined by a large group of ordinary listeners, are related to the phonetic realization of duration and intensity. The results showed that the patterns of word durations and intensities are influenced by prosodic structure. Closer examinations revealed, however, that the effects of prosodic prominence are not the same as those of prosodic phrase boundary. With regard to intensity measures, the results revealed the systematic changes in the patterns of overall RMS intensity near prosodic phrase boundary but the prominence effects are restricted to the nucleus. In terms of duration measures, both prosodic prominence and phrase boundary are the most closely related to the lengthening of the nucleus. Yet, prosodic prominence is more closely related to the lengthening of the onset while phrase boundary lengthens the coda duration more. The findings from the current study suggest that the phonetic realizations of prosodic prominence are different from those of prosodic phrase boundary, and speakers signal different prosodic structures through deliberate modulations of the internal phonetic structure of words and listeners attend to such phonetic variations.
- Research Article
13
- 10.1016/j.neuroscience.2010.03.069
- Apr 9, 2010
- Neuroscience
Perception of Chinese poem and its electrophysiological effects
- Conference Article
- 10.1109/icsda.2013.6709909
- Nov 1, 2013
This paper describes a technique for detecting prosodic word and phrase boundaries in Bangla read-out speech, based on Empirical Mode Decomposition (EMD) of the F0 contour. In this method, the F0 contour of the sentence is extracted using the open-source software “Praat”, and a continuous F0 contour is then generated by interpolation. EMD operates on the continuous logarithmic F0 contour to decompose it into a set of Intrinsic Mode Function (IMF) components. The sum of the DC component and the component before DC gives information about the global variation, or phrase component. It is observed that the IMF with the most energy reflects the accent component, or local variation. In total, 150 Bangla read-out sentences are analyzed in this study, comprising 724 lexical words grouped into 526 prosodic words, of which 137 contain two lexical (syntactic) words and 31 contain three. The correct detection of prosodic word boundary from the onset time is within 0.091 ms with 6% insertion errors. The results of the EMD analysis are then compared with Bangla grammar and the Fujisaki model, and the agreement is satisfactory. With these word and phrase boundaries, the method may in future support analysis and synthesis of the F0 contour using Fujisaki model parameters.
- Research Article
1
- 10.1155/2022/6438843
- May 14, 2022
- Wireless Communications and Mobile Computing
Nonnative Mandarin speakers often produce unnatural pauses when speaking Mandarin because of their native pronunciation habits. Accurately predicting the prosodic structure of Chinese sentences is the key to improving fluency in Mandarin for nonnative speakers. This paper investigated the influence of the Chinese prosodic boundaries on the Mandarin spoken by international students. First, we proposed a new method to predict the prosodic word and prosodic phrase boundaries from Chinese sentences to obtain the prosodic boundaries automatically. Then, we used the predicted results to improve the Mandarin spoken fluency of international students. To train the prosodic boundary prediction model, we first constructed a Chinese prosodic boundary corpus that includes 100,000 Chinese sentences with manually labeled prosodic boundaries under the guidance of a linguist. We also proposed an end-to-end Chinese prosodic boundary prediction model based on the sequence-to-sequence model with a new feature named number of syntax hierarchy (NSH). Finally, we assessed the fluency score of Mandarin using 1300 utterances recorded by six international students and a native Mandarin speaker. The utterances were recorded both without and with the predicted prosodic boundaries. The experimental results show that the F1 scores of the prosodic word prediction model and the prosodic phrase prediction model are 98.14% and 85.24%, respectively. The fluency assessment results show that the fluency score for utterances read with the labeled prosodic boundaries is higher than the fluency score of the international students when they read freely. The improvement of the score is between 7 and 16. Therefore, our method can be applied to the Mandarin education system to improve the spoken Mandarin fluency of nonnative speakers.
- Conference Article
- 10.21437/interspeech.2011-456
- Aug 27, 2011
We proposed an automatic method for determining the boundaries of prosodic phrases in real speech waves. In this method, the dynamic programming (DP) and the least mean square error (LMSE) methods were implemented based on the F0 generation model. In order to evaluate the accuracy and validity of this proposed method, a set of 973 standard Chinese speech sentences was selected. The cumulative proportion of the estimated prosodic phrase boundaries approached 76% when ET(0i) was less than the average duration of the prosodic phrases. Thus, it can be concluded that this proposed method can be used in practical applications. Index Terms: Fujisaki model, prosodic hierarchy, prosodic phrase, prosodic word, dynamic programming, least mean square error, Standard Chinese
- Conference Article
2
- 10.1109/snpd.2015.7176201
- Jun 1, 2015
Prosodic structure contributes to speech production and comprehension. One of the crucial problems in achieving natural-sounding synthesized speech is the prediction of appropriate phrase boundaries. Unfortunately, obtaining human annotations of prosodic phrases to train a supervised system can be laborious and costly. Active learning has been proven effective in reducing labeling efforts for supervised learning. This study explores active learning techniques with the objective to reduce the amount of human-annotated data needed to attain a given level of performance. It presents an approach based on active learning to predict the Chinese prosodic phrase boundaries in unrestricted Chinese text. Experiments show that for most of the cases considered, the active selection strategies for labeling the prosodic phrase boundaries are as good as or exceed the performance of random data selection.
- Research Article
46
- 10.1016/j.neuroscience.2008.10.065
- Nov 24, 2008
- Neuroscience
Perception of prosodic hierarchical boundaries in Mandarin Chinese sentences
- Research Article
12
- 10.1371/journal.pone.0155300
- May 18, 2016
- PLOS ONE
The processing of prosodic phrase boundaries in language is immediately reflected by a specific event-related potential component called the Closure Positive Shift (CPS). A component somewhat reminiscent of the CPS in language has also been reported for musical phrases (i.e., the so-called ‘music CPS’). However, in previous studies the quantification of the music-CPS as well as its morphology and timing differed substantially from the characteristics of the language-CPS. Therefore, the degree of correspondence between cognitive mechanisms of phrasing in music and in language has remained questionable. Here, we probed the shared nature of mechanisms underlying musical and prosodic phrasing by (1) investigating whether the music-CPS is present at phrase boundary positions where the language-CPS has been originally reported (i.e., at the onset of the pause between phrases), and (2) comparing the CPS in music and in language in non-musicians and professional musicians. For the first time, we report a positive shift at the onset of musical phrase boundaries that strongly resembles the language-CPS and argue that the post-boundary ‘music-CPS’ of previous studies may be an entirely distinct ERP component. Moreover, the language-CPS in musicians was found to be less prominent than in non-musicians, suggesting more efficient processing of prosodic phrases in language as a result of higher musical expertise.
- Research Article
38
- 10.1016/s1007-0214(08)70080-4
- Jul 25, 2008
- Tsinghua Science and Technology
Pause or No Pause?—Prosodic Phrase Boundaries Revisited
- Research Article
- 10.5842/62-0-891
- Aug 1, 2021
- Stellenbosch Papers in Linguistics Plus
This study investigates the segmental lengthening patterns resulting from prosodic boundaries in Tswana, a Southern Bantu language. The aim is to shed light on the interaction between Penultimate Lengthening and Final Lengthening, providing the first quantitative investigation of these phenomena in Tswana. We conducted a production experiment that applies a widely tested design to elicit production data of two different phrasal structures in coordinated noun phrases. The results suggest that Penultimate Lengthening and Final Lengthening constitute independent mechanisms, which both apply in Tswana. Penultimate Lengthening occurs before prosodic phrase boundaries as well as before word boundaries, yet at differing degrees. Before phrase boundaries, it involves a strong lengthening effect on the vowel of the penultimate syllable. Before word boundaries, the amount of lengthening is smaller. Final lengthening operates on the final syllable before a phrase boundary, involving a larger amount on the final vowel than on the preceding consonant. This pattern is in line with the pattern observed in other languages. The amount of lengthening on the final vowel is comparable to the amount on the penultimate vowel. Given that a large increase of lengthening on the penultimate syllable has not been observed in connection with Final Lengthening, we assume that Penultimate Lengthening constitutes a language-specific mechanism that applies independently. Final Lengthening, on the other hand, might be a universal phenomenon. The perceptual salience of Penultimate Lengthening, which has been widely reported in the literature for Bantu languages, might have to do with the dynamics within the lengthening domains, namely that the lengthening in penultimate position is abrupt and relatively stronger than in final position when compared to the preceding syllable.
- Conference Article
1
- 10.1109/sped.2015.7343090
- Oct 1, 2015
This paper proposes a framework for developing an automatic annotation tool of Romanian prosody for spontaneous and read speech, together with a set of acoustic cues at the prosodic word level that are necessary to accurately discriminate the prosodic phrases. Even though many approaches have considered the silent pause an important acoustic cue in the automatic detection of prosodic phrase boundaries, our research results show that listeners perceive prosodic boundaries mainly through F0 reset and tonal contrasts between adjacent words. Silent pauses in spontaneous conversational and read speech help to locate the prosodic boundaries only when they are accompanied by F0 and energy cues. To discriminate the prosodic phrases, we extracted the following acoustic features for each prosodic word: minimum, mean, maximum, standard deviation and regression of F0 and energy. Using these acoustic features led to 90% accuracy in prosodic phrase discrimination.
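As a rough illustration of the per-word features listed above, the sketch below computes the minimum, mean, maximum, standard deviation and a least-squares regression slope for the F0 track of one prosodic word. The function names `regression_slope` and `word_features`, the sample values, the use of frame index as the regression axis, and the choice of population standard deviation are all assumptions made for illustration, not details taken from the paper. Energy could be summarized the same way.

```python
from statistics import mean, pstdev


def regression_slope(values):
    """Least-squares slope of values against their frame index (0, 1, 2, ...)."""
    n = len(values)
    if n < 2:
        return 0.0  # a single frame has no trend
    x_mean = (n - 1) / 2
    y_mean = mean(values)
    num = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den


def word_features(f0):
    """Summary statistics for one prosodic word's voiced F0 frames (Hz)."""
    if not f0:
        raise ValueError("expected at least one voiced F0 frame")
    return {
        "min": min(f0),
        "mean": mean(f0),
        "max": max(f0),
        "std": pstdev(f0),  # population standard deviation (an assumption)
        "slope": regression_slope(f0),  # Hz per frame
    }


# Hypothetical F0 values for a word with steadily rising pitch.
features = word_features([120.0, 130.0, 140.0, 150.0])
```

A classifier for phrase discrimination would then take one such feature dictionary per prosodic word as input; which classifier the authors used is not stated in the abstract.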
- Conference Article
4
- 10.1109/itnec48623.2020.9084900
- May 5, 2020
The prediction of the prosodic structure of sentences is the key to improving the naturalness of Mandarin speech synthesis. In this paper, we proposed a sequence-to-sequence (seq2seq) model-based method to improve the predictive accuracy of prosodic boundaries in Chinese sentences. A large-scale text corpus of 100,000 Chinese sentences was collected and manually labelled with part-of-speech tags and the boundaries of prosodic words and prosodic phrases under the guidance of a linguistic expert. By analyzing the text corpus, shallow features such as part-of-speech, word length and word embedding were selected as the input features of the seq2seq model. At the same time, a new deep feature named syntactic hierarchical number (SHN), which describes the relationship between syntactic structure and prosodic structure, is proposed to predict the boundaries of prosodic phrases. Finally, we obtain the seq2seq model by training on the labelled text corpus to predict the boundaries of prosodic words and prosodic phrases. The experimental results show that the seq2seq model achieves an F1-score of 97.15% in prosodic word and 82.98% in prosodic phrase boundary prediction. Compared to the other models, our proposed method is more effective in predicting prosodic structure and can be applied to the front-end of speech synthesis.
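The F1-scores reported above combine precision and recall over predicted boundaries. A minimal sketch of that computation follows, assuming each boundary is represented by the index of the token it follows. This representation, the function name `boundary_f1`, and the example values are hypothetical, and the paper may score boundaries differently.

```python
def boundary_f1(predicted, gold):
    """F1 between predicted and gold boundary positions (token indices)."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0  # also covers empty predictions or empty gold
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical sentence: the model finds three of four gold boundaries
# and predicts no spurious ones (precision 1.0, recall 0.75).
score = boundary_f1({2, 5, 9}, {2, 5, 7, 9})
```

Scoring prosodic word and prosodic phrase boundaries separately, as the abstract does, would mean calling the function once per boundary type.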
- Conference Article
12
- 10.1109/icassp.2013.6639316
- May 1, 2013
We describe models of prosodic phrasing trained on multiple languages to identify boundaries in an unseen language. Our goal is to create models from High Resource languages, in which hand-annotated prosodic phrase boundaries are available, to use in identifying boundaries in a Low Resource language, with little or no training material. We train models on American English, Italian, Mandarin, and German and test on each of these languages. We find that, while pause is the most important feature for phrase boundary prediction in all languages examined, the role of pause in boundary identification varies by annotator and the relative importance of other features varies significantly by language. We also find that different acoustic correlates of prosodic boundaries characterize different languages. In some, the relative importance of features is silence > pitch > intensity > duration, while for other languages intensity is more important than pitch. These differences do not appear to be attributable to language family, since, e.g. English and German display different patterns.
- Research Article
8
- 10.1145/1349332.1349337
- Sep 1, 2007
- XRDS: Crossroads, The ACM Magazine for Students
Prosodic phrasing is the means by which speakers of any given language break up an utterance into meaningful chunks. The term "prosody" itself refers to the tune or intonation of an utterance, and therefore prosodic phrases literally signal the end of one tune and the beginning of another. This study uses phrase break annotations in the Aix-MARSEC corpus of spoken English as a "gold standard" for measuring the degree of correspondence between prosodic phrases and the discrete syntactic grouping of prepositional phrases, where the latter is defined via a chunk parsing rule using nltk_lite's regular expression chunk parser. A three-way comparison is also introduced between the "gold standard" chunk parsing rule and human judgment in the form of intuitive predictions about phrasing. Results show that even with a discrete syntactic grouping and a small sample of text, problems may arise for this rule-based method due to uncategorical behavior in parts of speech. Lack of correspondence between intuitive prosodic phrases and corpus annotations highlights the optional nature of certain boundary types. Finally, there are clear indications, supported by corpus annotations, that significant prosodic phrase boundaries occur within sentences and not just at full stops.