Abstract

The low frequency (LF) spectral analysis or ‘rhythm spectrum’ approach to the quantitative analysis and comparison of speech rhythms is extended beyond syllable or word rhythms to ‘rhetorical rhythms’ in read-aloud narratives, in a selection of exploratory scenarios, with the aim of developing a unified theory of speech rhythms. Current methodologies in the field are first discussed, then the choice of data is motivated and the modulation-theoretic rhythm spectrum and rhythm spectrogram approach is applied to the amplitude modulation (AM) and frequency modulation (FM) of speech. New concepts of rhythm formant, rhythm spectrogram and rhythm formant trajectory are introduced in the Rhythm Formant Theory (RFT) framework with its associated methodology Rhythm Formant Analysis (RFA) in order to capture second order regularities in the temporal variation of rhythms. The interaction of AM and FM rhythm factors is explored, contrasting English with Mandarin Chinese. The LF rhythm spectrogram is introduced in order to recover temporal information about long-term rhythms, and to investigate the configurative function of rhythm. The trajectory of highest magnitude frequencies through the component spectra of the LF spectrogram is extracted and applied in classifying readings in different languages and individual speaking styles using distance-based hierarchical clustering, and the existence of long-term second order ‘rhythms of rhythm’ in long narratives is shown. In the conclusion, pointers are given to the extension of this exploratory RFT rhythm approach for future quantitative confirmatory investigations.
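The AM side of the approach described in the abstract can be illustrated with a minimal Python sketch. It is not the RFA tool itself: the function names, the use of full-wave rectification as a crude envelope detector, the Hann window, the 10 Hz upper bound and the window and hop lengths are all assumptions made for illustration. The first function computes an LF spectrum of the amplitude envelope; the second slides a window over the signal and records the highest magnitude frequency in each window, giving a simple trajectory through a stack of LF spectra.

```python
import numpy as np

def am_rhythm_spectrum(signal, sample_rate, max_freq=10.0):
    """LF magnitude spectrum of the amplitude envelope (hypothetical helper)."""
    # Assumption: full-wave rectification stands in for AM demodulation.
    envelope = np.abs(np.asarray(signal, dtype=float))
    envelope = envelope - envelope.mean()  # remove DC so 0 Hz does not dominate
    windowed = envelope * np.hanning(len(envelope))
    magnitudes = np.abs(np.fft.rfft(windowed))
    freqs = np.fft.rfftfreq(len(envelope), d=1.0 / sample_rate)
    keep = freqs <= max_freq  # assumption: 'LF' means at most max_freq Hz
    return freqs[keep], magnitudes[keep]

def peak_trajectory(signal, sample_rate, window_s=2.0, hop_s=1.0, max_freq=10.0):
    """Highest magnitude LF frequency in each sliding window (hypothetical helper)."""
    win = int(window_s * sample_rate)
    hop = int(hop_s * sample_rate)
    peaks = []
    for start in range(0, len(signal) - win + 1, hop):
        freqs, mags = am_rhythm_spectrum(signal[start:start + win], sample_rate, max_freq)
        peaks.append(freqs[np.argmax(mags)])
    return np.array(peaks)

# Synthetic test signal: a 200 Hz carrier whose amplitude is modulated at 4 Hz.
sample_rate = 2000
t = np.arange(0, 8.0, 1.0 / sample_rate)
synthetic = (1.0 + 0.8 * np.cos(2 * np.pi * 4.0 * t)) * np.sin(2 * np.pi * 200.0 * t)

freqs, mags = am_rhythm_spectrum(synthetic, sample_rate)
dominant = freqs[np.argmax(mags)]
trajectory = peak_trajectory(synthetic, sample_rate)
```

On real speech the per-window peak vectors could then be compared with a distance measure and clustered hierarchically, as the abstract describes; that step is omitted here.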

Highlights

  • A standard procedure in many quantitative phonetic analyses has been the top–down manual or automatic ANNOTATION METHOD of signal–symbol interface analysis by measuring and recording the ALIGNMENT of linguistically defined event tokens with points or intervals in the speech signal, recorded as timestamps paired with transcriptions

  • There are many F0 estimation (‘pitch’ tracking) algorithms; the algorithm used in the present study is a time-domain AVERAGE MAGNITUDE DIFFERENCE FUNCTION (AMDF) algorithm (Krause 1984), related to autocorrelation, which searches for regular period durations and converts them to frequencies

  • The Speech Modulation Scale is based on a modification of ideas from Cowell’s (1930) classic theory of harmonic relations in musicology and shows the places of both high frequency (HF) phone formants and low frequency (LF) rhythm formants, as well as the carrier wave, quite straightforwardly on a logarithmic scale of the frequency ranges used in human speech
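The AMDF idea mentioned in the highlights (search for a regular period duration, then convert it to a frequency) can be sketched as follows. This is a generic textbook formulation for a single voiced frame, not the Krause (1984) algorithm or the implementation used in the study; the function name, the search range parameters and the test signal are assumptions.

```python
import numpy as np

def amdf_f0(frame, sample_rate, f0_min=50.0, f0_max=500.0):
    """Estimate F0 of one voiced frame with a plain AMDF (hypothetical helper)."""
    frame = np.asarray(frame, dtype=float)
    lag_min = max(1, int(sample_rate / f0_max))  # shortest candidate period in samples
    lag_max = int(sample_rate / f0_min)          # longest candidate period in samples
    if lag_max >= len(frame):
        raise ValueError("frame is too short for the requested f0_min")
    lags = np.arange(lag_min, lag_max + 1)
    # AMDF: mean absolute difference between the frame and a copy shifted by each lag.
    amdf = np.array([np.mean(np.abs(frame[lag:] - frame[:-lag])) for lag in lags])
    best_lag = lags[np.argmin(amdf)]  # the deepest valley marks the period
    return sample_rate / best_lag     # convert period duration to frequency

# Synthetic voiced frame: 50 ms of a 200 Hz sine sampled at 16 kHz.
sample_rate = 16000
t = np.arange(0, 0.05, 1.0 / sample_rate)
frame = np.sin(2 * np.pi * 200.0 * t)
f0 = amdf_f0(frame, sample_rate, f0_min=150.0, f0_max=400.0)
```

The sketch has no voicing decision and no guard against octave errors: for a strictly periodic frame, every multiple of the true period is an equally deep valley, so the search range in the example is kept narrower than one octave below the expected F0.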


Summary

Time domain and frequency domain methods

Speech rhythms have been a field of enquiry since antiquity, yet there are still many open questions. The general objective is to provide a unified approach to describing and comparing rhythms in different prosodic domains. This understanding of speech rhythms is close to the common-sense understanding of rhythm as regularly occurring waves and beats. The intent of the study is to explore the potential of RFT and RFA in developing an integrated framework for rhythm analysis at this stage, rather than to deploy ad hoc method combinations in a more conservative confirmatory approach.

Basics of RFT
Linguistic–phonetic interface analysis in the time domain
Frequency domain and time domain approximation approaches
Method
The modulation-theoretic framework and RFA methodology
RFA procedure
Downsampling
FM demodulation
LF spectrum analysis
LF rhythm formant identification
LF spectrogram analysis
Rhythm formant trajectory identification
Hierarchical speech variety classification
Main innovations of the procedure
Terminology
RFA spectral analysis tool
English stress–pitch accent sequences
Mandarin Chinese lexical tone sequences
Morphophonology and rhythm: A hypothesis
Testing the hypothesis
Rhythm formant vector extraction
Distance based hierarchical clustering of AM rhythm formant frequency vectors
AM Rhythm formant frequency vector results
Rhythm formant trajectories and the ‘rhythms of rhythm’
English and German
Findings
Size and inhomogeneity of the data: A ‘stress test’