Synchronising Speech Segments with Musical Beats in Mandarin and English Singing

Cong Zhang, Jian Zhu

doi:10.21437/interspeech.2021-1841

Abstract

Generating synthesised singing voice with models trained on speech data has many advantages due to the models' flexibility and controllability. However, since information about the temporal relationship between segments and beats is lacking in speech training data, the synthesised singing may sound off-beat at times. Therefore, the availability of information on the temporal relationship between speech segments and musical beats is crucial. The current study investigated segment-beat synchronisation in singing data, with hypotheses formed based on the linguistic theories of P-centre and sonority hierarchy. A Mandarin corpus and an English corpus of professional singing data were manually annotated and analysed. The results showed that the presence of musical beats was more dependent on segment duration than on sonority. However, the sonority hierarchy and the P-centre theory were highly related to the location of beats. Mandarin and English demonstrated cross-linguistic variations despite exhibiting common patterns.

Similar Papers

Where does the beat fall? Speech-beat alignment in Mandarin and English singing
Cong Zhang ... Charlotte A Slocombe
The Journal of the Acoustical Society of America | VOL. 148
01 Oct 2020

Co-channel speech separation using state-space reconstruction and sinusoidal modelling
Yasser Mahgoub
04 Oct 2018

Unsupervised speaker segmentation of multi-speaker speech data
Allen Louis Gorin
The Journal of the Acoustical Society of America | VOL. 125
01 Jan 2009

Acoustic and Textual Data Augmentation for Improved ASR of Code-Switching Speech
Emre Yılmaz ... Henk Van Den Heuvel
02 Sep 2018
