Abstract
Speech is segmented into intonational units marked by prosodic boundaries. This segmentation is claimed to have important consequences for syntax, information structure, and cognition. This work aims both to investigate the phonetic-acoustic parameters that guide the production and perception of prosodic boundaries and to develop models for the automatic detection of prosodic boundaries in male monological spontaneous speech of Brazilian Portuguese. Two samples were segmented into intonational units by two groups of trained annotators. The boundaries perceived by the annotators were tagged as either terminal or non-terminal. A script was used to extract 111 phonetic-acoustic parameters from the speech signal in left and right windows around the boundary of each phonological word. The extracted parameters comprise measures of (1) speech rate and rhythm; (2) standardized segment duration; (3) fundamental frequency; (4) intensity; and (5) silent pause. The script treats as prosodic boundaries those positions at which at least 50% of the annotators indicated a boundary of the same type. Models composed of the parameters extracted by the script were trained and then improved heuristically. The models were developed from each of the two samples and from the whole data set, using both non-balanced and balanced data. The Linear Discriminant Analysis algorithm was adopted to produce the models. The models for terminal boundaries perform much better than those for non-terminal ones. In this paper we (i) present the methodological procedures; (ii) analyze the different models; and (iii) discuss some strategies that could improve our results.
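The 50% agreement criterion described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' script: the function name `consensus_label`, the label strings "TB" and "NTB", and the use of `None` for "no boundary marked" are assumptions made for illustration. The abstract does not say how an exact 50/50 split between the two boundary types is resolved, so the sketch treats that case as no consensus.

```python
from collections import Counter


def consensus_label(annotations, threshold=0.5):
    """Return the boundary type agreed on by at least `threshold` of annotators.

    `annotations` holds one entry per annotator for a single word boundary:
    "TB", "NTB", or None when the annotator marked no boundary
    (hypothetical encoding, not taken from the source).
    """
    total = len(annotations)
    if total == 0:
        return None
    # Count only actual boundary marks; None still counts in the denominator.
    counts = Counter(label for label in annotations if label is not None)
    winners = [label for label, n in counts.items() if n / total >= threshold]
    # Assumption: an exact tie between two types yields no consensus.
    return winners[0] if len(winners) == 1 else None


if __name__ == "__main__":
    # Two of four annotators marked a terminal boundary: threshold reached.
    print(consensus_label(["TB", "TB", "NTB", None]))
```

With four annotators, two matching marks are enough to reach the threshold, while a single mark is not.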
Highlights
Speech is prosodically segmented into intonation units determined by prosodic boundaries
This work aims to investigate the acoustic-phonetic parameters that are involved in the production and guide the perception of prosodic boundaries, based on the hypothesis that they can initially be divided between two macrotypes: boundaries marking conclusion (TB) and boundaries marking continuation (NTB)
The independent variables of all models were selected heuristically, starting from the output of the BreakDescriptor software and aiming for the best recognition with the fewest measurements and false alarms. This means that we had to decide which model attained the best balance among these three goals. These were our steps so far:
1. we developed models for detecting terminal boundaries (TB) and non-terminal boundaries (NTB) using non-balanced data [10] extracted from sample I;
2. we validated these models on the data of sample II and on the full data;
3. we developed models from the non-balanced full data;
4. we developed models with balanced data from each sample and from the full data, and applied them to the different samples and to the full data.
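The text does not say how the balanced data sets were built. One common approach is random undersampling of the larger class, sketched below in Python as an assumption rather than as the authors' procedure. The function name `undersample`, the `label` key, and the row layout are hypothetical.

```python
import random


def undersample(rows, label_key="label", seed=0):
    """Balance classes by randomly dropping rows from the larger classes.

    `rows` is a list of dicts, one per candidate boundary position, each
    carrying a class label under `label_key` (hypothetical layout).
    """
    if not rows:
        return []
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    by_class = {}
    for row in rows:
        by_class.setdefault(row[label_key], []).append(row)
    # Every class is cut down to the size of the smallest one.
    n = min(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        balanced.extend(rng.sample(group, n))
    rng.shuffle(balanced)
    return balanced


if __name__ == "__main__":
    # Invented data: 10 boundary positions against 90 non-boundary positions.
    data = [{"id": i, "label": "TB"} for i in range(10)]
    data += [{"id": i, "label": "none"} for i in range(10, 100)]
    print(len(undersample(data)))
```

Undersampling discards data from the majority class, so a model trained this way sees fewer non-boundary positions than occur in running speech; this is one reason to compare balanced and non-balanced models, as the steps above do.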
Summary
Speech is prosodically segmented into intonation units determined by prosodic boundaries. This work aims to investigate the acoustic-phonetic parameters that are involved in the production and guide the perception of prosodic boundaries, based on the hypothesis that they can initially be divided between two macrotypes: boundaries marking conclusion (TB) and boundaries marking continuation (NTB). It also aims to develop automatic models for detecting prosodic boundaries in Brazilian Portuguese spontaneous speech. The models shown here consider two related criteria: the acoustic-phonetic parameters automatically extracted from the sound signal and the TB and NTB boundaries perceived by trained annotators. This means that human perception is assumed to be the goal that the model should reflect. The work analyzes these results and proposes possible strategies for future steps.
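To make concrete how acoustic measurements and annotator labels combine in a discriminant model, here is a minimal two-class Fisher linear discriminant in plain Python. It is a sketch under stated assumptions, not the models from the paper: it uses only two features instead of the parameters selected by the authors, assumes equal class priors, and the feature values (pause duration in seconds, intensity in dB) are invented for illustration.

```python
def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]


def mean_vec(points):
    n = len(points)
    return [sum(p[j] for p in points) / n for j in range(2)]


def scatter(points, mu):
    """Within-class scatter matrix (2x2) around the class mean."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for p in points:
        d = [p[0] - mu[0], p[1] - mu[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def fit_lda_2d(class0, class1):
    """Return (w, c): projection vector and decision threshold."""
    mu0, mu1 = mean_vec(class0), mean_vec(class1)
    s0, s1 = scatter(class0, mu0), scatter(class1, mu1)
    sw = [[s0[i][j] + s1[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    if det == 0:
        raise ValueError("pooled scatter matrix is singular")
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    diff = [mu1[0] - mu0[0], mu1[1] - mu0[1]]
    w = [dot(inv[0], diff), dot(inv[1], diff)]  # w = Sw^-1 (mu1 - mu0)
    # Assumption: equal priors, so the threshold is the projected midpoint.
    c = 0.5 * (dot(w, mu0) + dot(w, mu1))
    return w, c


def predict(w, c, x):
    """1 = boundary, 0 = no boundary."""
    return 1 if dot(w, x) > c else 0


# Invented values: (silent pause duration in s, intensity in dB).
NO_BOUNDARY = [(0.0, 60.0), (0.05, 62.0), (0.0, 61.0), (0.1, 59.0)]
BOUNDARY = [(0.5, 55.0), (0.7, 52.0), (0.4, 54.0), (0.6, 56.0)]

if __name__ == "__main__":
    w, c = fit_lda_2d(NO_BOUNDARY, BOUNDARY)
    print(predict(w, c, (0.55, 54.0)))
```

In this setup the annotators' consensus labels define the two classes, and the fitted weights show how much each acoustic measurement contributes to separating them.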