The Swedish cue word men ‘but’ can mark boundaries both between different topic units and between topic-internal units in spontaneous speech. The goal of this study is to determine whether these two functions of men can be distinguished on the basis of their local prosodic correlates and co-occurring lexical items. Men-tokens in spontaneous narrations were first labelled according to their function using text-only data. The ‘strong’ tokens (those categorized identically by all labellers) were then found to be clearly differentiated into two classes on the basis of related prosodic parameters and co-occurring lexical items. This distinction was not found, however, for the corresponding ‘weak’ tokens, which were subsequently relabelled using both text and speech, nor for the database as a whole. A neural network trained on strong tokens correctly categorized 90% of the strong men-tokens according to their associated boundary type (topic-shift vs. topic-internal). The results show that cue words, together with their prosodic correlates and co-occurring lexical items, constitute a constellation of important information for understanding how the segmentation of spoken discourse is produced and understood.