Abstract

In this study, we investigate the process of generating single-sentence representations for Dialogue Act (DA) classification, including several aspects of text pre-processing and input representation that are often overlooked or underreported in the literature, such as the number of words to keep in the vocabulary or input sequences. We assess each of these on two DA-labelled corpora, using a range of supervised models representing those most frequently applied to the task. Additionally, we compare context-free word embedding models with transfer learning via pre-trained language models, including several based on the transformer architecture, such as Bidirectional Encoder Representations from Transformers (BERT) and XLNet, which have thus far not been widely explored for the DA classification task. Our findings indicate that these text pre-processing considerations do have a statistically significant effect on classification accuracy. Notably, we found that viable input sequence lengths and vocabulary sizes can be much smaller than is typically used in DA classification experiments, yielding no significant improvements beyond certain thresholds. We also show that in some cases the contextual sentence representations generated by language models do not reliably outperform supervised methods, though BERT and its derivative models do represent a significant improvement over supervised approaches and much of the previous work on DA classification.

Highlights

  • The concept of a Dialogue Act (DA) originated from John Austin’s ‘illocutionary act’ theory (Austin 1962) and was later developed by John Searle (1969), as a method of defining the semantic content and communicative function of a single utterance of dialogue

  • Sentence encoding and DA classification: we describe the various components of the sentence encoding process, with respect to the DA classification task, and provide details of each aspect investigated in this work

  • This work explores numerous factors that may affect the task of sentence encoding for the purpose of DA classification


Introduction

The concept of a Dialogue Act (DA) originated from John Austin’s ‘illocutionary act’ theory (Austin 1962) and was later developed by John Searle (1969) as a method of defining the semantic content and communicative function of a single utterance of dialogue. For example, Speaker A may use ‘Okay’ as confirmation that a response has been heard and understood. As such, it may be difficult, or impossible, to determine the communicative intent of a single dialogue utterance in isolation, and including contextual information results in superior performance over single-sentence approaches. The Switchboard corpus contains ~22,000 unique words (the exact figure varies depending on certain pre-processing decisions; see Section 3.1), yet different studies have elected to use vocabulary sizes in the range of 10,000 to 20,000 words (Ji et al. 2016; Lee and Dernoncourt 2016; Kumar et al. 2017; Li et al. 2018; Chen et al. 2018), while Wan et al. (2018) kept only words that appeared more than once within the corpus. We evaluate the impact of different sequence lengths on classification results (see Section 4.1.3).
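The two pre-processing decisions discussed above, capping the vocabulary size and fixing the input sequence length, can be sketched as follows. This is a minimal illustration, not the authors' actual pipeline; the function names (`build_vocab`, `encode`) and the `<pad>`/`<unk>` token conventions are our own assumptions.

```python
from collections import Counter

def build_vocab(utterances, max_vocab=10000):
    """Keep only the max_vocab most frequent words; all others map to <unk>."""
    counts = Counter(w for utt in utterances for w in utt.split())
    vocab = {"<pad>": 0, "<unk>": 1}
    for word, _ in counts.most_common(max_vocab):
        vocab[word] = len(vocab)
    return vocab

def encode(utterance, vocab, max_len=25):
    """Map an utterance to a fixed-length index sequence (truncate or pad)."""
    ids = [vocab.get(w, vocab["<unk>"]) for w in utterance.split()]
    ids = ids[:max_len]                              # truncate long utterances
    return ids + [vocab["<pad>"]] * (max_len - len(ids))  # pad short ones

utts = ["okay", "i see what you mean", "okay thanks"]
v = build_vocab(utts, max_vocab=5)
print(encode("okay thanks a lot", v, max_len=6))  # out-of-vocab words become <unk>
```

Shrinking `max_vocab` trades rare-word coverage for a smaller embedding matrix, and shrinking `max_len` discards utterance tails; the experiments described here measure where those losses stop affecting classification accuracy.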

