Abstract

In this study, we investigate the process of generating single-sentence representations for Dialogue Act (DA) classification, including several aspects of text pre-processing and input representation that are often overlooked or underreported in the literature, such as the number of words to keep in the vocabulary or input sequences. We assess each of these on two DA-labelled corpora, using a range of supervised models representing those most frequently applied to the task. Additionally, we compare context-free word embedding models with transfer learning via pre-trained language models, including several based on the transformer architecture, such as Bidirectional Encoder Representations from Transformers (BERT) and XLNet, which have thus far not been widely explored for the DA classification task. Our findings indicate that these text pre-processing considerations do have a statistically significant effect on classification accuracy. Notably, we found that viable input sequence lengths and vocabulary sizes can be much smaller than is typically used in DA classification experiments, yielding no significant improvements beyond certain thresholds. We also show that in some cases the contextual sentence representations generated by language models do not reliably outperform supervised methods, though BERT and its derivative models do represent a significant improvement over supervised approaches and much of the previous work on DA classification.

Highlights

  • The concept of a Dialogue Act (DA) originated from John Austin’s ‘illocutionary act’ theory (Austin 1962) and was later developed by John Searle (1969), as a method of defining the semantic content and communicative function of a single utterance of dialogue

  • Sentence encoding and DA classification: we describe the various components of the sentence encoding process, with respect to the DA classification task, and provide details of each aspect investigated in this work

  • This work explores numerous factors that may affect the task of sentence encoding for the purpose of DA classification


Introduction

The concept of a Dialogue Act (DA) originated from John Austin’s ‘illocutionary act’ theory (Austin 1962) and was later developed by John Searle (1969) as a method of defining the semantic content and communicative function of a single utterance of dialogue. For example, Speaker A may use ‘Okay’ as confirmation that a response has been heard and understood. As such, it may be difficult, or impossible, to determine the communicative intent of a single dialogue utterance in isolation, and including contextual information results in superior performance over single-sentence approaches. The Switchboard corpus contains ~22,000 unique words (the exact figure varies depending on certain pre-processing decisions; see Section 3.1), yet different studies have elected to use vocabulary sizes in the range of 10,000 to 20,000 words (Ji et al. 2016; Lee and Dernoncourt 2016; Kumar et al. 2017; Li et al. 2018; Chen et al. 2018), while Wan et al. (2018) kept only words that appeared more than once within the corpus. We evaluate the impact of different sequence lengths on classification results (see Section 4.1.3).
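The two pre-processing decisions discussed above, capping the vocabulary size and fixing the input sequence length, can be sketched as follows. This is a minimal illustration, not the authors' actual pipeline; the function names (`build_vocab`, `encode`) and the `<pad>`/`<unk>` token conventions are our own assumptions.

```python
from collections import Counter

def build_vocab(utterances, max_vocab=10000):
    """Keep only the max_vocab most frequent words; all others map to <unk>."""
    counts = Counter(w for utt in utterances for w in utt.split())
    vocab = {"<pad>": 0, "<unk>": 1}
    for word, _ in counts.most_common(max_vocab):
        vocab[word] = len(vocab)
    return vocab

def encode(utterance, vocab, max_len=25):
    """Map an utterance to a fixed-length index sequence (truncate or pad)."""
    ids = [vocab.get(w, vocab["<unk>"]) for w in utterance.split()]
    ids = ids[:max_len]                              # truncate long utterances
    return ids + [vocab["<pad>"]] * (max_len - len(ids))  # pad short ones

utts = ["okay", "i see what you mean", "okay thanks"]
v = build_vocab(utts, max_vocab=5)
print(encode("okay thanks a lot", v, max_len=6))  # out-of-vocab words become <unk>
```

Shrinking `max_vocab` trades rare-word coverage for a smaller embedding matrix, and shrinking `max_len` discards utterance tails; the experiments described here measure where those losses stop affecting classification accuracy.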

