Abstract

Disfluencies are self-corrections in spontaneous speech, including filled pauses, repetitions, repairs, and false starts. The task of disfluency detection is to identify these phenomena in spoken language and normalize it toward written text. Recent research has applied machine learning and deep learning approaches to disfluency recognition and classification. Long-distance dependency is one of the core issues in disfluency detection. To address it, most existing approaches combine many hand-crafted features with the words themselves as input, but designing such features is time-consuming and labor-intensive. Some studies reduce the dependence on hand-crafted features, but they ignore dependencies across long sentences. In this article, we treat disfluency detection as a sequence labeling problem and apply Bi-LSTM and attention mechanisms to it. In particular, building on the rough-copy dependencies captured by the auto-correlational neural network (ACNN), we improve the ACNN model to handle long-term dependencies, so that dependencies between words are captured better. In other words, our method not only finds rough-copy relationships without additional hand-crafted features, but also captures dependencies across long sentences. Experiments on the commonly used English Switchboard test set show that our approach achieves good performance compared to previous models that use text only, without other hand-crafted features.
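To make the sequence-labeling framing concrete, the following is a minimal sketch (not the paper's model): each token receives a label such as "E" (part of a disfluency to be removed) or "F" (fluent), and stripping the "E" tokens yields the written-text form. In the paper these labels would be predicted by the Bi-LSTM/attention model; here the labels, tag names, and example sentence are all hypothetical, hand-assigned for illustration only.

```python
def remove_disfluencies(tokens, labels):
    """Keep only the tokens labeled fluent ('F').

    In a real system the labels would come from a trained sequence
    labeler (e.g. a Bi-LSTM with attention); here they are given.
    """
    return [tok for tok, lab in zip(tokens, labels) if lab == "F"]

# Hypothetical Switchboard-style utterance with a repair:
# "I want a flight to Boston uh I mean to Denver"
tokens = ["I", "want", "a", "flight", "to", "Boston",
          "uh", "I", "mean", "to", "Denver"]
labels = ["F", "F", "F", "F", "E", "E",
          "E", "E", "E", "F", "F"]

print(" ".join(remove_disfluencies(tokens, labels)))
# → I want a flight to Denver
```

Note how the reparandum "to Boston" is a rough copy of the repair "to Denver"; exploiting this self-similarity is what the auto-correlational mechanism described above is designed for.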
