Abstract

Dialog act (DA) classification helps reveal the intentions of a human speaker, and effective DA classification can be exploited in realistic implementations of expert systems. In this work, we investigate DA classification using both acoustic and discourse information from the HCRC MapTask data. We extract several different acoustic features and classify the acoustic information using a Hidden Markov Model (HMM) network. For discourse feature extraction, we propose a novel part-of-speech (POS) tagging technique that effectively reduces the dimensionality of the discourse features. To classify the discourse information, we employ two classifiers, an HMM and a Support Vector Machine (SVM), and further combine them through classifier fusion to improve discourse classification. Finally, we perform an efficient decision-level classifier fusion of the acoustic and discourse information to classify 12 different DAs in the MapTask data. We obtain 65.2% and 55.4% DA classification rates using acoustic and discourse information, respectively, and a combined accuracy of 68.6% when both are used. These accuracy rates are comparable to or better than previously reported results on the same data set. We obtain average precision and recall rates of 74.89% and 69.83%, respectively, which are considerably better for most of the classified DAs than those reported in existing works on the same HCRC MapTask data set.
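The decision-level fusion mentioned above can be sketched as a weighted combination of the posterior scores produced by the two classifiers. The sketch below is illustrative only: the fusion weight, the `fuse_decisions` helper, and the use of a simple weighted sum are assumptions, not the paper's exact method, and the 12 labels follow the standard HCRC MapTask move set.

```python
import numpy as np

# The 12 standard HCRC MapTask dialog-act (move) labels.
DA_LABELS = [
    "instruct", "explain", "check", "align", "query-yn", "query-w",
    "acknowledge", "reply-y", "reply-n", "reply-w", "clarify", "ready",
]

def fuse_decisions(p_acoustic, p_discourse, alpha=0.6):
    """Hypothetical decision-level fusion: weighted sum of the two
    classifiers' posterior vectors, followed by an argmax.

    alpha is an assumed fusion weight for the acoustic classifier;
    the paper's actual combination rule and weighting may differ."""
    p_acoustic = np.asarray(p_acoustic, dtype=float)
    p_discourse = np.asarray(p_discourse, dtype=float)
    combined = alpha * p_acoustic + (1.0 - alpha) * p_discourse
    return DA_LABELS[int(np.argmax(combined))]
```

For example, if the acoustic classifier strongly favors "instruct" while the discourse classifier favors "explain", an acoustic weight of 0.6 lets the acoustic evidence dominate the fused decision.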
