Abstract
Existing speech-based deception detection algorithms are severely restricted by the scarcity of labelled data, while large amounts of easily obtainable unlabelled data go unused in practice. To address this problem, this paper proposes a semi-supervised additive noise autoencoder model for deception detection. The model updates and optimizes the semi-supervised autoencoder and consists of a two-layer encoder, a two-layer decoder, and a classifier. First, it changes the activation function of the hidden layers in the network according to the characteristics of deceptive speech. Second, to prevent over-fitting during training, dropout with a carefully chosen ratio is applied at each layer. Finally, the supervised classification task is connected directly to the output of the encoder, making the network more concise and efficient. Using the feature set specified by the INTERSPEECH 2009 Emotion Challenge, experimental results on the Columbia-SRI-Colorado (CSC) corpus and our own deception corpus show that the proposed model achieves better performance than alternative methods when only a small amount of labelled data is available.
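The architecture described above (noisy input, a two-layer encoder with dropout, a decoder for reconstruction, and a classifier attached to the encoder output) can be sketched in a minimal forward pass. This is an illustrative NumPy sketch, not the authors' implementation: the layer sizes, the tanh activation, the noise level, and the dropout ratio are all assumptions, and the 384-dimensional input merely reflects the size of the INTERSPEECH 2009 feature set.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed dimensions: 384-dim INTERSPEECH 2009 feature vectors,
# a two-layer encoder/decoder, and binary (truth/deception) output.
D_IN, H1, H2, N_CLASSES = 384, 128, 64, 2

def init(shape):
    return rng.normal(scale=0.1, size=shape)

W1, W2 = init((D_IN, H1)), init((H1, H2))   # encoder weights
V1, V2 = init((H2, H1)), init((H1, D_IN))   # decoder weights
Wc = init((H2, N_CLASSES))                  # classifier head on encoder output

def forward(x, noise_std=0.1, drop=0.5, train=True):
    """Additive-noise autoencoder with a classifier on the encoder output."""
    if train:
        x = x + rng.normal(scale=noise_std, size=x.shape)  # additive noise
    h1 = np.tanh(x @ W1)                                   # hidden activation (assumed tanh)
    if train:                                              # dropout against over-fitting
        h1 *= rng.binomial(1, 1 - drop, h1.shape) / (1 - drop)
    z = np.tanh(h1 @ W2)                                   # encoder output
    x_hat = np.tanh(z @ V1) @ V2                           # decoder reconstruction
    logits = z @ Wc                                        # supervised branch
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    return x_hat, p / p.sum(axis=1, keepdims=True)

x = rng.normal(size=(8, D_IN))             # a mini-batch of feature vectors
y = rng.integers(0, N_CLASSES, size=8)     # labels for the supervised branch
x_hat, probs = forward(x)
recon_loss = np.mean((x - x_hat) ** 2)                       # unsupervised term
ce_loss = -np.mean(np.log(probs[np.arange(8), y] + 1e-12))   # supervised term
joint_loss = ce_loss + recon_loss          # both objectives trained jointly
```

The key design point mirrored here is that the classification loss and the reconstruction loss are combined into one objective, rather than running unsupervised pre-training followed by supervised fine-tuning.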
Highlights
The proposed semi-supervised additive noise autoencoder (SS-ANE) is compared with the semi-supervised autoencoder (SS-AE) and other models frequently used in speech-based deception detection
As the experimental results show, our model achieves the best performance among the compared models when the same amount of labelled data is provided: it obtains 59.52% and 62.78% accuracy with only 500 and 1000 labelled samples on the CSC corpus, comparable with the best accuracy of 62.88% obtained by a Deep Belief Network (DBN) that uses all labelled data for training
DBN and Deep Boltzmann Machines (DBM) first perform unsupervised learning and then fine-tune with supervised learning, so a potential conflict between the two objectives can degrade classification; this further demonstrates the effectiveness of our approach of conducting supervised and unsupervised learning simultaneously
Summary
This paper focuses on how to utilize labelled and unlabelled data more efficiently, that is, on applying semi-supervised learning to achieve better deception detection. Semi-supervised ladder networks and variational autoencoders [13, 16] have been introduced, achieving satisfactory accuracy with only a few hundred labels in image classification. Deng et al. proposed the semi-supervised autoencoder (SS-AE) for speech emotion recognition [14], which combines an autoencoder with a classifier and assigns unlabelled data to an additional class; results on several data sets demonstrate the model's excellent performance.
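The extra-class trick mentioned for SS-AE, where unlabelled samples are mapped to one additional class index so they can flow through the same classifier, can be illustrated with a short sketch. The class count and encoding below are assumptions for illustration, not details taken from [14].

```python
import numpy as np

N_REAL_CLASSES = 2            # truth vs. deception
UNLABELLED = N_REAL_CLASSES   # extra class index reserved for unlabelled samples

def encode_targets(labels):
    """One-hot encode over K+1 classes; None marks an unlabelled sample."""
    onehot = np.zeros((len(labels), N_REAL_CLASSES + 1))
    for i, y in enumerate(labels):
        onehot[i, UNLABELLED if y is None else y] = 1.0
    return onehot

# Mixed batch: two labelled, two unlabelled, one labelled sample.
targets = encode_targets([0, 1, None, None, 1])
```

With this encoding, labelled and unlabelled examples share one classification objective, which is what lets the autoencoder and the classifier be trained together on the full data set.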