A semi-supervised high-quality pseudo labels algorithm based on multi-constraint optimization for speech deception detection

Huawei Tao, Hang Yu, Man Liu, Hongliang Fu, Chunhua Zhu, Yue Xie

doi:10.1016/j.csl.2023.101586

Abstract

Deep learning-based speech deception detection relies on large amounts of labeled data. However, when collecting such data, distinguishing truth from lies requires researchers to have professional expertise, which greatly limits the number of annotated samples. This study therefore focuses on improving the accuracy of lie detection when annotated data is insufficient. We propose a semi-supervised high-quality pseudo-label algorithm based on multi-constraint optimization (HQPL-MC) for speech deception detection. First, the algorithm exploits the latent feature information of unlabeled data using deep auto-encoder networks. Second, it achieves entropy minimization through pseudo labeling, which reduces the overlap between the class distributions of truthful and deceptive data. Finally, it improves the quality of the pseudo labels by optimizing the unlabeled loss and the reconstruction loss, further enhancing classification performance when labeled data is insufficient. We recorded an interview-style corpus and used it, together with the Columbia/SRI/Colorado (CSC) corpus, to evaluate the algorithm experimentally. The proposed algorithm achieves better detection performance than most state-of-the-art algorithms.
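
To make the interplay of the three loss terms named in the abstract more concrete, the following is a minimal sketch of a generic semi-supervised objective that sums a supervised cross-entropy term, a pseudo-label term on unlabeled samples, and an auto-encoder reconstruction term. It is not the HQPL-MC formulation itself: the abstract does not specify how the constraints are combined, so the confidence threshold, the weights `alpha` and `beta`, and all function names here are illustrative assumptions.

```python
import math


def softmax(logits):
    """Convert raw scores into class probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    """Negative log-likelihood of the given class index."""
    return -math.log(max(probs[label], 1e-12))


def pseudo_label_loss(probs, threshold=0.9):
    """Cross-entropy against the model's own most confident class.

    Assumption: low-confidence predictions are ignored (loss 0), a common
    way to filter out low-quality pseudo labels. Minimizing this term
    pushes predictions on unlabeled data toward low entropy.
    """
    confidence = max(probs)
    if confidence < threshold:
        return 0.0
    return cross_entropy(probs, probs.index(confidence))


def reconstruction_loss(x, x_hat):
    """Mean squared error between an input and its auto-encoder output."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def _mean(values):
    values = list(values)
    return sum(values) / len(values) if values else 0.0


def combined_loss(labeled, unlabeled, recon_pairs,
                  alpha=0.5, beta=0.1, threshold=0.9):
    """Weighted sum of the three terms (weights are hypothetical).

    labeled:     list of (logits, true_label) pairs
    unlabeled:   list of logits for unlabeled samples
    recon_pairs: list of (input_features, reconstructed_features) pairs
    """
    supervised = _mean(cross_entropy(softmax(l), y) for l, y in labeled)
    unsupervised = _mean(pseudo_label_loss(softmax(l), threshold)
                         for l in unlabeled)
    reconstruction = _mean(reconstruction_loss(x, xh)
                           for x, xh in recon_pairs)
    return supervised + alpha * unsupervised + beta * reconstruction
```

In this sketch, an unlabeled sample contributes to the objective only once the classifier is confident about it, while the reconstruction term applies to every sample regardless of whether it is labeled.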

Full Text