Replay Speech Answer-Sheet Detection on Intelligent Language Learning System Based on Power Spectrum Decomposition

Qingzhu Wu, Zhengyu Zhu, Shaowei Xiong

doi:10.1109/access.2021.3098058

Abstract

Replay speech answer-sheet detection is an urgent problem for intelligent language learning systems. Traditional features for replay speech detection are usually extracted from the power spectrum. However, the power spectrum may not be the optimal spectrum for this task because it does not account for the characteristics of replay speech. To address this limitation, this paper proposes a power spectrum decomposition method for replay speech answer-sheet detection on intelligent language learning systems. The log frame-wise normalization spectrum (LFNS) and the log spectral energy (LSE), both of which account for the characteristics of replay speech, are obtained by decomposing the log power spectrum based on the constant-Q transform. Two further features are then derived from LFNS and LSE. The first, constant-Q normalization octave coefficients (CNOC), is obtained by combining LFNS with an octave subband transform. The second, CNOC-LSE, is obtained by combining CNOC and LSE. LFNS, CNOC and CNOC-LSE are then fed into frame-based and utterance-based neural networks. Experimental results show that the proposed LFNS outperforms the conventional log power spectrum, and that CNOC and CNOC-LSE perform better than most commonly used features. The utterance-based neural network outperforms the frame-based neural network given the same inputs. In addition, handcrafted features perform worse than the corresponding spectrum for the utterance-based neural network, while the opposite holds for the frame-based neural network.

Highlights

Owing to the development of deep learning, artificial intelligence (AI) technology has been applied to language teaching and learning in recent years
To improve replay speech answer-sheet detection, this paper proposes decomposing the log-scale power spectrum into the log frame-wise normalization spectrum (LFNS) and the log spectral energy (LSE), both of which account for the characteristics of replay speech

Summary

INTRODUCTION

Owing to the development of deep learning, artificial intelligence (AI) technology has been applied to language teaching and learning in recent years. If features can be extracted from the frame-wise normalization spectrum and from the energy information, replay speech answer-sheet detection can benefit significantly. To improve detection performance, this paper proposes decomposing the log-scale power spectrum into the log frame-wise normalization spectrum (LFNS) and the log spectral energy (LSE), both of which account for the characteristics of replay speech. This is the first contribution of the work.
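The decomposition described above can be illustrated with a minimal sketch. This page does not give the paper's exact definitions, so the following rests on assumptions: the spectral energy of a frame is taken to be the sum of its power values over frequency bins, and the frame-wise normalized spectrum is each power value divided by that sum, so that the log power spectrum equals LFNS plus LSE for every bin. The function name `decompose_log_power_spectrum`, the `eps` floor, and the toy input are hypothetical; in the paper the input would be a constant-Q transform power spectrum rather than hand-written numbers.

```python
import math


def decompose_log_power_spectrum(power_spec, eps=1e-12):
    """Split a power spectrogram into (LFNS, LSE) under the assumed definitions.

    power_spec: list of frames; each frame is a list of non-negative
                power values, one per frequency bin.
    Returns:
        lfns: list of frames of log(power / frame energy)
        lse:  list with one log frame energy per frame
    """
    lfns, lse = [], []
    for frame in power_spec:
        # Assumed definition: frame energy is the sum of power over bins.
        log_energy = math.log(sum(frame) + eps)
        lse.append(log_energy)
        # log(P / E) = log P - log E, so log P = LFNS + LSE per bin.
        lfns.append([math.log(p + eps) - log_energy for p in frame])
    return lfns, lse


# Toy example: the second frame is the first one attenuated by a factor of 4.
frames = [
    [1.0, 2.0, 4.0, 1.0],
    [0.25, 0.5, 1.0, 0.25],
]
lfns, lse = decompose_log_power_spectrum(frames)
print("LFNS frame 0:", [round(v, 4) for v in lfns[0]])
print("LFNS frame 1:", [round(v, 4) for v in lfns[1]])
print("LSE:", [round(v, 4) for v in lse])
```

Under these assumptions the two toy frames share the same LFNS, because they have the same spectral shape, and differ only in LSE, which carries the overall level. This is one way to read the separation of spectral shape from energy that the text describes.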

POWER SPECTRUM DECOMPOSITION

EXPERIMENTS AND EVALUATIONS

Findings

CONCLUSION

Journal: IEEE Access
Publication Date: Jan 1, 2021
Citations: 3
License type: CC BY 4.0

Similar Papers

Reference-free quantification of EEG spectra: Combining current source density (CSD) and frequency principal components analysis (fPCA)
Craig E Tenke ... Jürgen Kayser
Clinical Neurophysiology | VOL. 116
28 Oct 2005

Device Features Based on Linear Transformation With Parallel Training Data for Replay Speech Detection
Longting Xu ... Xinyuan Qian
IEEE/ACM Transactions on Audio, Speech, and Language Processing | VOL. 31
01 Jan 2023

Enhancement of speech dynamics for voice activity detection using DNN
Suci Dwijayanti ... Masato Miyoshi
EURASIP Journal on Audio, Speech, and Music Processing | VOL. 2018
12 Sep 2018

Effective dark matter power spectra in f(R) gravity
Jian-Hua He ... Baojiu Li
Physical Review D | VOL. 92
05 Nov 2015
