Privacy-Sensitive Audio Features for Speech/Nonspeech Detection

Sree Hari Krishnan Parthasarathi, Mathew Magimai.-Doss, Daniel Gatica-Perez, Hervé Bourlard

doi:10.1109/tasl.2011.2151857

Abstract

This paper investigates features for speech/nonspeech detection (SND) that carry little linguistic information from the speech signal. To this end, we present a comprehensive study of privacy-sensitive features for SND in multiparty conversations. Our study investigates three approaches to privacy-sensitive features, based on: 1) simple, instantaneous feature extraction methods; 2) excitation source information based methods; and 3) feature obfuscation methods, such as local (within 130 ms) temporal averaging and randomization, applied to excitation source information. To evaluate these approaches for SND, we use nearly 450 hours of multiparty conversational meeting data. On this dataset, we benchmark these features against standard spectral shape based features such as Mel frequency perceptual linear prediction (MFPLP). Fusion strategies combining excitation source with simple features show that comparable performance can be obtained in both close-talking and far-field microphone scenarios. As one way to objectively evaluate the notion of privacy, we conduct phoneme recognition studies on TIMIT. While excitation source features yield phoneme recognition accuracies between those of the simple features and the MFPLP features, obfuscation methods applied to the excitation features yield low phoneme accuracies in conjunction with SND performance comparable to that of MFPLP features.
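The abstract names local temporal averaging and randomization within 130 ms as obfuscation methods but does not say how they are implemented. The following is a minimal illustrative sketch of the general idea, not the authors' method. It assumes a 10 ms frame shift (so 130 ms spans 13 frames), that features arrive as a list of per-frame vectors, and that randomization means shuffling frame order inside each window. The function names `local_average` and `local_shuffle` are hypothetical.

```python
import random

FRAME_SHIFT_MS = 10   # ASSUMPTION: frame shift is not stated in the abstract
WINDOW_MS = 130       # local window length mentioned in the abstract
FRAMES_PER_WINDOW = WINDOW_MS // FRAME_SHIFT_MS  # 13 frames under the assumption


def local_average(frames, width=FRAMES_PER_WINDOW):
    """Collapse each block of `width` consecutive frames into one mean vector."""
    averaged = []
    for start in range(0, len(frames), width):
        block = frames[start:start + width]
        dim = len(block[0])
        # Per-dimension mean over the frames in this block.
        mean = [sum(frame[d] for frame in block) / len(block) for d in range(dim)]
        averaged.append(mean)
    return averaged


def local_shuffle(frames, width=FRAMES_PER_WINDOW, seed=None):
    """Randomly permute the order of frames inside each block of `width` frames."""
    rng = random.Random(seed)
    shuffled = []
    for start in range(0, len(frames), width):
        block = list(frames[start:start + width])
        rng.shuffle(block)  # ordering is scrambled only within the local window
        shuffled.extend(block)
    return shuffled


if __name__ == "__main__":
    # Toy two-dimensional features for 26 frames (two full windows).
    toy = [[float(i), 2.0 * i] for i in range(26)]
    print(len(local_average(toy)))           # one vector per window
    print(len(local_shuffle(toy, seed=0)))   # same number of frames, reordered locally
```

Both operations keep coarse, window-level statistics while discarding the fine temporal detail inside each window, which is the intuition behind using them to reduce linguistic content.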

Journal: IEEE Transactions on Audio, Speech, and Language Processing
Publication Date: Nov 1, 2011
Citations: 35

Similar Papers

Speaker change detection using excitation source and vocal tract system information
Mousmita Sarma ... Sree Nilendra Gadre
01 Feb 2015

Characterization and recognition of emotions from speech using excitation source information
Sreenivasa Rao Krothapalli ... Shashidhar G Koolagudi
International Journal of Speech Technology | VOL. 16
05 Sep 2012

End Point Detection Using Speech-Specific Knowledge for Text-Dependent Speaker Verification
Ramesh K Bhukya ... S R Mahadeva Prasanna
Circuits, Systems, and Signal Processing | VOL. 37
04 May 2018

Effective use of combined excitation source and vocal-tract information for speaker recognition tasks
Krishna Dutta ... Debadatta Pati
International Journal of Speech Technology | VOL. 21
29 Oct 2018
