Shennong: A Python toolbox for audio speech features extraction.

Mathieu Bernard, Emmanuel Dupoux, Julien Karadayi, Maxime Poli

doi:10.3758/s13428-022-02029-6

Abstract

We introduce Shennong, a Python toolbox and command-line utility for audio speech feature extraction. It implements a wide range of well-established state-of-the-art algorithms: spectro-temporal filters such as Mel-Frequency Cepstral Filterbank or Predictive Linear Filters, pre-trained neural networks, pitch estimators, speaker normalization methods, and post-processing algorithms. Shennong is an open-source, reliable, and extensible framework built on top of the popular Kaldi speech processing library. The Python implementation makes it easy for non-technical users to use and integrates with third-party speech modeling and machine learning tools from the Python ecosystem. This paper describes the Shennong software architecture, its core components, and the implemented algorithms. Three applications then illustrate its use. We first present a benchmark of the speech feature extraction algorithms available in Shennong on a phone discrimination task. We then analyze the performance of a speaker normalization model as a function of the speech duration used for training. Finally, we compare pitch estimation algorithms on speech under various noise conditions.

Journal: Behavior Research Methods
Publication Date: Feb 7, 2023
Citations: 1

Similar Papers

Improved steelpan pitch detection through audio feature extraction and machine learning
Colin Malloy
The Journal of the Acoustical Society of America | VOL. 149
01 Apr 2021

Emotion recognition using semi-supervised feature selection with speaker normalization
Yaxin Sun ... Guihua Wen
International Journal of Speech Technology | VOL. 18
04 Feb 2015

Automatic detection of the second subglottal resonance and its application to speaker normalization
Shizhen Wang ... Steven M Lulich
The Journal of the Acoustical Society of America | VOL. 126
01 Dec 2009

Towards an Intelligent Acoustic Front End for Automatic Speech Recognition: Built-in Speaker Normalization
Umit H Yapanel ... John H L Hansen
EURASIP Journal on Audio, Speech, and Music Processing | VOL. 2008
01 Jan 2008
