Comparing neural network architectures for non-intrusive speech quality prediction

Leif Førland Schill, Tobias Piechowiak, Clément Laroche, Pejman Mowlaee

doi:10.1016/j.specom.2024.103123

Abstract

Non-intrusive speech quality predictors evaluate speech quality without a reference signal, which makes them useful in many practical applications. Recently, neural networks have shown the best performance on this task. Two such models in the literature are the convolutional neural network-based DNSMOS and the bi-directional long short-term memory-based Quality-Net, which were originally trained to predict subjective targets and intrusive PESQ scores, respectively. In this paper, both architectures are trained on a single dataset and used to predict the intrusive ViSQOL score. They are evaluated on several test sets covering a variety of mismatch conditions, including unseen speech and noise corpora and common voice over IP distortions. The experiments show that the models achieve similar predictive ability on the training distribution and generalize well overall to new noise and speech corpora. Both models generalize poorly to unseen distortions, DNSMOS especially. Our results also suggest that pervasive ambient noise in the training set can cause problems when generalizing to certain types of noise. Finally, we detail how the ViSQOL score can have undesirable dependencies on the reference pressure level and the voice activity level.

Journal: Speech Communication
Publication Date: Aug 30, 2024
License type: cc-by

Similar Papers

Non-intrusive speech quality assessment using context-aware neural networks
Rahul Kumar Jaiswal ... Rajesh Kumar Dubey
International Journal of Speech Technology | VOL. 25
23 Oct 2022

Non-Intrusive Speech Quality with Different Time Scale
Mr Mohan Singh
IOSR Journal of Computer Engineering | VOL. 2
01 Jan 2012

Towards a Non-Intrusive Context-Aware Speech Quality Model
Rahul Jaiswal ... Andrew Hines
01 Jun 2020

MetricNet: Towards Improved Modeling For Non-Intrusive Speech Quality Assessment
Meng Yu ... Chunlei Zhang
30 Aug 2021
