NISQE: Non-Intrusive Speech Quality Evaluator Based on Natural Statistics of Mean Subtracted Contrast Normalized Coefficients of Spectrogram

Shakeel Zafar, Jihyoung Ryu, Muhammad Majid, Imran Fareed Nizami, Mobeen Ur Rehman

doi:10.3390/s23125652

Abstract

As technology has evolved, voice-based communication has gained importance in applications such as online conferencing, online meetings, and voice over Internet Protocol (VoIP). Factors such as environmental noise, encoding and decoding of the speech signal, and technological limitations may degrade the quality of the speech signal, so continuous quality assessment of the speech signal is required. Speech quality assessment (SQA) enables a system to automatically tune network parameters to improve speech quality. Many speech transmitters and receivers used for voice processing, including mobile devices and high-performance computers, can also benefit from SQA, and SQA plays a significant role in the evaluation of speech-processing systems. Non-intrusive speech quality assessment (NI-SQA) is challenging because pristine speech signals are unavailable in real-world scenarios. The success of NI-SQA techniques relies heavily on the features used to assess speech quality. Various NI-SQA methods extract features from speech signals in different domains, but they do not take the natural structure of the speech signal into account when assessing speech quality. This work proposes an NI-SQA method based on the natural structure of speech signals, approximated using natural spectrogram statistical (NSS) properties derived from the speech signal spectrogram. A pristine speech signal follows a structured natural pattern that is disrupted when distortion is introduced. The deviation of NSS properties between pristine and distorted speech signals is used to predict speech quality.
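The title names mean subtracted contrast normalized (MSCN) coefficients of the spectrogram as the basis for these statistics. The following is a minimal sketch of how such coefficients might be computed, assuming the definition commonly used in natural scene statistics for images (a Gaussian-weighted local mean and standard deviation). The abstract does not give the exact formulation, so the function names, window sizes, `sigma`, and the stabilizing constant `c` are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def log_spectrogram(signal, frame_len=512, hop=256, eps=1e-10):
    """Log-power spectrogram from a Hann-windowed short-time FFT (assumed settings)."""
    signal = np.asarray(signal, dtype=np.float64)
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    return np.log(power + eps).T  # shape: (frequency bins, frames)


def gaussian_kernel(sigma=7 / 6, radius=3):
    """Normalized 1-D Gaussian kernel of length 2 * radius + 1."""
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x ** 2) / (2 * sigma ** 2))
    return kernel / kernel.sum()


def local_mean(img, kernel):
    """Separable Gaussian-weighted local mean with edge replication."""
    radius = len(kernel) // 2
    padded = np.pad(img, radius, mode="edge")
    rows = np.apply_along_axis(np.convolve, 1, padded, kernel, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, kernel, mode="valid")


def mscn(spec, c=1.0, sigma=7 / 6, radius=3):
    """MSCN coefficients: (spec - local mean) / (local std + c)."""
    spec = np.asarray(spec, dtype=np.float64)
    kernel = gaussian_kernel(sigma, radius)
    mu = local_mean(spec, kernel)
    var = local_mean(spec * spec, kernel) - mu * mu
    sd = np.sqrt(np.maximum(var, 0.0))  # clamp tiny negative rounding errors
    return (spec - mu) / (sd + c)


# Synthetic stand-ins for a pristine and a distorted signal (not real speech).
rng = np.random.default_rng(0)
t = np.arange(16000) / 16000
clean = np.sin(2 * np.pi * 440 * t)
noisy = clean + 0.5 * rng.standard_normal(t.size)

coeffs_clean = mscn(log_spectrogram(clean))
coeffs_noisy = mscn(log_spectrogram(noisy))
print(coeffs_clean.shape, coeffs_clean.std(), coeffs_noisy.std())
```

Comparing summary statistics of `coeffs_clean` and `coeffs_noisy` gives a rough feel for how distortion changes the coefficient distribution. Which statistics the method actually extracts as features is not stated in the abstract.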
The proposed methodology performs better than state-of-the-art NI-SQA methods on the Centre for Speech Technology Voice Cloning Toolkit corpus (VCTK-Corpus), with a Spearman’s rank-order correlation coefficient (SRC) of 0.902, a Pearson correlation coefficient (PCC) of 0.960, and a root mean squared error (RMSE) of 0.206. On the NOIZEUS-960 database, it achieves an SRC of 0.958, a PCC of 0.960, and an RMSE of 0.114.
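The three reported metrics can be illustrated with a short sketch. It assumes that predicted quality scores are compared against reference quality scores such as mean opinion scores, which the abstract does not specify. The score lists below are hypothetical values chosen for illustration, not data from the paper.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient (PCC) between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank-order correlation (SRC): Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def rmse(x, y):
    """Root mean squared error (RMSE) between two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


# Hypothetical reference scores and model predictions, for illustration only.
reference = [1.2, 2.5, 3.1, 3.8, 4.6]
predicted = [1.5, 2.2, 3.3, 3.6, 4.4]

print("SRC :", spearman(reference, predicted))
print("PCC :", pearson(reference, predicted))
print("RMSE:", rmse(reference, predicted))
```

SRC depends only on the ordering of the predictions, PCC measures linear agreement, and RMSE measures absolute error on the score scale, so the three numbers capture different aspects of prediction quality.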

Journal: Sensors
Publication Date: Jun 16, 2023
License type: CC BY 4.0

Similar Papers

Non-intrusive objective speech quality assessment using a combination of MFCC, PLP and LSF features
Rajesh Kumar Dubey ... Arun Kumar
01 Dec 2013

Non-intrusive speech quality assessment using context-aware neural networks
Rahul Kumar Jaiswal ... Rajesh Kumar Dubey
International Journal of Speech Technology | VOL. 25
23 Oct 2022

Non-intrusive speech quality assessment: A survey
Kailai Shen ... Zhe Ye
Neurocomputing | VOL. 580
06 Mar 2024

MetricNet: Towards Improved Modeling For Non-Intrusive Speech Quality Assessment
Meng Yu ... Chunlei Zhang
30 Aug 2021
