Coding Of Speech Signals Research Articles

This study addresses methods for improving the efficiency of semantic coding of speech signals, and its purpose is to develop such a method. Coding efficiency here means reducing the information transmission rate at a given probability of error-free recognition of the semantic features of speech signals, which significantly reduces the bandwidth required by the source and thereby increases the effective capacity of the communication channel. To achieve this goal, the following scientific tasks must be solved: (1) to investigate a known method for improving the efficiency of semantic coding of speech signals based on mel-frequency cepstral coefficients; (2) to substantiate the effectiveness of the adaptive empirical wavelet transform in multiple-scale analysis and semantic coding of speech signals; (3) to develop a method of semantic coding of speech signals based on the adaptive empirical wavelet transform, followed by Hilbert spectral analysis and optimal thresholding; and (4) to perform an objective quantitative assessment of the efficiency gain of the developed method relative to the existing method.
The following scientific results were obtained during the study. First, a method of semantic coding of speech signals based on the empirical wavelet transform was developed for the first time. It differs from existing methods by constructing a set of adaptive bandpass Meyer wavelet filters and then applying Hilbert spectral analysis to find the instantaneous amplitudes and frequencies of the functions of internal empirical modes, which allows the semantic features of speech signals to be identified and increases the efficiency of their coding. Second, the adaptive empirical wavelet transform was proposed for the first time for multiple-scale analysis and semantic coding of speech signals; it increases the efficiency of spectral analysis by decomposing the high-frequency speech oscillation into its low-frequency components, namely internal empirical modes. Third, the method of semantic coding of speech signals based on mel-frequency cepstral coefficients was further developed by applying the basic principles of adaptive spectral analysis with the empirical wavelet transform, which increases the efficiency of this method. Conclusions: We developed a method for semantic coding of speech signals based on the empirical wavelet transform, which reduces the encoding rate from 320 to 192 bps and the required bandwidth from 40 to 24 Hz at a probability of error-free recognition of approximately 0.96 (96%) and a signal-to-noise ratio of 48 dB, so that its efficiency is 1.6 times that of the existing method. We also developed an algorithm for semantic coding of speech signals based on the empirical wavelet transform and implemented it in software in MATLAB R2022b.
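The reported gain can be checked with simple arithmetic. The sketch below assumes that "efficiency" is read as the ratio of the old value to the new one; that reading, and the helper name `reduction_ratio`, are our assumptions and are not defined in the abstract. Under this reading both the bit-rate and the bandwidth figures give 5/3, roughly 1.67, so the quoted factor of 1.6 may be a truncated value.

```python
# Figures quoted in the abstract above.
RATE_BEFORE_BPS = 320
RATE_AFTER_BPS = 192
BAND_BEFORE_HZ = 40
BAND_AFTER_HZ = 24


def reduction_ratio(before, after):
    """Ratio of the old value to the new one (assumed meaning of the gain)."""
    return before / after


rate_gain = reduction_ratio(RATE_BEFORE_BPS, RATE_AFTER_BPS)
band_gain = reduction_ratio(BAND_BEFORE_HZ, BAND_AFTER_HZ)

# Fraction of the original bit rate that is saved.
rate_saving = 1 - RATE_AFTER_BPS / RATE_BEFORE_BPS

# Bits per second carried per hertz, before and after.
density_before = RATE_BEFORE_BPS / BAND_BEFORE_HZ
density_after = RATE_AFTER_BPS / BAND_AFTER_HZ

print(f"bit-rate gain:  {rate_gain:.3f}")
print(f"bandwidth gain: {band_gain:.3f}")
print(f"bit-rate saving: {rate_saving:.0%}")
print(f"bps per Hz before/after: {density_before:.1f} / {density_after:.1f}")
```

The bit rate per hertz is the same before and after, so the two reductions quoted in the abstract are consistent with each other.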

The auditory pathway is an excellent system for studying temporal aspects of neuronal processing. Unlike in other sensory systems, temporal cues span an extremely wide range of timescales. For sound localization, interaural time differences are extracted with a precision of tens of microseconds. Phase-locking of auditory nerve responses, which is related to the coding of the temporal fine structure, occurs from the lowest audible frequencies up to probably 3 kHz in humans. Amplitude modulations in speech signals are processed in the range of milliseconds to tens of milliseconds. Finally, the energy of spoken speech itself is modulated at a frequency of about 4 Hz, corresponding to syllable durations on the order of a few hundred ms. To extract temporal cues at all timescales, it is important to understand how temporal information is coded. We investigate temporal coding of speech signals using the methods of information theory and a model of the human inner ear. The model consists of a traveling-wave model, a nonlinear compression stage that mimics the function of the cochlear amplifier, and a model of the sensory cells, the afferent synapse and spike generation (Sumner), which we extended to replicate offset adaptation (Zhang). We used the action potentials of the auditory nerve to drive Hodgkin-Huxley-type point models of various neurons in the cochlear nucleus. Here we report data only from onset neurons, which exhibit extraordinarily fast membrane time constants below 1 ms. Onset neurons are known for their precise temporal processing. They achieve precisely timed action potentials by coincidence detection: they fire only if at least 10% of the auditory nerve fibers that innervate them fire synchronously. Using information theory, we analyzed the transmitted information rate coded in the neural spike trains of modeled cochlear nucleus neurons in response to vowels.
We found that onset neurons are able to code temporal information with sub-millisecond precision (<0.02 ms) across a wide range of characteristic frequencies. Temporal information is coded by precisely timed spikes per se, not only by temporal fine structure. Moreover, the major portion of the information (60%) is coded with a temporal precision between 0.2 and 4 ms. Enhancing the temporal resolution from 10 ms to 3 ms and from 3 ms to 0.3 ms is expected to increase the transmitted information approximately twofold and 2.5-fold, respectively. In summary, our results provide quantitative insight into the temporal processing strategies of neuronal speech processing. We conclude that coding of information in the time domain might be essential to complement the rate-place code, especially in adverse acoustical environments. Acknowledgments: Supported within the Munich Bernstein Center for Computational Neuroscience by the German Federal Ministry of Education and Research (reference numbers 01GQ0441 and 01GQ0443).
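The coincidence-detection rule described above (an onset neuron fires only when at least 10% of its input fibers fire together) can be illustrated with a toy threshold detector. This is a minimal sketch and not the Hodgkin-Huxley-type model used in the study: the function name `onset_neuron_spikes`, the 0.5 ms coincidence window, and the use of the same window as a dead time after each output spike are all assumptions made for illustration.

```python
def onset_neuron_spikes(fiber_spike_times_ms, window_ms=0.5, min_percent=10):
    """Toy coincidence detector (illustrative assumption, not the study's model).

    fiber_spike_times_ms: one list of spike times (in ms) per input fiber.
    Returns the times at which at least `min_percent` percent of the fibers
    have fired within the last `window_ms` milliseconds.
    """
    n_fibers = len(fiber_spike_times_ms)
    # Integer ceiling of min_percent% of the fibers, at least one fiber.
    needed = max(1, (min_percent * n_fibers + 99) // 100)

    # Merge all input spikes into one time-ordered list of (time, fiber).
    events = sorted(
        (t, fiber)
        for fiber, times in enumerate(fiber_spike_times_ms)
        for t in times
    )

    output = []
    start = 0
    last_output = float("-inf")
    for end, (t_end, _) in enumerate(events):
        # Slide the window start so it only covers the last window_ms.
        while t_end - events[start][0] > window_ms:
            start += 1
        active_fibers = {fiber for _, fiber in events[start:end + 1]}
        # Fire once per coincidence; window_ms doubles as a crude dead time.
        if len(active_fibers) >= needed and t_end - last_output > window_ms:
            output.append(t_end)
            last_output = t_end
    return output


# 20 input fibers, so at least 2 must fire within 0.5 ms.
fibers = [[] for _ in range(20)]
fibers[0] = [5.0, 12.0]
fibers[1] = [5.1]
fibers[2] = [5.2]
fibers[3] = [30.0]
spikes = onset_neuron_spikes(fibers)
print(spikes)
```

In this example only the cluster of input spikes near 5 ms is coincident enough to trigger an output; the isolated input spikes at 12 ms and 30 ms are ignored, which is the property that makes such neurons precise onset markers.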

Articles published on Coding Of Speech Signals

A method for extracting the semantic features of speech signal recognition based on empirical wavelet transform

Warped Linear Predictive Coding of Speech Signal of Processing

METHODS OF FACTORIAL CODING OF SPEECH SIGNALS

Efficient Remote Access System Based on Coded Speech Signals

Design and Comparison of Vector Quantization Codebooks for Narrowband Speech Coding

An Algorithm for Simple Differential Speech Coding Based on Backward Adaptation Technique

A subspace based progressive coding method for speech compression

Speech Signal Coding Using Forward Adaptive Quantization and Simple Transform Coding

Two-Channel Quadrature Mirror Filter Bank: An Overview

Cochlear Implant Speech Processing Using Wavelet Transform

Sparse Linear Prediction and Its Applications to Speech Processing

Segmental Sinusoidal Model for Speech Signal Coding (MODEL SINUSOIDA SECARA SEGMENTAL UNTUK PENGKODEAN SINYAL SUARA)

Subband Coding of Speech Signals Using Decimation and Interpolation

Temporal precision of speech coded into nerve-action potentials

Application of Linear Prediction Coefficients Interpolation in Speech Signal Coding

Compression of surface EMG signals with algebraic code excited linear prediction

Application of the wavelet transform to the low‐bit‐rate speech coding system

Method for coding speech and music signals

Hammerstein Model for Speech Coding

Model-Based Speech Signal Coding Using Optimized Temporal Decomposition for Storage and Broadcasting Applications
