Abstract

An efficient algorithm and its corresponding VLSI architecture for the critical-band transform (CBT) are developed to approximate the critical-band filtering of the human ear. The CBT consists of a constant-bandwidth transform in the lower frequency range and a Brown constant-Q transform (CQT) in the higher frequency range. The VLSI architecture achieves significant power efficiency by reducing the computational complexity, using pipelined and parallel processing, and applying supply voltage scaling. A 21-band Bark-scale CBT processor with a sampling rate of 16 kHz is designed and simulated. Simulation results verify its suitability for short-time spectral analysis of speech. Compared with other methods, it fits the critical-band analysis of the human ear more closely and requires significantly fewer computations, and is therefore more energy-efficient. Implemented in a 0.35 µm CMOS technology, it processes a 160-point speech frame in 4.99 ms at 234 kHz. The power dissipation is 15.6 µW at 1.1 V, an 82.1% power reduction compared to a benchmark 256-point FFT processor.
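The split between a constant-bandwidth region and a constant-Q region can be seen directly in the Bark scale. The sketch below is an illustration only, not the paper's algorithm: it assumes the commonly tabulated Zwicker band edges, takes the arithmetic midpoint of each band as its center (an assumption), and uses a hypothetical helper `band_table` to list each band's bandwidth and Q (center divided by bandwidth).

```python
# Commonly tabulated Bark band edges in Hz (assumed here, not taken from the paper).
# 22 edges give 21 bands, all below the 8 kHz Nyquist limit of a 16 kHz sampling rate.
BARK_EDGES_HZ = [0, 100, 200, 300, 400, 510, 630, 770, 920, 1080, 1270, 1480,
                 1720, 2000, 2320, 2700, 3150, 3700, 4400, 5300, 6400, 7700]


def band_table(edges):
    """Return (center_hz, bandwidth_hz, q) for each adjacent pair of edges."""
    rows = []
    for lo, hi in zip(edges, edges[1:]):
        center = (lo + hi) / 2.0   # arithmetic midpoint, an assumption
        bandwidth = hi - lo
        rows.append((center, bandwidth, center / bandwidth))
    return rows


bands = band_table(BARK_EDGES_HZ)
for i, (fc, bw, q) in enumerate(bands, 1):
    print(f"band {i:2d}: center {fc:7.1f} Hz, bandwidth {bw:6.1f} Hz, Q {q:5.2f}")
```

Under these assumptions the lowest bands share a 100 Hz bandwidth, while in the upper bands the bandwidth grows with the center frequency so that Q varies far less than the bandwidth does. The exact crossover frequency used by the CBT is not stated in the abstract and is not implied by this sketch.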

Highlights

  • Spectral analysis is one of the most fundamental operations in the field of acoustic and speech signal processing

  • We evaluate the efficiency of the architecture by designing and simulating the 21-band critical-band transform (CBT) processor [13], and comparing it against a benchmark 256-point fast Fourier transform (FFT) processor we designed

  • Based on the observation of the critical-band scale depicted in Section 1, a novel critical-band transform algorithm is proposed to approximate the critical-band filtering of the human ear


Summary

INTRODUCTION

Spectral analysis is one of the most fundamental operations in the field of acoustic and speech signal processing. In the past two decades, various schemes to implement critical-band analysis [4,5,6,7,8,9,10] have been proposed for speech applications. These methods can be classified into four main approaches: (i) direct digital implementation of the critical-band filterbank, (ii) the FFT method, (iii) the constant-Q transform (CQT) method, and (iv) the wavelet packet transform (WPT) method. A new approach based on the fast orthogonal WPT (OWPT) was proposed for speech coding, speech enhancement, and speech recognition [9, 10]. This method uses a tree structure to decompose the input speech signal into approximate critical bands.
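As a rough illustration of approach (ii), the sketch below groups the bins of a 256-point spectrum into 21 Bark bands. It is one plausible reading of "FFT method", not the benchmark design from the paper. The band edges are the commonly tabulated Zwicker values, the 160-sample frame is zero-padded to 256 points, and a direct DFT stands in for a real FFT to keep the example dependency-free. All of these choices, along with the helper names `dft_power` and `band_energies`, are assumptions.

```python
import cmath
import math

FS_HZ = 16000   # sampling rate quoted in the abstract
N_FFT = 256     # transform length of the benchmark FFT
# Commonly tabulated Bark band edges in Hz (assumed, not taken from the paper).
BARK_EDGES_HZ = [0, 100, 200, 300, 400, 510, 630, 770, 920, 1080, 1270, 1480,
                 1720, 2000, 2320, 2700, 3150, 3700, 4400, 5300, 6400, 7700]


def dft_power(frame, n_fft):
    """Power |X[k]|^2 for k = 0..n_fft/2 of the zero-padded frame (direct DFT)."""
    x = list(frame) + [0.0] * (n_fft - len(frame))
    power = []
    for k in range(n_fft // 2 + 1):
        acc = sum(x[n] * cmath.exp(-2j * math.pi * k * n / n_fft)
                  for n in range(n_fft))
        power.append(abs(acc) ** 2)
    return power


def band_energies(power, fs_hz, n_fft, edges):
    """Sum the power of every bin whose frequency lies in [lo, hi) of a band."""
    energies = [0.0] * (len(edges) - 1)
    for k, p in enumerate(power):
        f = k * fs_hz / n_fft
        for b, (lo, hi) in enumerate(zip(edges, edges[1:])):
            if lo <= f < hi:
                energies[b] += p
                break   # bins above the last edge are simply dropped
    return energies


# One 160-sample frame (10 ms at 16 kHz) of a 1 kHz test tone.
frame = [math.sin(2 * math.pi * 1000 * n / FS_HZ) for n in range(160)]
energies = band_energies(dft_power(frame, N_FFT), FS_HZ, N_FFT, BARK_EDGES_HZ)
peak_band = max(range(len(energies)), key=energies.__getitem__) + 1
print("band with most energy:", peak_band)
```

With these edges, 1 kHz lies in the 920 to 1080 Hz band, the ninth, so that band should collect most of the tone's energy. The uniform 62.5 Hz bin spacing also shows why this approach fits the scale poorly at low frequencies, where each 100 Hz band receives only one or two bins.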

THE PROPOSED ALGORITHM OF THE CRITICAL-BAND TRANSFORM
SHORT-TIME CRITICAL-BAND ANALYSIS ON SPEECH
THE VLSI ARCHITECTURE OF THE CRITICAL-BAND TRANSFORM
The VLSI architecture of the critical-band transform
Computation complexity and memory access
CIRCUITS SIMULATION RESULTS AND ANALYSIS
CONCLUSIONS