Abstract

Speech segments are broadly categorized as voiced (V), unvoiced (UV), and silence (S). The V/UV/S classification of speech segments plays an important role in many speech-based applications. In this paper, we propose a digital architecture for instantaneous V/UV/S classification of noise-free speech segments. The proposed architecture uses the incoming samples of a speech segment to compute two widely used time-domain speech parameters, namely short-time energy (STE) and short-time average zero-crossing rate (STAZCR). The decision logic compares these computed parameters against pre-determined STE and STAZCR thresholds to classify the speech segments. The hardware needed to compute these parameters on the fly is realized as an algorithmic state machine with datapath (ASMD). The decision logic is realized as a standalone unit, integrated with the ASMD. Further, the proposed architecture can be reconfigured to work with speech segments whose lengths are powers of 2, up to 1024 samples. The proposed architecture is prototyped on a field-programmable gate array (FPGA) using the Xilinx Zedboard Zynq Evaluation and Development Kit XC7Z020CLG484-1. The implementation results show that the proposed architecture uses minimal resources on the FPGA fabric and achieves a maximum operating clock frequency of up to 185 MHz.
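
The classification flow described above can be illustrated with a minimal software sketch in Python. It is not the proposed hardware architecture. The function names, the threshold values, the STAZCR normalization (sign changes divided by segment length), and the exact decision rule are all assumptions made for illustration; the abstract does not specify them.

```python
def short_time_energy(frame):
    """Sum of squared samples over the segment (one common STE definition)."""
    return sum(x * x for x in frame)


def zero_crossing_rate(frame):
    """Sign changes between consecutive samples, divided by segment length.

    Assumption: a sample equal to zero is treated as non-negative.
    """
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / len(frame)


def classify_segment(frame, ste_threshold, zcr_threshold):
    """Return 'V', 'UV', or 'S' for one noise-free speech segment.

    Hypothetical decision rule (not taken from the paper):
      - low energy and low zero-crossing rate -> silence
      - high zero-crossing rate               -> unvoiced
      - otherwise                             -> voiced
    """
    n = len(frame)
    # The abstract states segment lengths are powers of 2, up to 1024.
    if n < 2 or n > 1024 or (n & (n - 1)) != 0:
        raise ValueError("segment length must be a power of 2 in [2, 1024]")

    ste = short_time_energy(frame)
    zcr = zero_crossing_rate(frame)

    if ste < ste_threshold and zcr < zcr_threshold:
        return "S"
    if zcr >= zcr_threshold:
        return "UV"
    return "V"


# Illustrative thresholds and toy segments (assumed values, not measured data).
STE_THRESHOLD = 1000
ZCR_THRESHOLD = 0.3

voiced_like = [100, 120, 90, 110, -100, -120, -90, -110]
unvoiced_like = [5, -5, 5, -5, 5, -5, 5, -5]
silence = [0] * 8

labels = [
    classify_segment(seg, STE_THRESHOLD, ZCR_THRESHOLD)
    for seg in (voiced_like, unvoiced_like, silence)
]
```

In this sketch both parameters are accumulated in a single pass over the samples, which loosely mirrors the on-the-fly computation described above. Unlike this floating-point version, a hardware realization would typically use fixed-point accumulators and comparators.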
