Abstract
Current trends towards on-edge computing on smart portable devices require ultra-low-power circuits capable of performing feature extraction and pattern classification. This manuscript proposes a novel approach to feature extraction for speech recognition and voice activity detection tasks that is suitable for portable devices. Whereas conventional approaches are based on either completely analog or completely digital structures, we propose a “hybrid” approach based on voltage-controlled oscillators. Our proposal uses a bank of band-pass filters implemented with ring oscillators to extract the features (energy within different frequency bands) of input audio signals and digitize them. These data are then fed to a digital classification stage such as a neural network. Ring oscillators are digital in nature, which makes them highly scalable and allows them to be designed with minimum-length devices. Additionally, due to their inherent phase integration, low-frequency band-pass filters can be implemented without large capacitors. Consequently, the approach yields significant savings in power consumption and area. Finally, our proposal may incorporate the analog-to-digital converter into the structure of the feature extractor circuit itself, so that a full conversion of the raw data can be performed when triggered. This is a unique advantage over other approaches. The architecture is described and proposed at system level, along with behavioral simulations to check whether it performs as expected. The structure is then designed in a 65-nm CMOS process to estimate the power consumption and area of a silicon implementation. The results show that our solution is very promising in terms of occupied area, with a power consumption that is competitive with other state-of-the-art solutions.
Highlights
The high computing capability of portable devices has made it possible to implement voice user interfaces on them, such as speech recognition or keyword spotting [1,2]
According to [15], an integrator can be built with a pulse frequency modulator (PFM), composed of a VCO, and an asynchronous digital counter
Using VCO-based analog-to-digital converter (ADC) filters, we propose a hybrid solution between completely analog and completely digital architectures, leading to extraordinary area savings and competitive power consumption
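The idea that a VCO followed by a counter behaves as an integrator can be illustrated with a minimal behavioral sketch. This is not the circuit of [15]: it assumes an ideal VCO whose frequency depends linearly on the input voltage (f = f0 + kvco·v) and an ideal counter that counts completed oscillation cycles. The function name and all numeric values (f0, kvco, dt, the 0.2 V input) are hypothetical and chosen only for illustration.

```python
import math

def vco_counter_integrate(samples, dt, f0, kvco):
    """Behavioral model: ideal VCO plus edge counter (illustrative assumption).

    The VCO frequency is f0 + kvco * v. Its phase is the time integral of
    the frequency, and the counter reports the number of completed cycles.
    """
    phase_cycles = 0.0  # accumulated VCO phase, in cycles
    counts = []
    for v in samples:
        freq = f0 + kvco * v          # instantaneous frequency in Hz
        phase_cycles += freq * dt     # phase integration happens here
        counts.append(math.floor(phase_cycles))  # counter = completed cycles
    return counts

# Hypothetical parameters, not taken from the paper.
dt = 1e-6      # simulation time step in seconds
f0 = 1e6       # VCO rest frequency in Hz
kvco = 5e5     # VCO gain in Hz per volt
n = 10000      # number of steps (10 ms in total)

samples = [0.2] * n                       # constant 0.2 V input
counts = vco_counter_integrate(samples, dt, f0, kvco)

# Removing the rest-frequency ramp and dividing by the VCO gain recovers
# an estimate of the integral of the input voltage (in volt-seconds).
t_end = n * dt
integral_estimate = (counts[-1] - f0 * t_end) / kvco
```

Under these assumptions, the estimate differs from the true integral of the input by at most about one count, that is, roughly 1/kvco volt-seconds, which is the quantization error of the counter.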
Summary
The high computing capability of portable devices has made it possible to implement voice user interfaces on them, such as speech recognition or keyword spotting [1,2]. After feature extraction, a classification stage, such as a feed-forward neural network or a decision tree, decides whether the data correspond to the human voice or not. The use of this architecture in portable devices is restricted by the power consumption of digital circuits, which may need high-capacity batteries [9]. The advantage of the analog approach is the enormous reduction of the power consumption in the feature extractor circuit, making it suitable for ultra-low-power applications [8]. These architectures often make use of large capacitors, because the low-frequency filters they need have long time constants. System complexity and area increase even further. In this manuscript, we propose a “hybrid” approach for voice activity detection (VAD) applications that makes use of voltage-controlled-oscillator-based ADCs (VCO-based ADCs) to perform the feature extraction of audio signals [10].
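As a purely behavioral illustration of the feature extraction described in the abstract (energy within different frequency bands), the following Python sketch runs a signal through a small bank of band-pass filters and returns the mean energy per band. It is not the proposed circuit: ordinary digital biquad filters stand in for the ring-oscillator-based filters, and the sample rate, band centers, quality factor, and test tone are all hypothetical values chosen for illustration.

```python
import math

def bandpass_coeffs(fc, q, fs):
    """Second-order band-pass biquad with unity gain at the center frequency."""
    w0 = 2 * math.pi * fc / fs
    alpha = math.sin(w0) / (2 * q)
    a0 = 1 + alpha
    b = (alpha / a0, 0.0, -alpha / a0)                   # feed-forward taps
    a = (-2 * math.cos(w0) / a0, (1 - alpha) / a0)       # feedback taps
    return b, a

def band_energy(x, fc, q, fs):
    """Mean squared output of one band-pass filter: the feature of one band."""
    (b0, b1, b2), (a1, a2) = bandpass_coeffs(fc, q, fs)
    x1 = x2 = y1 = y2 = 0.0
    energy = 0.0
    for xn in x:
        yn = b0 * xn + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, xn
        y2, y1 = y1, yn
        energy += yn * yn
    return energy / len(x)

def extract_features(x, centers, q, fs):
    """One energy value per band; this vector would feed the classifier."""
    return [band_energy(x, fc, q, fs) for fc in centers]

# Hypothetical setup, not taken from the paper.
fs = 16000                         # sample rate in Hz
centers = [250, 500, 1000, 2000]   # band center frequencies in Hz
tone = [math.sin(2 * math.pi * 1000 * n / fs) for n in range(4000)]

features = extract_features(tone, centers, q=4.0, fs=fs)
```

For a 1 kHz test tone, the band centered at 1 kHz carries the largest energy, and a classification stage would receive the whole feature vector as its input.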