An Energy-Efficient Programmable Manycore Accelerator for Personalized Biomedical Applications

Adwaya Kulkarni, Tinoosh Mohsenin, Houman Homayoun, Nasrin Attaran, Adam Page, Ali Jafari, Maria Malik

doi:10.1109/tvlsi.2017.2754272

Abstract

Wearable personalized health monitoring systems can offer a cost-effective solution for human health care. These systems must constantly monitor patients’ physiological signals and provide highly accurate and quick processing and delivery of the vast amount of data within a limited power and area footprint. These personalized biomedical applications require sampling and processing multiple streams of physiological signals with a varying number of channels and sampling rates. The processing typically consists of feature extraction, data fusion, and classification stages that require a large number of digital signal processing (DSP) and machine learning (ML) kernels. In response to these requirements, this paper proposes a tiny, energy-efficient, domain-specific manycore accelerator, referred to as power-efficient nanoclusters (PENC), to map and execute the kernels of these applications. Simulation results show that the PENC reduces energy consumption by up to 80% for DSP kernels and up to 25% for ML kernels when they are optimally parallelized. In addition, we fully implemented three compute-intensive personalized biomedical applications, namely multichannel seizure detection, multiphysiological stress detection, and a standalone tongue drive system (sTDS), to evaluate the performance of the proposed manycore relative to commodity embedded CPU, graphics processing unit (GPU), and field-programmable gate array (FPGA)-based implementations. For these three case studies, the energy consumption and performance of the proposed PENC manycore, acting as an accelerator with an Intel Atom processor as the host, are compared with existing commercial off-the-shelf general-purpose, customizable, and programmable embedded platforms, including the Intel Atom, the Xilinx Artix-7 FPGA, and the NVIDIA TK1 ARM-A15 and K1 GPU system on a chip.
For these applications, the PENC manycore improves throughput and energy efficiency by up to $1872\times$ and $276\times$, respectively. For seizure detection, the most computationally intensive application, the PENC manycore achieves a throughput of 15.22 giga-operations per second (GOPS), a $14\times$ improvement in throughput over the custom FPGA solution. For stress detection, the PENC achieves a throughput of 21.36 GOPS and an energy efficiency of 4.23 GOP/J, which are $14.87\times$ and $2.28\times$ better than the FPGA implementation, respectively. For the sTDS application, the PENC improves throughput by $5.45\times$ and energy efficiency by $2.37\times$ over the FPGA implementation.
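The improvement factors quoted above can be turned back into approximate baseline figures with simple arithmetic. The following is a minimal sketch, not taken from the paper: the function names (`implied_baseline`, `reduction_to_ratio`) and the `REPORTED` table are hypothetical, and the derived FPGA baselines are only what the abstract's rounded numbers imply, not values the authors report.

```python
# Illustrative arithmetic only. The input numbers come from the abstract;
# every derived value is an inference from rounded figures, not a reported result.

# (PENC value, improvement factor over the FPGA implementation)
REPORTED = {
    "seizure throughput (GOPS)": (15.22, 14.0),
    "stress throughput (GOPS)": (21.36, 14.87),
    "stress energy efficiency (GOP/J)": (4.23, 2.28),
}


def implied_baseline(penc_value: float, improvement: float) -> float:
    """Baseline value implied by 'PENC is `improvement` times better'."""
    if improvement <= 0:
        raise ValueError("improvement factor must be positive")
    return penc_value / improvement


def reduction_to_ratio(reduction: float) -> float:
    """Convert a fractional energy reduction (0.80 for 80%) into a
    baseline-to-PENC energy ratio (0.80 gives about 5x)."""
    if not 0.0 <= reduction < 1.0:
        raise ValueError("reduction must be in [0, 1)")
    return 1.0 / (1.0 - reduction)


if __name__ == "__main__":
    for name, (value, factor) in REPORTED.items():
        baseline = implied_baseline(value, factor)
        print(f"{name}: PENC {value}, implied FPGA baseline ~{baseline:.2f}")
    # "Up to 80%" and "up to 25%" are upper bounds, so these ratios are too.
    print(f"DSP kernels: up to ~{reduction_to_ratio(0.80):.2f}x less energy")
    print(f"ML kernels: up to ~{reduction_to_ratio(0.25):.2f}x less energy")
```

Read this way, an energy reduction of up to 80% corresponds to the baseline using up to about five times the energy of the PENC, while a 25% reduction corresponds to a ratio of about 1.33.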

Journal: IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Publication Date: Jan 1, 2018
Citations: 20
License type: publisher-specific-oa
