High performance hardware architecture for singular spectrum analysis of Hankel tensors

Wei-Pei Huang, Bowen P.Y. Kwan, Weiyang Ding, Biao Min, Ray C.C. Cheung, Liqun Qi, Hong Yan

doi:10.1016/j.micpro.2018.10.004

Abstract

This paper presents a hardware architecture for singular spectrum analysis of Hankel tensors, covering computation of the Tucker decomposition, tensor reconstruction, and final Hankelization. The proposed design explores two levels of optimization. First, at the algorithm level, we optimize the calculation process by exploiting the Hankel property to reduce computational complexity and on-chip BRAM usage. Second, at the hardware level, we exploit parallelism for acceleration and apply resource sharing to reduce look-up table (LUT) usage. For flexibility, the number of processing elements (PEs) can be changed through a parameter setting. The design is implemented on Field-Programmable Gate Arrays (FPGAs) to process third-order tensors. Experimental results show that it achieves a speed-up of 172× to 1004× over a CPU implementation using Intel MKL and 5× to 40× over a GPU implementation.
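To make the Hankel property and the final Hankelization step concrete, here is a minimal software sketch in plain Python. It is not the authors' hardware algorithm. It assumes the common definition of a third-order Hankel tensor, in which each entry depends only on the index sum, H[i][j][k] = h[i + j + k], and it assumes Hankelization averages all entries that share an index sum. The function names `hankel_tensor` and `hankelize` are hypothetical, and the Tucker decomposition and reconstruction stages are omitted.

```python
from itertools import product


def hankel_tensor(signal, I, J):
    """Build a third-order Hankel tensor with H[i][j][k] = signal[i + j + k].

    The third dimension K is chosen so that every sample is used:
    len(signal) = I + J + K - 2.
    """
    K = len(signal) - I - J + 2
    if K < 1:
        raise ValueError("signal too short for the requested I and J")
    return [[[signal[i + j + k] for k in range(K)]
             for j in range(J)]
            for i in range(I)]


def hankelize(T):
    """Average all entries of T that share the same index sum i + j + k.

    Returns the generating vector (length I + J + K - 2) of the
    resulting Hankel tensor rather than the full tensor.
    """
    I, J, K = len(T), len(T[0]), len(T[0][0])
    sums = [0.0] * (I + J + K - 2)
    counts = [0] * (I + J + K - 2)
    for i, j, k in product(range(I), range(J), range(K)):
        sums[i + j + k] += T[i][j][k]
        counts[i + j + k] += 1
    return [s / c for s, c in zip(sums, counts)]


# Illustrative data only (not taken from the paper).
signal = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
H = hankel_tensor(signal, 3, 3)   # 3 x 3 x 3 tensor, 27 entries
recovered = hankelize(H)          # 7 values describe the whole tensor
```

In this example the 27-entry tensor is fully described by its 7-sample generating vector. That redundancy is the kind of structure an implementation could exploit to save storage and arithmetic, although the sketch does not show how the paper's architecture does so.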

Journal: Microprocessors and Microsystems
Publication Date: Oct 10, 2018
Citations: 2
