Hyper-AP: Enhancing Associative Processing Through A Full-Stack Optimization

Yue Zha, Jing Li

doi:10.1109/isca45697.2020.00074

Abstract

Associative processing (AP) is a promising processing-in-memory (PIM) paradigm that overcomes the von Neumann bottleneck (memory wall) by virtue of a radically different execution model. By decomposing arbitrary computations into a sequence of primitive memory operations (i.e., search and write), AP's execution model supports concurrent SIMD computations in situ in the memory array, eliminating the need for data movement. This execution model also provides native support for flexible data types and requires only minimal modifications to existing memory designs (low hardware complexity). Despite these advantages, the execution model of AP has two limitations that substantially increase the execution time: 1) it can search for only a single pattern in one search operation, and 2) it needs to perform a write operation after each search operation. In this paper, we propose the Highly Performant Associative Processor (Hyper-AP) to fully address these limitations. The core of Hyper-AP is an enhanced execution model that reduces the number of search and write operations needed for computations, thereby reducing the execution time. This execution model is generic and improves the performance of both CMOS-based and RRAM-based AP, but it is more beneficial for RRAM-based AP because it substantially reduces the number of write operations. We then provide a complete architecture and micro-architecture with several optimizations to implement Hyper-AP efficiently. To reduce the programming complexity, we also develop a compilation framework so that users can write C-like programs, subject to several constraints, to run applications on Hyper-AP. Several optimizations are applied in the compilation process to exploit the unique properties of Hyper-AP. Our experimental results show that, compared with the recent work IMP, Hyper-AP achieves up to 54×/4.4× better power-/area-efficiency for various representative arithmetic operations. For the evaluated benchmarks, Hyper-AP achieves a 3.3× speedup and a 23.8× energy reduction on average compared with IMP. Our evaluation also confirms that the proposed execution model is more beneficial for RRAM-based AP than for its CMOS-based counterpart.
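
To make the baseline execution model concrete, the sketch below simulates in software how a conventional AP might perform a row-parallel, bit-serial addition using only search and write passes. It is an illustration, not the authors' implementation and not Hyper-AP's enhanced model: the function name `ap_add`, the `ADD_PASSES` table, and the row layout are assumptions introduced here, and the pass table follows the commonly described truth-table approach for in-place addition rather than anything stated in the abstract.

```python
# Hypothetical software model of a conventional (baseline) AP.
# Each pass is: (search pattern for carry, A_i, B_i) -> (values written to carry, B_i).
# Only truth-table entries whose outputs differ from their inputs need a pass.
# Order matters: a row rewritten by one pass must not match a later pattern.
ADD_PASSES = [
    ((0, 1, 1), (1, 0)),
    ((0, 1, 0), (0, 1)),
    ((1, 0, 0), (0, 1)),
    ((1, 0, 1), (1, 0)),
]


def ap_add(a_vals, b_vals, width):
    """Compute B <- A + B (mod 2**width) in every row; return (sums, searches, writes)."""
    # Each row stores A and B as little-endian bit lists plus a carry bit.
    rows = [
        {
            "a": [(a >> i) & 1 for i in range(width)],
            "b": [(b >> i) & 1 for i in range(width)],
            "c": 0,
        }
        for a, b in zip(a_vals, b_vals)
    ]
    searches = writes = 0
    for i in range(width):  # bit-serial over positions, parallel over rows
        for (c, a, b), (new_c, new_b) in ADD_PASSES:
            # Search: compare a single pattern against all rows and tag the matches.
            tags = [r["c"] == c and r["a"][i] == a and r["b"][i] == b for r in rows]
            searches += 1
            # Write: update the tagged rows; one write follows every search.
            for r, tagged in zip(rows, tags):
                if tagged:
                    r["c"], r["b"][i] = new_c, new_b
            writes += 1
    sums = [sum(bit << i for i, bit in enumerate(r["b"])) for r in rows]
    return sums, searches, writes


if __name__ == "__main__":
    print(ap_add([3, 5, 7, 15], [1, 2, 9, 1], width=4))
```

In this model a 4-bit addition costs 16 searches and 16 writes regardless of how many rows are processed. That cost mirrors the two limitations named in the abstract: each search handles one pattern, and each search is followed by a write.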
