A learnable parallel processing architecture towards unity of memory and computing.

H Li, Y Zhao, B Gao, P Huang, J Kang, X Liu, L Liu, H Ye, Z Chen

doi:10.1038/srep13330

Open Access

https://doi.org/10.1038/srep13330

Abstract

Developing energy-efficient parallel information processing systems beyond the von Neumann architecture is a long-standing goal of modern information technologies. The widely used von Neumann computer architecture separates memory and computing units, which leads to energy-hungry data movement during operation. To meet the need for efficient information processing in data-driven applications such as big data and the Internet of Things, an energy-efficient processing architecture beyond von Neumann is critical for the information society. Here we show a non-von Neumann architecture built of resistive switching (RS) devices, named “iMemComp”, in which memory and logic are unified with single-type devices. Leveraging the nonvolatile nature and structural parallelism of crossbar RS arrays, we have equipped “iMemComp” with the capabilities of computing in parallel and learning user-defined logic functions for large-scale information processing tasks. Such an architecture eliminates the energy-hungry data movement in von Neumann computers. Compared with contemporary silicon technology, adder circuits based on “iMemComp” can improve the speed by 76.8% and the power dissipation by 60.3%, together with a 700-fold reduction in circuit area.

Highlights

(CMOS) transistors, “iMemComp” is built upon resistive switching (RS) devices, a kind of sandwich-like emerging device whose resistance can be modulated by applying an external voltage (Supplementary Fig. S1)[8,9,10,11].
In a modern central processing unit (CPU) where CMOS transistors serve as digital switches for computing, information flows in a volatile manner since the voltage at nodes cannot be kept without a global voltage supply.
Owing to the unique features introduced by RS devices into this architecture, large-scale computing tasks are no longer conducted by CMOS processors with a large amount of energy-consuming repetition.
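
The contrast drawn above between nonvolatile RS devices and volatile CMOS nodes can be illustrated with a toy model. The sketch below is not taken from the paper: the class name `ToyRSDevice`, the threshold voltages, and the two resistance values are hypothetical placeholders, chosen only to show a resistance state that changes under a sufficiently large applied voltage and is retained when no voltage is applied.

```python
class ToyRSDevice:
    """Toy two-state resistive switch (illustrative assumption, not the paper's device model)."""

    def __init__(self, r_low=1e3, r_high=1e6, v_set=1.0, v_reset=-1.0):
        # All four default values are hypothetical placeholders.
        self.r_low = r_low
        self.r_high = r_high
        self.v_set = v_set
        self.v_reset = v_reset
        self.resistance = r_high  # start in the high-resistance state

    def apply_voltage(self, volts):
        """Switch state only when the applied voltage crosses a threshold."""
        if volts >= self.v_set:
            self.resistance = self.r_low
        elif volts <= self.v_reset:
            self.resistance = self.r_high
        # Sub-threshold voltages (including 0 V, i.e. supply removed)
        # leave the stored state unchanged.
        return self.resistance

    def read_bit(self):
        """Interpret low resistance as logic 1 and high resistance as logic 0."""
        return 1 if self.resistance == self.r_low else 0


device = ToyRSDevice()
device.apply_voltage(1.2)  # write: exceeds the assumed set threshold
device.apply_voltage(0.0)  # supply removed: the state is retained
stored_bit = device.read_bit()
```

In this simplified picture the same element both stores a bit and can be re-programmed by voltage, which is the property the highlights attribute to RS devices; real devices show analog and stochastic behaviour that the toy model ignores.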

Summary

Discussion

From the perspective of power dissipation, today’s CPU performance is on the order of 100 Giga-operands/sec, and a 30-fold increase over 10 years would boost this performance to about 3 Tera-operands/sec (100 Giga-operands/sec × 30). RS-based 32-bit adders can achieve a 60.3% reduction in average power dissipation per cycle (Fig. 5a) compared with CMOS circuits after 10^5 cycles, which is a small amount of computation[4]. Theoretical analysis indicates the possibility of femtosecond-level computing speed under a higher degree of parallelism (Fig. 5b), which may require robust design of large-scale crossbar RS arrays. From the perspective of circuit area, the highly compact crossbar structure of “iMemComp” systems eliminates the complex routing and layout that are necessary for CMOS-based logic circuits, and is able to achieve the smallest possible cell area (4F²/cell), where F is the minimum feature size allowed by lithography. The experimentally demonstrated nonvolatile logic and memory features, together with superior performance in power, speed and area, have proven the feasibility of high-density, massively parallel, ultra-low-power information processing systems with memory and logic unified by single-type devices. The important lesson we have learned from this research is that one should explore the use of novel device properties in architectural innovations and fully exploit the computational potential of emerging technologies to meet the increasing demand of our information society.
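
The figures quoted in the discussion can be checked with a few lines of arithmetic. The following is a minimal sketch, not code from the paper: the function names are hypothetical, and the 32 × 32 array size and F = 20 nm feature size are assumed values used only to show how the 4F² cell area scales to a whole crossbar.

```python
def projected_throughput(ops_per_sec, growth_factor):
    """Scale a throughput figure by a multiplicative growth factor."""
    return ops_per_sec * growth_factor


def remaining_power_fraction(reduction_percent):
    """Fraction of the baseline power left after a given percentage reduction."""
    return 1.0 - reduction_percent / 100.0


def crossbar_area(rows, cols, feature_size):
    """Area of a rows x cols crossbar with one 4F^2 cell per crosspoint.

    The result is in squared units of feature_size (nm^2 if F is in nm).
    """
    return 4 * feature_size ** 2 * rows * cols


GIGA = 10 ** 9

# 100 Giga-operands/sec with a 30-fold increase (figures from the discussion).
future_ops = projected_throughput(100 * GIGA, 30)

# A 60.3% reduction leaves 39.7% of the CMOS power per cycle.
rs_power_fraction = remaining_power_fraction(60.3)

# Hypothetical example: 32 x 32 array at an assumed F = 20 nm.
area_nm2 = crossbar_area(32, 32, 20)
```

With these inputs the projected throughput is 3 × 10^12 operands/sec, the RS-based adder dissipates roughly 0.397 of the CMOS average power per cycle, and the assumed array occupies 1,638,400 nm². Peripheral circuitry is ignored in the area estimate.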

Journal: Scientific Reports
Publication Date: Aug 14, 2015
Citations: 74
License type: CC BY 4.0
