Abstract
Developing energy-efficient parallel information processing systems beyond the von Neumann architecture is a long-standing goal of modern information technologies. The widely used von Neumann computer architecture separates memory and computing units, which leads to energy-hungry data movement during operation. To meet the need for efficient information processing in data-driven applications such as big data and the Internet of Things, an energy-efficient processing architecture beyond von Neumann is critical for the information society. Here we show a non-von Neumann architecture built of resistive switching (RS) devices, named “iMemComp”, in which memory and logic are unified with single-type devices. Leveraging the nonvolatile nature and structural parallelism of crossbar RS arrays, we have equipped “iMemComp” with the capabilities of computing in parallel and learning user-defined logic functions for large-scale information processing tasks. Such an architecture eliminates the energy-hungry data movement of von Neumann computers. Compared with contemporary silicon technology, adder circuits based on “iMemComp” can improve speed by 76.8% and power dissipation by 60.3%, together with a 700-fold reduction in circuit area.
Highlights
(CMOS) transistors, “iMemComp” is built upon resistive switching (RS) devices, a kind of sandwich-like emerging device whose resistance can be modulated via applying external voltage (Supplementary Fig. S1)[8,9,10,11]
In a modern central processing unit (CPU), where CMOS transistors serve as digital switches for computing, information flows in a volatile manner, since the voltage at the nodes cannot be maintained without a global voltage supply
Owing to the unique features introduced by RS devices into this architecture, large-scale computing tasks are no longer conducted by CMOS processors with a large amount of energy-consuming repetition
Summary
From the perspective of power dissipation, today’s CPU performance is on the order of 100 Giga-operands/sec, and a 30-fold increase over 10 years would boost this performance to 3. RS-based 32-bit adders can achieve a 60.3% reduction in average power dissipation per cycle (Fig. 5a) compared with CMOS circuits after 10⁵ cycles, which is a small amount of computation[4]. Theoretical analysis indicates the possibility of femtosecond-level computing speed under a higher degree of parallelism (Fig. 5b), which may require robust design of large-scale crossbar RS arrays. From the perspective of circuit area, the highly compact crossbar structure of “iMemComp” systems eliminates the complex routing and layout that are necessary for CMOS-based logic circuits, and is able to achieve the smallest possible cell area (4F²/cell), where F is the minimum feature size allowed by lithography. The experimentally demonstrated nonvolatile logic and memory features, together with superior performance in power, speed and area, have proven the feasibility of high-density, massively parallel, ultra-low-power information processing systems with memory and logic unified by single-type devices. The important lesson we have learned from this research is that one should explore the use of novel device properties in architectural innovations and fully exploit the computational potential of emerging technologies to meet the increasing demands of our information society.
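To make the 4F²/cell figure concrete, the following minimal sketch estimates the ideal footprint of a crossbar array from that cell area alone. The function name, the 1024 × 1024 array size and the 10 nm feature size are illustrative assumptions, not values from the paper, and peripheral circuitry (drivers, sense amplifiers, routing to the array) is ignored.

```python
def crossbar_area_nm2(rows: int, cols: int, feature_size_nm: float) -> float:
    """Ideal crossbar footprint in nm^2, assuming every cell occupies 4 F^2.

    Hypothetical helper for illustration only; it models the array core
    and nothing else.
    """
    if rows <= 0 or cols <= 0 or feature_size_nm <= 0:
        raise ValueError("rows, cols and feature_size_nm must be positive")
    cell_area_nm2 = 4.0 * feature_size_nm ** 2  # 4 F^2 per cell
    return rows * cols * cell_area_nm2


# Assumed example: a 1024 x 1024 array at F = 10 nm.
area_nm2 = crossbar_area_nm2(1024, 1024, 10.0)
area_um2 = area_nm2 / 1e6  # 1 um^2 = 1e6 nm^2
print(f"{area_um2:.1f} um^2")
```

Because the estimate scales with F², halving the feature size cuts the ideal array area to a quarter under these assumptions.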