Abstract

The key challenges of manycore systems are the large amount of memory and the high bandwidth required to run many applications. Three-dimensional integrated on-chip memory is a promising candidate for addressing these challenges. The advent of on-chip memory has provided new opportunities to rethink traditional memory hierarchies and their management. In this study, we propose polymorphic memory, a hybrid approach to using on-chip memory. In contrast to previous studies, we use the on-chip memory both as main memory (called M1 memory) and as a Dynamic Random Access Memory (DRAM) cache (called M2 cache). The main memory consists of M1 memory and a conventional DRAM memory called M2 memory. To achieve high performance when running many applications on this memory architecture, we propose management techniques for the main memory composed of M1 and M2 memories and for polymorphic memory with dynamic memory allocation for many applications in a manycore system. The first technique moves frequently accessed pages to M1 memory via hardware monitoring in the memory controller. The second partitions M1 memory to mitigate contention among processes. Finally, we propose a method to use the M2 cache between a conventional last-level cache and M2 memory, and we determine the cache size that best improves the performance of polymorphic memory. The proposed schemes are evaluated with the SPEC CPU2006 benchmark, and the experimental results show that they improve performance under various workloads. The evaluation confirms that polymorphic memory achieves an average performance improvement of 21.7%, with a standard deviation of 0.026 for the normalized results, compared to the previous method of using on-chip memory as a last-level cache.
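The partitioning and cache-sizing decisions described above can be pictured with a small sketch. The C++ fragment below is a minimal illustration under assumed names and an assumed proportional-share policy (the paper does not specify this exact algorithm): it divides a fixed M1 page budget among processes according to their recent M2 traffic and reserves a configurable fraction of M1 as the M2 cache.

```cpp
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical sketch: split a fixed M1 budget between per-process flat
// partitions and an M2 cache. Names, the cache fraction, and the
// proportional policy are assumptions for illustration only.
struct ProcessStats {
    int      pid;
    uint64_t m2Accesses;   // accesses that missed M1 in the last epoch
};

struct M1Plan {
    std::map<int, uint64_t> flatPages;  // pid -> M1 flat-memory pages
    uint64_t                cachePages; // M1 pages reserved as M2 cache
};

// Give each process M1 flat pages in proportion to its M2 traffic,
// while keeping a fixed fraction of M1 as the DRAM (M2) cache.
M1Plan planM1(const std::vector<ProcessStats>& procs,
              uint64_t m1TotalPages, double cacheFraction) {
    M1Plan plan;
    plan.cachePages = static_cast<uint64_t>(m1TotalPages * cacheFraction);
    const uint64_t flatBudget = m1TotalPages - plan.cachePages;

    uint64_t totalTraffic = 0;
    for (const auto& p : procs) totalTraffic += p.m2Accesses;

    for (const auto& p : procs) {
        uint64_t share = (totalTraffic == 0)
            ? flatBudget / procs.size()                 // no traffic: even split
            : flatBudget * p.m2Accesses / totalTraffic; // proportional share
        plan.flatPages[p.pid] = share;
    }
    return plan;
}
```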

Highlights

  • Owing to the recent emergence of manycore systems in computing architecture, it is possible to simultaneously run many applications such as rich multimedia and scientific calculations [1,2,3,4]. Manycore systems are specialized multi-core processor-based systems designed for highly parallel processing of data-intensive applications

  • To achieve high performance when running numerous applications on a manycore system with a hybrid memory architecture, we designed a hybrid memory management scheme called polymorphic memory, in which M1 is dynamically allocated according to the state of the applications running in the manycore system, and the allocated M1 is further divided into a Dynamic Random Access Memory (DRAM) cache and a flat address region

  • The Pintool can be used for profiling and performance evaluation of architecture-specific details, so it is suitable for testing the dynamic cache and memory system with 3D-stacked DRAM (see the sketch after this list)
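The sketch below shows the kind of Pin tool that could collect the per-page access counts such an evaluation relies on. It is a minimal, assumed example (4 KiB pages, a simple counter map, a plain-text output file, and single-threaded workloads), not the authors' actual instrumentation.

```cpp
// Minimal Pin tool sketch: count memory accesses per 4 KiB page.
// Page size, counter map, and output format are illustrative assumptions.
#include "pin.H"
#include <fstream>
#include <map>

static std::map<ADDRINT, UINT64> pageCounts;  // page number -> access count

static VOID RecordAccess(ADDRINT addr) {
    pageCounts[addr >> 12]++;  // assume 4 KiB pages; no lock: single-threaded
}

static VOID Instruction(INS ins, VOID* /*v*/) {
    // Instrument every memory operand of every instruction.
    const UINT32 memOps = INS_MemoryOperandCount(ins);
    for (UINT32 i = 0; i < memOps; i++) {
        INS_InsertPredicatedCall(ins, IPOINT_BEFORE, (AFUNPTR)RecordAccess,
                                 IARG_MEMORYOP_EA, i, IARG_END);
    }
}

static VOID Fini(INT32 /*code*/, VOID* /*v*/) {
    std::ofstream out("page_counts.out");
    for (const auto& kv : pageCounts)
        out << std::hex << kv.first << " " << std::dec << kv.second << "\n";
}

int main(int argc, char* argv[]) {
    if (PIN_Init(argc, argv)) return 1;          // parse Pin command line
    INS_AddInstrumentFunction(Instruction, 0);   // register instrumentation
    PIN_AddFiniFunction(Fini, 0);                // dump counts at exit
    PIN_StartProgram();                          // never returns
    return 0;
}
```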

Summary

Introduction

Owing to the recent emergence of manycore systems in computing architecture, it is possible to simultaneously run many applications such as rich multimedia and scientific calculations [1,2,3,4]. Few existing studies have used M1 memory as a hybrid memory with both a cache and a flat address region, and few have investigated dynamic M1 allocation for manycore systems. To achieve high performance when running numerous applications on a manycore system with a hybrid memory architecture, we designed a hybrid memory management scheme called polymorphic memory, in which M1 is dynamically allocated according to the state of the applications running in the manycore system, and the allocated M1 is further divided into the DRAM cache and a flat address region. The first scheme is page migration between M1 and M2 within the flat address region of memory: frequently accessed data are placed in high-bandwidth M1 via hardware monitoring in the memory controller to enhance memory throughput.
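As a rough illustration of this first scheme, the following sketch models an epoch-based monitor of the sort that could sit alongside the memory controller: it counts accesses to M2 pages and, at each epoch boundary, nominates pages whose counts exceed a threshold for migration to M1. The epoch length, threshold, and class interface are assumptions for illustration, not the paper's hardware design.

```cpp
#include <cstdint>
#include <unordered_map>
#include <vector>

// Epoch-based hot-page monitor sketch. The threshold, the epoch length
// (measured in M2 accesses), and the interface are assumptions.
class MigrationMonitor {
public:
    MigrationMonitor(uint64_t hotThreshold, uint64_t epochAccesses)
        : hotThreshold_(hotThreshold), epochAccesses_(epochAccesses) {}

    // Called by the (modeled) memory controller on every access to M2.
    // Returns the pages selected for migration when an epoch ends.
    std::vector<uint64_t> onM2Access(uint64_t pageNumber) {
        counts_[pageNumber]++;
        if (++accessesInEpoch_ < epochAccesses_) return {};

        // Epoch boundary: pick pages whose counts crossed the threshold.
        std::vector<uint64_t> hotPages;
        for (const auto& kv : counts_)
            if (kv.second >= hotThreshold_) hotPages.push_back(kv.first);

        counts_.clear();          // restart counting for the next epoch
        accessesInEpoch_ = 0;
        return hotPages;          // caller migrates these pages M2 -> M1
    }

private:
    uint64_t hotThreshold_;
    uint64_t epochAccesses_;
    uint64_t accessesInEpoch_ = 0;
    std::unordered_map<uint64_t, uint64_t> counts_;
};
```

In a full model, the caller would copy the returned pages into M1 and update the address mapping; that remapping work is the kind of cost a migration-overhead analysis, such as the one in the section below, has to account for.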

Related Works
Polymorphic Memory System
Memory Management with Monitor
Page Migration between M1 and M2
M2 Cache Management Using Part of M1
Polymorphic Memory of M1 as Both Part of Memory and M2 Cache
Multi-Process Support Management
Simulation Environment
Analysis of Migration Overhead
Performance with a Single Workload
Performance with Multiple Workloads
Processes
Summary and Discussion
Conclusions
