Abstract

Shared memory architectures are pervasive in the multicore era. Nevertheless, most of the data accessed by sequential and parallel applications in a multicore system is private, that is, accessed by a single core. Recent proposals exploit this observation, using a private/shared classification of memory data to reduce the coherence directory area or the memory access latency. The effectiveness of these proposals depends on the accuracy of the classification. Existing proposals perform the private/shared classification at page granularity, which leads to misclassification and reduces the number of detected private memory blocks. We propose a mechanism that accurately classifies memory blocks using the existing translation lookaside buffers (TLBs), increasing the effectiveness of proposals that rely on a private/shared classification. Our experimental results show that the proposed scheme reduces L1 cache misses by 25% compared to a page-grain classification approach, which translates into an 8.0% improvement in system performance.
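
The following is a minimal, self-contained sketch of why classification granularity matters; it is not the paper's mechanism. It tracks, on a synthetic access trace, which regions are touched by more than one core, once at page granularity and once at block granularity. The trace, region sizes, and helper names (`Access`, `countPrivate`) are assumptions chosen for illustration.

```cpp
// Illustrative sketch: page-grain vs block-grain private/shared classification.
// A region is "private" if only one core has accessed it during the trace.
#include <cstdint>
#include <iostream>
#include <unordered_map>
#include <unordered_set>
#include <vector>

struct Access { int core; uint64_t addr; };

constexpr uint64_t kBlockBits = 6;   // 64-byte blocks (assumed)
constexpr uint64_t kPageBits  = 12;  // 4-KiB pages (assumed)

// Counts regions still classified as private after replaying the trace,
// where a region is the address shifted right by `granularityBits`.
int countPrivate(const std::vector<Access>& trace, uint64_t granularityBits) {
    std::unordered_map<uint64_t, int> owner;    // region -> first accessing core
    std::unordered_set<uint64_t> shared;        // regions seen by more than one core
    for (const auto& a : trace) {
        uint64_t region = a.addr >> granularityBits;
        auto it = owner.find(region);
        if (it == owner.end())
            owner[region] = a.core;
        else if (it->second != a.core)
            shared.insert(region);
    }
    int priv = 0;
    for (const auto& kv : owner)
        if (shared.count(kv.first) == 0)
            ++priv;
    return priv;
}

int main() {
    // Two cores touch different 64-byte blocks of the same 4-KiB page;
    // core 0 also touches two blocks of a second page on its own.
    std::vector<Access> trace = {
        {0, 0x1000}, {1, 0x1040}, {0, 0x2000}, {0, 0x2040},
    };
    // Page grain: page 0x1 looks shared, so its blocks lose private status.
    std::cout << "private pages:  " << countPrivate(trace, kPageBits)  << '\n'; // 1 of 2
    // Block grain: every block has a single accessor, so all stay private.
    std::cout << "private blocks: " << countPrivate(trace, kBlockBits) << '\n'; // 4 of 4
}
```

In this toy trace, page-grain classification reports only one of two pages as private (so half of the blocks are treated as shared), while block-grain classification keeps all four blocks private, which is the kind of gap the abstract's TLB-based block classification is meant to close.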
