Architecture Aware Programming on Multi-Core Systems

M R, S.R Sathe

doi:10.14569/ijacsa.2011.020615

Abstract

To improve processor performance, the industry has responded by increasing the number of cores on the die. One salient feature of multi-core architectures is that caches are shared to varying degrees at different levels. With the advent of multi-core architectures, we face a problem that is new to parallel computing: the management of hierarchical caches. Data locality must be considered in order to reduce the variance in performance across data sizes. In this paper, we propose a programming approach for algorithms running on shared-memory multi-core systems that couples blocking, a well-known optimization technique, with the OpenMP parallel programming paradigm. We chose the problem sizes based on architectural parameters of the system such as cache level, cache size and cache line size. We studied the cache optimization scheme on commonly used linear algebra applications: matrix multiplication (MM), Gauss-Elimination (GE) and LU Decomposition (LUD).
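
As an illustration of the blocking idea described above, the following C sketch multiplies two square matrices tile by tile and distributes the row blocks among OpenMP threads. It is a minimal sketch under stated assumptions, not the implementation from the paper: the function names `pick_block_size` and `blocked_matmul`, the row-major layout, and the heuristic that three tiles of doubles should fit in the chosen cache level are all assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from the paper): returns the largest block edge
 * that is a multiple of the number of doubles per cache line and for which
 * three bs x bs tiles of doubles fit in cache_bytes. If even the smallest
 * candidate does not fit, one cache line's worth of doubles is returned. */
static size_t pick_block_size(size_t cache_bytes, size_t line_bytes)
{
    size_t per_line = line_bytes / sizeof(double);
    assert(per_line > 0);
    size_t bs = per_line;
    while (3 * (bs + per_line) * (bs + per_line) * sizeof(double) <= cache_bytes)
        bs += per_line;
    return bs;
}

/* Blocked multiplication C += A * B for row-major n x n matrices.
 * The caller is expected to zero C first. Each thread owns whole row
 * blocks of C, so no two threads write to the same element. */
static void blocked_matmul(size_t n, size_t bs,
                           const double *a, const double *b, double *c)
{
    assert(bs > 0);
    #pragma omp parallel for schedule(static)
    for (size_t ii = 0; ii < n; ii += bs) {
        for (size_t kk = 0; kk < n; kk += bs) {
            for (size_t jj = 0; jj < n; jj += bs) {
                /* Clamp the tile so that n need not be a multiple of bs. */
                size_t i_end = ii + bs < n ? ii + bs : n;
                size_t k_end = kk + bs < n ? kk + bs : n;
                size_t j_end = jj + bs < n ? jj + bs : n;
                for (size_t i = ii; i < i_end; i++) {
                    for (size_t k = kk; k < k_end; k++) {
                        double aik = a[i * n + k];
                        /* Innermost loop walks rows of B and C contiguously. */
                        for (size_t j = jj; j < j_end; j++)
                            c[i * n + j] += aik * b[k * n + j];
                    }
                }
            }
        }
    }
}
```

A caller would typically compute `bs` once from the size and line size of the targeted cache level and then pass it to `blocked_matmul`. Compiled without OpenMP support, the pragma is ignored and the code runs serially with the same result.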

Highlights

While microprocessor technology has delivered significant improvements in clock speed over the past decade, it has exposed a variety of other performance bottlenecks.
We present the parallelization of matrix multiplication (MM), GE and LU Decomposition (LUD) algorithms on shared-memory systems using OpenMP.
For the GE and LUD problems, we used 1D partitioning of the matrix among the cores and used the OpenMP paradigm to distribute the work among a number of threads to be executed on the various cores.
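
The 1D partitioning mentioned in the last highlight can be sketched in C as follows. This is a hedged illustration rather than the authors' code: it assumes a row-wise 1D partition, a row-major matrix, and no pivoting, and the function name `ge_forward` is hypothetical. For each pivot row, the rows below it are independent of one another, so they can be divided among OpenMP threads.

```c
#include <assert.h>
#include <stddef.h>

/* Forward elimination of a row-major n x n matrix to upper-triangular form.
 * Assumptions for this sketch: no pivoting (every pivot must be nonzero)
 * and a row-wise 1D partition of the remaining rows among the threads. */
static void ge_forward(size_t n, double *a)
{
    for (size_t k = 0; k + 1 < n; k++) {
        double pivot = a[k * n + k];
        assert(pivot != 0.0);
        /* Rows k+1 .. n-1 only read row k and write their own row,
         * so a static split of the rows is free of data races. */
        #pragma omp parallel for schedule(static)
        for (size_t i = k + 1; i < n; i++) {
            double factor = a[i * n + k] / pivot;
            a[i * n + k] = 0.0;
            for (size_t j = k + 1; j < n; j++)
                a[i * n + j] -= factor * a[k * n + j];
        }
    }
}
```

The outer loop over pivots stays sequential because each elimination step depends on the previous one; only the update of the trailing rows is shared among the threads.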

Summary

INTRODUCTION

While microprocessor technology has delivered significant improvements in clock speed over the past decade, it has exposed a variety of other performance bottlenecks. To alleviate these bottlenecks, microprocessor designers have explored alternate routes to cost-effective performance gains. An important feature of these new architectures is the integration of a large number of simple cores with a software-managed cache hierarchy and local storage. Offering these new architectures as general-purpose computation platforms creates a number of problems, the most obvious one being programmability. It is essential that algorithms be designed to maximize data locality so as to best exploit the hierarchical cache structures.

COMPUTING PROBLEM

RELATED WORK

IMPLEMENTATION

Architecture Aware Parallelization

Determining Block Size

Effect of Cache Line Size

LU Decomposition

EXPERIMENTAL SETUP & RESULTS

PERFORMANCE ANALYSIS

CONCLUSION & FUTURE WORK

Journal: International Journal of Advanced Computer Science and Applications
Publication Date: Jan 1, 2011
Citations: 9
License type: cc-by

Similar Papers

Improved Data Locality Using Morton-order Curve on the Example of LU Decomposition
Martin Perdacher ... Claudia Plant
10 Dec 2020

Cache Oblivious Matrix Operations Using Peano Curves
Michael Bader ... Christian Mayer
01 Jan 2007

A theoretical framework for memory-adaptive algorithms
R.D Barve ... J.S Vitter
17 Oct 1999

Hardware-Oriented Implementation of Cache Oblivious Matrix Operations Based on Space-Filling Curves
Michael Bader ... Stephan Günther
09 Sep 2007
