Abstract

A comprehensive analysis of the algorithmic properties of the well-known Cholesky decomposition was performed using the multifold AlgoWiki technologies. We carried out a detailed analysis of the information graph, data structure, memory access profile, computation locality, scalability, and other properties of the algorithm, which allowed us to demonstrate many non-obvious properties, split into machine-independent and machine-dependent subsets. An understanding of the parallel algorithm structure makes it possible to implement the algorithm efficiently on a specified hardware platform.

Highlights

  • The fundamental problem of high-performance computing is the accurate coordination of algorithm and program structure with hardware features, which is what yields high efficiency

  • This paper presents a comprehensive analysis of the Cholesky decomposition algorithm

  • Using the Cholesky decomposition algorithm as the example, we present a mathematical description of the algorithm, an analysis of its sequential and parallel complexity, its information graph, the properties of software implementations, an analysis of data locality and scalability, and the dynamic characteristics and performance efficiency of the algorithm implementation


Summary

Introduction

The fundamental problem of high-performance computing is the accurate coordination of algorithm and program structure with hardware features, which is what yields high efficiency. All fundamental algorithmic properties that determine implementation efficiency on modern computing platforms are divided into machine-dependent and machine-independent subsets. This division is made intentionally: it separates the features of an algorithm that define its prospective implementations on parallel computing systems from the questions that arise at the subsequent stages of programming and of executing the resulting programs on particular computing systems. To demonstrate the analysis of the parallel processing model, we have chosen one of the most popular algorithms, the Cholesky decomposition. This algorithm is widely used for the direct solution of dense and sparse linear systems. On the basis of the detailed analysis of the algorithm, we draw conclusions about its most efficient implementation given the capabilities of the supercomputer used.
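For orientation, the decomposition itself can be sketched in a few lines. The code below is a minimal sequential illustration, assuming a dense, real, symmetric positive definite matrix stored as a list of lists; it is not taken from the paper, and the function name `cholesky` is chosen here only for the example. It computes a lower triangular factor L such that A = L Lᵀ, one column at a time.

```python
import math


def cholesky(a):
    """Return the lower triangular factor L with a = L * L^T.

    Illustrative sketch only: `a` is assumed to be a dense, real,
    symmetric positive definite matrix given as a list of lists.
    """
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for j in range(n):
        # Diagonal entry: subtract the squares already placed in row j.
        s = a[j][j] - sum(l[j][k] ** 2 for k in range(j))
        if s <= 0.0:
            raise ValueError("matrix is not positive definite")
        l[j][j] = math.sqrt(s)
        # Entries below the diagonal in column j. Each of them depends
        # only on earlier columns, so they are independent of one another.
        for i in range(j + 1, n):
            dot = sum(l[i][k] * l[j][k] for k in range(j))
            l[i][j] = (a[i][j] - dot) / l[j][j]
    return l
```

In this sketch the entries below the diagonal within one column do not depend on each other, which is one simple source of the parallelism that the sections listed below analyze in detail.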

General description
Mathematical description and information graph
Parallelization resources and other properties of the algorithm
Sequential implementation of the algorithm
Structure of memory access and locality estimation
Approaches and features of parallel implementations
Scalability and dynamic characteristics of the algorithm implementation
Computer architectures and existing implementations
Conclusion
AlgoWiki