Abstract

A number of highly-threaded, many-core architectures hide memory-access latency by low-overhead context switching among a large number of threads. The speedup of a program on these machines depends on how well that latency is hidden. If the number of threads were infinite, these machines could, in theory, provide the performance predicted by a PRAM analysis of the program. However, the number of threads per processor is not infinite; it is constrained by both hardware and algorithmic limits. In this paper, we introduce the Threaded Many-core Memory (TMM) model, which captures the important characteristics of these highly-threaded, many-core machines. Because the model incorporates key machine parameters, we expect analysis under it to yield a finer-grained and more accurate performance prediction than PRAM analysis. We analyze four algorithms for the classic all-pairs shortest paths problem under this model and find that even when two algorithms have the same PRAM performance, our model predicts different performance for some settings of the machine parameters. For example, on dense graphs, the dynamic programming algorithm and Johnson's algorithm have the same performance in the PRAM model, yet our model predicts different performance once the memory-access latency is large enough, validating the intuition that the dynamic programming algorithm performs better on these machines. We validate several predictions made by the model using empirical measurements on one instantiation of a highly-threaded, many-core machine, the NVIDIA GTX 480.
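For reference, the dynamic programming algorithm for all-pairs shortest paths mentioned above is the classic Floyd-Warshall recurrence. The sketch below is a minimal sequential version in Python; the function name and adjacency-matrix representation are illustrative only and not taken from the paper, which analyzes GPU implementations:

```python
import math

def floyd_warshall(dist):
    """In-place Floyd-Warshall all-pairs shortest paths.

    dist: n x n matrix where dist[i][j] is the weight of edge (i, j),
    math.inf if the edge is absent, and 0 on the diagonal.
    """
    n = len(dist)
    for k in range(n):          # intermediate vertex considered this round
        for i in range(n):      # source vertex
            for j in range(n):  # destination vertex
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist

# Example: a 3-vertex directed graph.
INF = math.inf
d = [[0,   4, INF],
     [INF, 0, 1],
     [2, INF, 0]]
floyd_warshall(d)
# d[0][2] is now 5 (path 0 -> 1 -> 2)
```

The triply nested loop makes the Theta(n^3) work of the dynamic programming approach explicit; its regular, dense memory-access pattern is one intuitive reason it maps well onto latency-hiding many-core hardware.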

Highlights

  • A Memory Access Model for Highly-threaded Many-core Architectures: many-core architectures are excellent at hiding memory-access latency by low-overhead context switching among a large number of threads

  • Highly-threaded, many-core devices such as GPUs have gained popularity in the last decade; both NVIDIA and AMD manufacture general-purpose GPUs that fall into this category


Summary

A Memory Access Model for Highly-threaded Many-core Architectures

Many-core architectures are excellent at hiding memory-access latency by low-overhead context switching among a large number of threads. If the number of threads were infinite, these machines could theoretically provide the performance predicted by a PRAM analysis of the program. The number of allowable threads per processor, however, is not infinite. We introduce the Threaded Many-core Memory (TMM) model, which captures the important characteristics of these highly-threaded, many-core machines.

Recommended Citation: Ma, Lin; Agrawal, Kunal; and Chamberlain, Roger D., "A Memory Access Model for Highly-threaded Manycore Architectures", Report Number WUCSE-2012-64 (2012). Part of the Computer Engineering Commons and the Computer Sciences Commons. Available at Washington University Open Scholarship: https://openscholarship.wustl.edu/cse_research/89

Complete Abstract
INTRODUCTION
MODELING
Many-core Architectures
TMM Model Parameters
TMM Analysis structure
ANALYSIS OF ALL PAIRS SHORTEST PATHS ALGORITHMS USING TMM MODEL
Floyd-Warshall Algorithm
Johnson’s Algorithm
COMPARISON OF THE VARIOUS ALGORITHMS
Influence of Machine Parameters
Influence of Graph Size
Vertices Fit in Local Memory
Edges Fit in the Combined Local Memories
CONCLUSION
