Abstract

AbstractIn the past, parallel algorithms were developed, for the most part, under the assumption that the number of processors is Θ(n) (where n is the size of the input) and that if in practice the actual number was smaller, this could be resolved using Brent’s Lemma to simulate the highly parallel solution on a lower-degree parallel architecture. In this paper, however, we argue that design and implementation issues of algorithms and architectures are significantly different—both in theory and in practice—between computational models with high and low degrees of parallelism. We report an observed gap in the behavior of a parallel architecture depending on the number of processors. This gap appears repeatedly in both empirical cases, when studying practical aspects of architecture design and program implementation as well as in theoretical instances when studying the behaviour of various parallel algorithms. It separates the performance, design and analysis of systems with a sublinear number of processors and systems with linearly many processors. More specifically we observe that systems with either logarithmically many cores or with O(n α) cores (with α < 1) exhibit a qualitatively different behavior than a system with a linear number of cores on the size of the input, i.e., Θ(n). The evidence we present suggests the existence of a sharp theoretical gap between the classes of problems that can be efficiently parallelized with o(n) processors and with Θ(n) processors unless P = NC.KeywordsMemory AccessParallel AlgorithmTuring MachineParallel ArchitectureInput SizeThese keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call