Abstract

Vector processors, parallel processors, and sequential processors with much concurrency are considered. Some real codes can be shown to be almost completely suitable for parallel and vector processors. However, vector to scalar speed ratios can be so large for the vector and parallel machines that even a small residual scalar content is a serious problem. Monte Carlo approaches and implicit differencing schemes yield shorter vectors than explicit differencing schemes, causing problems due to long vector operation start-up times. Programming in a vector language may be a more global and easier to understand approach to programming. It is necessary to use such an approach in order to attain some reasonable fraction of the potential speed of a vector or parallel processor.

Introduction and Summary

The utility of various scientific processor architectures is weighed against a set of codes used at Lawrence Livermore Laboratory (LLL). On the basis of experience gained in the development of parallel programs, vector machines, parallel machines, and concurrent machines are considered. The idea of utility is interpreted to relate to the question of how useful the maximum speeds of processors are when an entire program, rather than small parts, such as inner loops, is the weighting factor. Utility is also associated with the range of classes of numerical techniques which can be comfortably and easily mated to a given processor architecture. It is desirable to consider ease of programming as an aspect of utility. It has not been possible to consider this as quantitatively as the other questions, but some comments are made on the subject. The future may bring better understanding of general programming approaches simultaneously suited to many classes of processor. For this paper, it is usually assumed that coding is done in an appropriate assembly language, or in a language especially well mated to a given architecture.
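The remark that a small residual scalar content becomes a serious problem when the vector to scalar speed ratio is large can be illustrated with the standard Amdahl-style estimate of whole-program speedup. The sketch below is only an illustration: the function name and the sample fractions and ratios are assumptions chosen here, not figures taken from the paper.

```python
def effective_speedup(vector_fraction: float, vector_to_scalar_ratio: float) -> float:
    """Whole-program speedup relative to an all-scalar run.

    vector_fraction: share of the all-scalar run time that can be vectorized (0..1).
    vector_to_scalar_ratio: how many times faster the vectorized share executes.
    Both parameters are hypothetical inputs for illustration.
    """
    if not 0.0 <= vector_fraction <= 1.0:
        raise ValueError("vector_fraction must lie in [0, 1]")
    if vector_to_scalar_ratio <= 0.0:
        raise ValueError("vector_to_scalar_ratio must be positive")
    scalar_fraction = 1.0 - vector_fraction
    # The scalar share runs at the old speed; only the vector share is accelerated.
    return 1.0 / (scalar_fraction + vector_fraction / vector_to_scalar_ratio)


if __name__ == "__main__":
    # Illustrative values only: compare a modest and a very large speed ratio.
    for ratio in (10.0, 100.0):
        for fraction in (0.90, 0.99, 1.00):
            speedup = effective_speedup(fraction, ratio)
            print(f"ratio={ratio:6.1f}  vector fraction={fraction:.2f}  speedup={speedup:6.2f}")
```

Under these assumptions, the larger the speed ratio, the more the residual scalar share dominates the total run time, which is the effect the abstract describes.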
The point of view has been adopted that processing speed in computational physics is closely related to the average number of floating point operations done per unit of time. It is recognized that this approach is not entirely correct, but it is better than relating speed to instructions executed per unit time. Hardware designers face common technological limits. Various design approaches have been the result of the competition to achieve maximum utility within these limits. As a result, parallel processors, vector processors, and serial processors with instruction execution overlap (concurrency) have emerged as contenders for maximum scientific utility.
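Taking floating point operations per unit of time as the speed measure, the penalty that start-up time imposes on short vectors can be sketched with a simple linear timing model: one vector operation costs a fixed start-up time plus a constant time per element. This model and every name and number below are assumptions made for illustration; the paper's own timing data are not reproduced here.

```python
def vector_rate(length: int, startup_time: float, time_per_element: float) -> float:
    """Results per unit time for one vector operation of `length` elements.

    Assumes a hypothetical linear cost: startup_time + length * time_per_element.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    if startup_time < 0.0 or time_per_element <= 0.0:
        raise ValueError("times must be non-negative, with time_per_element > 0")
    return length / (startup_time + length * time_per_element)


def half_performance_length(startup_time: float, time_per_element: float) -> float:
    """Vector length at which the rate reaches half of its asymptotic value."""
    if time_per_element <= 0.0:
        raise ValueError("time_per_element must be positive")
    return startup_time / time_per_element


if __name__ == "__main__":
    # Illustrative values only, in arbitrary time units.
    startup, per_element = 100.0, 1.0
    peak = 1.0 / per_element  # asymptotic rate for very long vectors
    for n in (10, 100, 1000, 10000):
        fraction_of_peak = vector_rate(n, startup, per_element) / peak
        print(f"length={n:6d}  fraction of peak rate={fraction_of_peak:.3f}")
    print("half-performance length:", half_performance_length(startup, per_element))
```

In this model, methods that yield short vectors run at a small fraction of the peak rate, consistent with the concern raised about Monte Carlo approaches and implicit differencing schemes.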
