Thinking Machines Research Articles

Designers of computer graphics hardware have used the increasing device counts available from IC manufacturers to increase parallelism, using techniques such as putting a longer pipeline of data path elements on integrated circuits or developing designs that use an array of processors. Pixel-Planes 1–5 and PixelFlow, all developed at the University of North Carolina, are examples of architectures that use an array of pixel processors for rasterization. Early generations of Pixel-Planes attempted to make these arrays as large as the display, providing one processor for each display pixel. Later generations improved performance by grouping processors into multiple smaller arrays, subdividing the screen into sections of a corresponding size, and having the arrays process the screen subdivisions independently. This paper describes simulations that were performed to determine the optimum subdivision size for a graphics computer that uses Pixel-Planes-type parallelism, i.e., parallel polygon rasterization with static two-dimensional screen subdivision. We then develop a mathematical approach to determining the optimal subdivision size and show that it agrees well with the experimental data. For special-purpose architectures, we show that the optimal size depends not only on the polygon size but also on the silicon area consumed by the rasterizer overhead. The mathematical approach can be applied directly to special-purpose architectures, and we show how it can be modified for use in analyzing algorithms developed for general-purpose architectures such as the Intel Touchstone or Paragon, or the Thinking Machines CM-5.
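The abstract does not reproduce the paper's model, so the following is only a rough illustration of why subdivision size matters, not the authors' method. It assumes square screen regions and axis-aligned, inclusive pixel bounding boxes; the function names `tiles_overlapped` and `mean_overlap` are hypothetical. Smaller regions allow more arrays to work in parallel, but each polygon then falls into more regions and has to be handled more times.

```python
def tiles_overlapped(x0, y0, x1, y1, tile):
    """Count the tile-by-tile screen regions touched by a bounding box.

    Assumption: (x0, y0) and (x1, y1) are inclusive integer pixel
    coordinates, and regions are squares of side `tile` aligned at 0.
    """
    cols = x1 // tile - x0 // tile + 1
    rows = y1 // tile - y0 // tile + 1
    return cols * rows


def mean_overlap(boxes, tile):
    """Average number of regions each bounding box must be sent to."""
    return sum(tiles_overlapped(*box, tile) for box in boxes) / len(boxes)


if __name__ == "__main__":
    # Hypothetical bounding boxes, for illustration only.
    boxes = [(0, 0, 15, 15), (10, 10, 20, 20), (30, 5, 70, 12)]
    for tile in (8, 16, 32, 64):
        print(tile, mean_overlap(boxes, tile))
```

Running the loop for several region sizes shows the per-polygon overlap rising as regions shrink, which is one side of the trade-off the abstract describes.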

For distributed-memory multicomputers such as the Intel Paragon, the IBM SP-1/SP-2, the NCUBE/2, and the Thinking Machines CM-5, the quality of the data partitioning for a given application is crucial to obtaining high performance. This task has traditionally been the user's responsibility, but in recent years much effort has been directed toward automating the selection of data partitioning schemes. Several researchers have proposed systems that are able to produce data distributions that remain in effect for the entire execution of an application. For complex programs, however, such static data distributions may be insufficient to obtain acceptable performance. The selection of distributions that change dynamically over the course of a program's execution adds another dimension to the data partitioning problem. In this paper, we present a technique that can be used to automatically determine which partitionings are most beneficial over specific sections of a program while taking into account the added overhead of performing redistribution. This system has been implemented as part of the PARADIGM (PARAllelizing compiler for DIstributed-memory General-purpose Multicomputers) project at the University of Illinois. The complete system strives to provide a fully automated means of parallelizing programs written in a serial programming model, obtaining high performance on a wide range of distributed-memory multicomputers.
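The trade-off described above, better per-section partitionings versus the cost of redistributing data between them, can be sketched as a small dynamic program. This is a minimal illustration under stated assumptions, not the PARADIGM technique: the program is assumed to be a linear sequence of phases, the per-phase costs and the `redist_cost` function are assumed to be known, and the name `choose_distributions` is hypothetical.

```python
def choose_distributions(phase_costs, redist_cost):
    """Pick one distribution per phase, minimizing execution plus redistribution cost.

    phase_costs: list of dicts mapping distribution name -> execution cost
                 for that phase (hypothetical inputs).
    redist_cost: function (old, new) -> cost of switching distributions,
                 expected to return 0 when old == new.
    Returns (total_cost, [distribution chosen for each phase]).
    """
    # best[d] = (cheapest cost of a plan ending in distribution d, that plan)
    best = {d: (c, [d]) for d, c in phase_costs[0].items()}
    for costs in phase_costs[1:]:
        nxt = {}
        for d, c in costs.items():
            # Cheapest predecessor once the switch into d is paid for.
            prev_d, (prev_cost, path) = min(
                best.items(),
                key=lambda kv: kv[1][0] + redist_cost(kv[0], d),
            )
            nxt[d] = (prev_cost + redist_cost(prev_d, d) + c, path + [d])
        best = nxt
    return min(best.values(), key=lambda v: v[0])


if __name__ == "__main__":
    # Hypothetical costs: phase 1 favors "row", phase 2 favors "col".
    phases = [{"row": 10, "col": 50}, {"row": 50, "col": 10}]
    cheap_switch = lambda a, b: 0 if a == b else 5
    print(choose_distributions(phases, cheap_switch))
```

With a cheap switch the plan changes distribution between phases; if the switch cost is raised enough, a single static distribution becomes the better choice.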

Articles published on Thinking Machines

A new deterministic parallel sorting algorithm with an experimental evaluation

A massively parallel implementation of a discrete-time algorithm for the computation of dynamic elastic demand traffic problems modeled as projected dynamical systems

A Randomized Parallel Sorting Algorithm with an Experimental Study

An Sn Algorithm for the Massively Parallel CM-200 Computer

3D computation of unsteady flow past a sphere with a parallel finite element method

Data-Parallel Sparse Factorization

Parallel 3-D pseudospectral simulation of seismic wave propagation

The Mind and the Thinking Machine

Parallel 3D computation of unsteady flows around circular cylinders

Parallel computational methods for 3D simulation of a parafoil with prescribed shape changes

Parallel computation of incompressible flows with complex geometries

Parallel finite element methods for large-scale computation of storm surges and tidal flows

Implementation of a Parallel Unstructured 3D Euler Solver on the CM-5

Optimal static 2-dimensional screen subdivision for parallel rasterization architectures

Performance of a Fully Parallel Sparse Solver

A framework for exploiting task and data parallelism on distributed memory multicomputers

Optimizations for Efficient Array Redistribution on Distributed Memory Multicomputers

Dynamic Data Partitioning for Distributed-Memory Multicomputers

Software Caching and Computation Migration in Olden

Evaluation of architectural support for global address-based communication in large-scale parallel machines
