Tomorrow's microprocessors will be able to handle multiple flows of control. Applications that exhibit task level parallelism (TLP) and can be decomposed into parallel tasks will perform well on these platforms. TLP arises when a task is independent of its neighboring code. Traditional parallel compilers exploit one variety of TLP, loop level parallelism (LLP), where loop iterations are executed in parallel. LLP is overwhelmingly found in numeric programs, typically written in FORTRAN, with regular patterns of data access. In contrast, irregular applications, typified by general purpose integer applications, exhibit little LLP because they tend to access data in irregular patterns through pointers. Without pointer disambiguation to analyze data access dependences, traditional parallel compilers cannot parallelize these irregular applications and ensure correct execution.

We focus on a different variety of TLP, namely Speculative Task Parallelism (STP). STP arises when a task (a leaf procedure, a non-leaf procedure, or an entire loop) is control- and memory-independent of its preceding code, and thus could be executed in parallel. Two sections of code are memory-independent when neither contains a store to a memory location that the other accesses. To exploit STP, we assume a hypothetical speculative machine that supports speculative futures (a parallel programming construct that executes a task early on a different thread or processor), with mechanisms for resolving incorrect speculation when the task is not, after all, independent. This allows us to speculatively parallelize code when there is a high probability of independence, but no guarantee.

Figure 1 illustrates STP, showing a task Y in the dynamic instruction stream of an irregular application that has no memory access conflicts with a group of instructions, X, that precedes Y. The shorter of X and Y determines the overlap of memory-independent instructions, as seen in Figures 1(b) and 1(c). In the absence of any register dependences, X and Y may be executed in parallel, resulting in shorter execution time. Traditional parallel compilers for pointer-based languages find it hard to expose this parallelism.

The goals of this paper are to identify regions such as X and Y within irregular applications and to find the number of instructions that may thus be removed from the critical path. This number represents the maximum STP when the cost of exploiting STP is zero.

Because the biggest barrier to detecting independence in irregular codes is memory disambiguation, we identify memory-independent tasks using a profile-based approach and measure the amount of STP by estimating the number of memory-independent instructions those tasks expose. We vary the level of control dependence and memory dependence to investigate their effect on the amount of memory independence we find. We profile at different memory granularities and introduce synchronization to expose higher levels of memory independence. Across this variety of speculation assumptions, 7 to 22% of dynamic instructions lie within tasks that are found to be memory-independent on the SPECint95 benchmarks, a set of irregular applications for which traditional methods of parallelization are ineffective.
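As a rough illustration of the future construct described above, the following C++ sketch (all names hypothetical, not from the paper) launches a task Y on another thread so that it overlaps with the preceding code X; because X and Y touch disjoint memory, this mirrors the memory-independent overlap of Figure 1. The hypothetical speculative machine would additionally track memory accesses and squash or re-execute Y on a conflict, which ordinary software futures do not do.

```cpp
// Minimal sketch: overlap a memory-independent task Y with preceding code X.
// A real STP machine would detect conflicting stores and resolve misspeculation.
#include <future>
#include <numeric>
#include <vector>
#include <iostream>

// Task Y: a leaf procedure that profiling suggests is memory-independent of X.
int task_Y(const std::vector<int>& y_input) {
    return std::accumulate(y_input.begin(), y_input.end(), 0);
}

int main() {
    std::vector<int> y_input(1000, 1);
    std::vector<int> x_buffer(1000, 0);

    // Start Y early on a different thread (the "future").
    auto y_result = std::async(std::launch::async, task_Y, std::cref(y_input));

    // Code X: runs concurrently. It writes only x_buffer, so neither section
    // stores to a location the other accesses, i.e. X and Y are memory-independent.
    for (std::size_t i = 0; i < x_buffer.size(); ++i)
        x_buffer[i] = static_cast<int>(i);

    std::cout << "Y = " << y_result.get() << '\n';  // join point after the overlap
    return 0;
}
```

The overlap removes the shorter of X and Y from the critical path; the paper's profiling estimates how many dynamic instructions fall into such overlaps when speculation rather than static proof establishes independence.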