Good Speedup Research Articles

Configurable arithmetic logic units (ALUs) offer opportunities for adapting the underlying hardware to support the varying amount of parallelism in the computation. Identifying the optimal parallel configurations at different steps in a program (a configuration is a given hardware implementation of different operators along with their multiplicities) is a very complex problem but, if solved, allows the power of these ALUs to be used to the fullest. This paper develops an automatic compilation framework for configuration analysis that exploits operator parallelism within loop nests, with the focus on minimizing costly reconfiguration overheads. In our framework, we first carry out some operator and loop transformations to expose more opportunities for configuration reuse. We then present a two-pass solution. The first pass attempts to generate either maximal cutsets (a cutset is a group of statements that execute under a given configuration) or maximally parallel configurations by analyzing the program dependency graph (PDG) of a loop nest. The second pass analyzes the trade-offs between the costs and benefits of reconfigurations across different cutsets and attempts to eliminate reconfiguration overheads by merging cutsets. This methodology is implemented in the SUIF compilation system and is tested using loops extracted from the Perfect benchmarks and Livermore kernels. Good speedups are obtained, showing the merit of the proposed method. The method also scales well with the loop sizes and the amount of space available on FPGAs for configurable logic.
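The trade-off weighed in the second pass can be illustrated with a small Python sketch. It is not the paper's algorithm: the cost model (cycles as operation count divided by unit multiplicity), the greedy slot allocation, the neglect of dependences, and the names `exec_time`, `best_config`, and `merge_pass` are all assumptions made for illustration. Each cutset is reduced to a map from operator to operation count, and two adjacent cutsets are merged when running both under one shared configuration costs no more than running each under its own configuration plus one reconfiguration.

```python
from math import ceil


def exec_time(ops, config):
    """Cycles to run `ops` (operator -> operation count) given unit multiplicities."""
    return sum(ceil(n / config[op]) for op, n in ops.items())


def best_config(ops, area):
    """Hypothetical stand-in for pass one: start with one unit per operator,
    then give each spare slot to the operator whose extra unit saves most cycles."""
    config = {op: 1 for op in ops}

    def gain(op):
        return ceil(ops[op] / config[op]) - ceil(ops[op] / (config[op] + 1))

    for _ in range(area - len(config)):
        op = max(config, key=gain)
        if gain(op) <= 0:
            break  # no operator benefits from another unit
        config[op] += 1
    return config


def merge_pass(cutsets, area, reconfig_cost):
    """Merge adjacent cutsets when a shared configuration beats reconfiguring."""
    if not cutsets:
        return []
    merged = [dict(cutsets[0])]
    for ops in cutsets[1:]:
        prev = merged[-1]
        union = dict(prev)
        for op, n in ops.items():
            union[op] = union.get(op, 0) + n
        if len(union) > area:  # the shared configuration would not fit
            merged.append(dict(ops))
            continue
        separate = (exec_time(prev, best_config(prev, area)) + reconfig_cost
                    + exec_time(ops, best_config(ops, area)))
        together = exec_time(union, best_config(union, area))
        if together <= separate:
            merged[-1] = union
        else:
            merged.append(dict(ops))
    return merged


# Assumed example: three cutsets, four operator slots.
loop = [{"add": 8}, {"mul": 8}, {"add": 4, "mul": 4}]
print(merge_pass(loop, area=4, reconfig_cost=10))     # [{'add': 12, 'mul': 12}]
print(len(merge_pass(loop, area=4, reconfig_cost=0)))  # 3
```

With an expensive reconfiguration the three cutsets collapse into one; with a free reconfiguration each keeps its own, more parallel configuration.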

Multidimensional analysis and online analytical processing (OLAP) operations require summary information on multidimensional data sets. Most common are aggregate operations along one or more dimensions of numerical data values. Simultaneous calculation of multidimensional aggregates is provided by the Data Cube operator, used to calculate and store summary information on a number of dimensions. This is computed only partially if the number of dimensions is large. Query processing for these applications requires different views of data to gain insight and for effective decision support. Queries may either be answered from a materialized cube in the data cube or calculated on the fly. The multidimensionality of the underlying problem can be represented both in relational and in multidimensional databases, the latter being a better fit when query performance is the criterion for judgment. Relational databases are scalable in size for OLAP and multidimensional analysis, and efforts are under way to make their performance acceptable. On the other hand, multidimensional databases have proven to provide good performance for such queries, although they are not very scalable. In this article we address (1) scalability in multidimensional systems for OLAP and multidimensional analysis and (2) integration of data mining with the OLAP framework. We describe our system PARSIMONY, parallel and scalable infrastructure for multidimensional online analytical processing, used for both OLAP and data mining. Sparsity of data sets is handled by using chunks to store data either as a dense block using multidimensional arrays or as a sparse representation using a bit-encoded sparse structure. Chunks provide a multidimensional index structure for efficient dimension-oriented data accesses, much the same as multidimensional arrays do. Operations within chunks and between chunks are a combination of relational and multidimensional operations, depending on whether the chunk is sparse or dense.
Further, we develop parallel algorithms for data mining on the multidimensional cube structure for attribute-oriented association rules and decision-tree-based classification. These take advantage of the data organization provided by the multidimensional data model. Performance results for high dimensional data sets on a distributed memory parallel machine (IBM SP-2) show good speedup and scalability.
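The dense-versus-sparse chunk idea can be sketched in Python as follows. This is an illustration only, not PARSIMONY's actual storage format: the density threshold, the dictionary layout, and the names `encode`, `decode`, `make_chunk`, and `sum_along` are assumptions. A chunk whose cells are mostly filled is kept as a flat row-major array; otherwise each non-empty cell is stored as a pair of a bit-packed coordinate key and its value. The same aggregation along one dimension then runs over either form.

```python
from math import prod


def encode(coords, bits):
    """Pack per-dimension coordinates into one integer, bits[d] bits each."""
    key = 0
    for c, b in zip(coords, bits):
        key = (key << b) | c
    return key


def decode(key, bits):
    """Inverse of encode: recover the coordinate tuple from a packed key."""
    coords = []
    for b in reversed(bits):
        coords.append(key & ((1 << b) - 1))
        key >>= b
    return tuple(reversed(coords))


def make_chunk(cells, shape, threshold=0.5):
    """cells maps coordinate tuples to values; pick dense or sparse storage."""
    bits = [max(1, (n - 1).bit_length()) for n in shape]
    if len(cells) / prod(shape) >= threshold:
        flat = [0] * prod(shape)
        for coords, value in cells.items():
            offset = 0
            for c, n in zip(coords, shape):  # row-major offset
                offset = offset * n + c
            flat[offset] = value
        return {"kind": "dense", "shape": shape, "data": flat}
    pairs = sorted((encode(c, bits), v) for c, v in cells.items())
    return {"kind": "sparse", "shape": shape, "bits": bits, "data": pairs}


def sum_along(chunk, keep_dim):
    """Aggregate (sum) all values onto dimension keep_dim."""
    shape = chunk["shape"]
    out = [0] * shape[keep_dim]
    if chunk["kind"] == "dense":
        for offset, value in enumerate(chunk["data"]):
            coords = []
            for n in reversed(shape):  # unravel the row-major offset
                offset, c = divmod(offset, n)
                coords.append(c)
            out[coords[::-1][keep_dim]] += value
    else:
        for key, value in chunk["data"]:
            out[decode(key, chunk["bits"])[keep_dim]] += value
    return out


sparse = make_chunk({(0, 1): 5, (3, 2): 7}, (4, 4))
print(sparse["kind"], sum_along(sparse, 0))  # sparse [5, 0, 0, 7]
dense = make_chunk({(0, 0): 1, (0, 1): 2, (1, 0): 3}, (2, 2))
print(dense["kind"], sum_along(dense, 1))    # dense [4, 2]
```

The sparse form pays a decode per cell but stores only the non-empty cells, which is the trade-off the abstract alludes to when it distinguishes operations on sparse and dense chunks.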

Articles published on Good Speedup

Solving unsymmetric sparse systems of linear equations with PARDISO

Parallel Computing of an Integral Formulation of Transient Radiation Transport

Phoenix

P‐Jigsaw: a cluster‐based Web server with cooperative caching support

Performance of a distributed architecture for query processing on workstation clusters

Distributed Maple: parallel computer algebra in networked environments

A Neural Network Implementation For Data Assimilation Using MPI

Parallelizing graph construction operations in programs with cyclic graphs

Parallel PSM/FDM Hybrid Simulation of Ground Motions from the 1999 Chi-Chi, Taiwan, Earthquake

Dust Dynamics in Protoplanetary Disks: Parallel Computing with PVM

One-by-One Cleaning for Practical Parallel List Ranking

OpenMP Programming for a Global Inverse Model

Automatic compilation of loops to exploit operator parallelism on configurable arithmetic logic units

Orbital debris impact simulation using a parallel hybrid particle-element code

Point‐to‐point and multi‐goal path planning for industrial robots

Local versus global lookahead in conservative parallel simulations

Continuous genetic networks

Parallel Sequence Mining on Shared-Memory Machines

PARSIMONY: An Infrastructure for Parallel Multidimensional Analysis and Data Mining
