Compilation Flow Research Articles

Periodic scheduling problems usually deal with precedence constraints that have non-negative latencies. This seems a natural way to model scheduling problems, since task delays are generally non-negative quantities. However, in some cases we need to consider edge latencies that model not only task latencies but also other precedence constraints. For instance, in register optimisation problems arising in optimising compilation, a generic machine or processor model can take into account access delays into and from registers. Edge latencies may then be non-positive, leading to a difficult scheduling problem in the presence of resource constraints. This research result is related to the problem of periodic scheduling with storage requirement optimisation; its aim is to solve the practical problem of register optimisation in optimising compilation. We show that pre-conditioning a data dependence graph (DDG) to satisfy register constraints before periodic scheduling under resource constraints may create circuits with non-positive distances, resulting from the acceptance of non-positive edge latencies. As a compiler construction strategy, the creation of circuits with non-positive distances must be forbidden during the compilation flow, because such DDG circuits do not guarantee the existence of a valid instruction schedule under resource constraints. We study two solutions to avoid the creation of these problematic circuits. The first solution is reactive: it tolerates the creation of non-positive circuits in a first step and, if any are detected in a later check step, backtracks to eliminate them. The second solution is proactive: it prevents the creation of non-positive circuits in the DDG during the register optimisation process. It is based on shortest path equations that define a necessary and sufficient condition to free any DDG from these problematic circuits, from which we deduce a linear program. We have implemented our solutions and we present successful experimental results.
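The check step of the reactive strategy can be illustrated with a small sketch. It is not the authors' implementation: it assumes a DDG given as a list of `(source, target, distance)` edges with integer distances (a hypothetical representation), and it uses the standard Bellman-Ford negative-cycle test on rescaled weights. Replacing each distance d by (n + 1)·d − 1, where n is the number of nodes, makes a circuit strictly negative exactly when its original total distance is non-positive.

```python
from typing import List, Tuple

# Hypothetical edge representation: (source node, target node, integer distance).
Edge = Tuple[int, int, int]


def has_non_positive_circuit(num_nodes: int, edges: List[Edge]) -> bool:
    """Return True if some circuit has a total distance <= 0.

    Assumes integer distances and nodes numbered 0 .. num_nodes - 1.
    """
    # Rescale so that "total distance <= 0" becomes "total weight < 0".
    # A circuit with k <= num_nodes edges and total distance D gets weight
    # (num_nodes + 1) * D - k, which is negative iff D <= 0.
    scale = num_nodes + 1
    scaled = [(u, v, scale * d - 1) for u, v, d in edges]

    # Starting every node at 0 is equivalent to adding a virtual source
    # connected to all nodes by zero-weight edges.
    dist = [0] * num_nodes

    # Standard Bellman-Ford relaxation passes.
    for _ in range(num_nodes - 1):
        changed = False
        for u, v, w in scaled:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                changed = True
        if not changed:
            return False  # shortest paths converged: no negative cycle

    # If an edge can still be relaxed, a negative cycle exists in the
    # rescaled graph, i.e. a non-positive circuit in the original one.
    return any(dist[u] + w < dist[v] for u, v, w in scaled)
```

For example, `has_non_positive_circuit(2, [(0, 1, 1), (1, 0, -1)])` reports a problematic circuit, since the two distances sum to zero. A reactive flow would backtrack at this point, whereas the proactive strategy described above constrains the register optimisation so that such a circuit is never created.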

This paper presents an architecture that combines VLIW (very long instruction word) processing with the capability to introduce application-specific customized instructions and highly parallel combinational hardware functions for the acceleration of signal processing applications. To support this architecture, a compilation and design automation flow is described for algorithms written in C. The key contributions of this paper are as follows: (1) a 4-way VLIW processor implemented in an FPGA, (2) large speedups through hardware functions, (3) a hardware/software interface with zero overhead, (4) a design methodology for implementing signal processing applications on this architecture, and (5) tractable design automation techniques for extracting and synthesizing hardware functions. Several design tradeoffs for the architecture were examined, including the number of VLIW functional units and the register file size. The architecture was implemented on an Altera Stratix II FPGA. The Stratix II device was selected because it offers a large number of high-speed DSP (digital signal processing) blocks that execute multiply-accumulate operations. We tested our methodology and architecture for software acceleration using the MediaBench benchmark suite. The performance of our combined VLIW processor with hardware functions was compared to that of software executing on a RISC processor, specifically the soft-core embedded NIOS II processor. For software kernels converted into hardware functions, we show a hardware performance multiplier of up to times that of software with an average times faster. For the entire application, in which only a portion of the software is converted to hardware, the accelerated application is as much as 30X faster than the nonaccelerated application, with a 12X improvement on average.

Articles published on Compilation Flow

An Energy-Efficient Integrated Programmable Array Accelerator and Compilation Flow for Near-Sensor Ultralow Power Processing

Compiler-Assisted Loop Hardening Against Fault Attacks

Providing fault tolerance through invasive computing

A Coarse-Grained Reconfigurable Architecture for Compute-Intensive MapReduce Acceleration

An OpenCL software compilation framework targeting an SoC-FPGA VLIW chip multiprocessor

A New Compilation Flow for Software-Defined Radio Applications on Heterogeneous MPSoCs

A First Step to Performance Prediction for Heterogeneous Processing on Manycores

Giving Text Analytics a Boost

VOBLA

Exploiting Task- and Data-Level Parallelism in Streaming Applications Implemented in FPGAs

How to eliminate non-positive circuits in periodic scheduling: a proactive strategy based on shortest path equations

OpenStream

LALP: A Language to Program Custom FPGA-Based Acceleration Engines

Thermal-Aware Compilation for Register Window-Based Embedded Processors

SysCellC: a data-flow programming model on multi-GPU

Speedups from extending embedded processors with a high-performance coarse-grained reconfigurable data-path

Enabling compiler flow for embedded VLIW DSP processors with distributed register files

Rapid VLIW Processor Customization for Signal Processing Applications Using Combinational Hardware Functions

Incremental compilation for parallel logic verification systems

Hydrogeologic studies for nuclear-waste disposal in Sweden
