Sequential Processor Research Articles

We propose architectures based on Data-Driven Multithreading (DDM), a hybrid control-flow/data-flow model, to address the concurrency challenges faced by future High-Performance Computing (HPC) systems. We focus on the design and implementation of an optimized hardware Thread Scheduling Unit (TSU) and its integration into a multi-core system dubbed MiDAS. The TSU is the core of the DDM model: it orchestrates the execution of multiple threads on sequential processors based on data availability. MiDAS was prototyped on a Xilinx Virtex-6 FPGA and evaluated using several micro-benchmarks, which show that its performance grows linearly with the number of processing cores, even for benchmarks with very small problem sizes. For the largest problem size tested, with all 8 available cores in use, MiDAS achieves an average speedup of 7.91×, corresponding to 98.8% utilization efficiency. We also report results for the proposed hardware TSU itself, including its FPGA resource requirements: the TSU incurs relatively small overheads, consumes little power, and performs its operations with low latency. To support these claims, the DDM-based TSU is compared with the Task Superscalar architecture, which implements the StarSs programming framework in hardware. The comparison shows that the proposed TSU requires much less hardware and energy to operate. Specifically, Task Superscalar is 4.94× larger than the DDM-supporting TSU in terms of slice registers and 11.34× larger in terms of slice look-up tables.
Finally, the hardware TSU is compared with a software TSU implementation offering identical functionality, with both running on an FPGA fabric under a synthetic application. A detailed performance evaluation shows that MiDAS's hardware TSU significantly outperforms its software counterpart.
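To make "execution based on data availability" concrete, the following is a minimal software sketch of data-driven scheduling, not the MiDAS hardware TSU. It assumes a simple bookkeeping scheme in which each thread carries a count of producers it is still waiting for and becomes runnable when that count reaches zero. The function name `run_data_driven`, the thread names, and the dependency table are all hypothetical and chosen only for illustration.

```python
from collections import deque


def run_data_driven(threads, deps):
    """Run each thread once all of its producers have completed.

    threads: dict mapping a thread name to a zero-argument callable.
    deps:    dict mapping a consumer name to the list of its producer names.
    Returns the order in which the threads were executed.
    (Hypothetical helper for illustration; not the paper's interface.)
    """
    # Number of producers each thread is still waiting for.
    ready_count = {name: len(deps.get(name, [])) for name in threads}

    # Reverse map: producer name -> list of consumers to notify.
    consumers = {name: [] for name in threads}
    for consumer, producers in deps.items():
        for producer in producers:
            consumers[producer].append(consumer)

    # Threads with no pending inputs can run immediately.
    ready = deque(name for name in threads if ready_count[name] == 0)
    order = []
    while ready:
        name = ready.popleft()
        threads[name]()          # execute the thread body sequentially
        order.append(name)
        # Completion makes this thread's output available to its consumers.
        for consumer in consumers[name]:
            ready_count[consumer] -= 1
            if ready_count[consumer] == 0:
                ready.append(consumer)

    if len(order) != len(threads):
        raise ValueError("dependency cycle: some threads never became ready")
    return order


# Usage: two independent loads feed an add, whose result feeds a store.
results = {}
threads = {
    "load_a": lambda: results.update(a=2),
    "load_b": lambda: results.update(b=3),
    "add": lambda: results.update(sum=results["a"] + results["b"]),
    "store": lambda: results.update(out=results["sum"] * 10),
}
deps = {"add": ["load_a", "load_b"], "store": ["add"]}
order = run_data_driven(threads, deps)
```

In this sketch the threads run one after another on a single core. In a multi-core setting, threads that are ready at the same time (here `load_a` and `load_b`) could be dispatched to different cores.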

Abstract. Modern digital processing applications have an increasing demand for computational power while needing to preserve low power dissipation and high flexibility. For many applications, algorithmic complexity is already growing faster than the computational power provided by discrete general-purpose processors. A typical approach to this problem is to combine a processor core with dedicated accelerators. Since changes in standards or algorithms can change the demands on the accelerators, reconfigurable embedded FPGAs (eFPGAs) are suggested as an attractive alternative to highly customized VLSI macros.

Keywords: embedded FPGA, fast computing, hybrid design.

1. INTRODUCTION

FPGAs are widely used as an attractive compromise between highly efficient, physically optimized VLSI designs and software-programmable processors. Due to their reconfigurability, FPGAs are highly flexible and allow for relatively short design cycles, since no physical changes to the underlying hardware have to be made in case of a redesign. They also offer lower physical implementation costs than software-programmable processors, because the inherent parallelism of many algorithms can be exploited, in contrast to sequential processor architectures. As a result, commercial FPGA architectures have been optimized to suit a wide variety of applications, from networking and digital signal processing to the realization of soft-core processors. For an embedded FPGA used as a configurable accelerator, however, the requirements on the provided resources are often well defined and much narrower than for discrete or "general purpose" FPGAs. Hence, eFPGAs can be optimized for a certain set of applications and thus achieve higher efficiency in terms of power dissipation, area and speed.
First investigations of a reconfigurable ASIP with a reconfigurable accelerator based on a parametrisable eFPGA architecture have shown significant improvements in energy and area efficiency [5].

Articles published on Sequential Processor

An efficient implementation of one-dimensional discrete wavelet transform algorithms for GPU architectures

Hierarchical Design of a Secure Image Sensor with Dynamic Reconfiguration

A high throughput two-dimensional discrete cosine transform and MPEG4 motion estimation using vector coprocessor

Toward data-driven architectural support in improving the performance of future HPC architectures

ABC-PLOSS: a software tool for path-loss minimisation in GSM telecom networks using artificial bee colony algorithm

P-HS-SFM: a parallel harmony search algorithm for the reproduction of experimental data in the continuous microscopic crowd dynamic models

Data-Driven Concurrency for High Performance Computing

Neutron Multiplicity Counting: Credible Regions for Reconstruction Parameters

SoPC Self-Integration Mechanism for Seamless Architecture Adaptation to Stream Workload Variations

A Hybrid Reconfigurable Architecture and Design Methods Aiming at Control-Intensive Kernels

Processor arrays generation for matrix algorithms used in embedded platforms implemented on FPGAs

Architectural Support for Data-Driven Execution

Design of a sequential processor and simulation in the MATLAB/Simulink environment (Projektowanie procesora sekwencyjnego i symulacja w środowisku MATLAB/Simulink)

Knowledge acquisition of vibrations in high-power transformers using statistical analyses and fuzzy approaches – A case study

GPU enhanced parallel computing for large scale data clustering

VLSI DESIGN PROCESS FOR LOW POWER DESIGN METHODOLOGY USING RECONFIGURABLE FPGA

Finite difference schemes for heat conduction analysis in integrated circuit design and manufacturing

Parallelized and pipelined hardware implementation of computationally expensive prediction filters

Families of algorithms related to the inversion of a Symmetric Positive Definite matrix
