Abstract

This paper presents and evaluates an approach for deploying image and video processing pipelines that are developed in a frame-oriented style on a stream-oriented hardware platform, such as an FPGA. First, this calls for a specialized streaming memory hierarchy and an accompanying software framework that transparently moves image segments between the stages of the image processing pipeline. Second, we use softcore VLIW processors, which are targetable by a C compiler and have hardware debugging capabilities, to evaluate and debug the software before moving to a High-Level Synthesis flow. The algorithm development phase, including debugging and optimizing on the target platform, is often a very time-consuming step in the development of a new product. Our proposed platform allows both software developers and hardware designers to test iterations in a matter of seconds (compilation time) instead of hours (synthesis or circuit simulation time).

Highlights

  • The goal of interventional medical imaging equipment is to provide the physician with real-time images from the anatomy of the patient while performing a medical intervention

  • In order to address these issues, we have investigated enablers for portability towards FPGAs, exploiting novel tools and techniques such as High-Level Synthesis (HLS)

  • We propose an approach to solve these challenges by using an FPGA overlay fabric consisting of softcore processors that are targetable by OpenCL and a streaming memory framework


Summary

Introduction

The goal of interventional medical imaging equipment is to provide the physician with real-time images from the anatomy of the patient while performing a medical intervention. The image processing algorithms are often closely tuned to the platform architecture, which makes the systems difficult to service. When moving from frame-based video and image processing algorithms to a streaming implementation, FPGA accelerators cannot buffer a full frame during processing, due to, among other factors, memory bandwidth, power, and latency requirements. Frameworks exist that facilitate mapping computations to FPGAs (including frameworks targeting image processing), but these do not solve the frame-versus-stream problem. Mapping the frame-based software to a stream-based hardware platform on an FPGA creates two challenges: a framework is needed that moves and buffers data (in the form of image segments) between stages, and the develop/test/optimize cycle time increases tremendously because of synthesis.
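The frame-versus-stream gap described above can be illustrated with a minimal C sketch. It is not the paper's framework: the names `line_buffer`, `lb_push_row`, and `frame_vertical_mean`, the fixed `WIDTH`, and the choice of a 3-row vertical mean filter are all assumptions made for illustration. The frame-oriented version indexes a whole frame in memory, while the stream-oriented version keeps only the last three rows and emits an output row as soon as enough input has arrived.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define WIDTH 8   /* hypothetical image width in pixels */
#define TAPS  3   /* hypothetical filter height in rows */

/* Stream-oriented state: only the last TAPS rows are kept, never a frame. */
typedef struct {
    uint8_t rows[TAPS][WIDTH];
    int     filled;  /* rows received so far, saturating at TAPS */
    int     next;    /* slot overwritten by the next incoming row */
} line_buffer;

void lb_init(line_buffer *lb)
{
    memset(lb, 0, sizeof *lb);
}

/* Accept one input row. Returns 1 and writes one output row (the vertical
 * mean of the last TAPS rows) once TAPS rows are buffered, otherwise 0. */
int lb_push_row(line_buffer *lb, const uint8_t *in, uint8_t *out)
{
    assert(lb != NULL && in != NULL && out != NULL);

    memcpy(lb->rows[lb->next], in, WIDTH);
    lb->next = (lb->next + 1) % TAPS;
    if (lb->filled < TAPS)
        lb->filled++;
    if (lb->filled < TAPS)
        return 0;  /* still priming the buffer */

    /* The mean does not depend on row order, so the circular layout
     * of the buffer needs no unrolling here. */
    for (int x = 0; x < WIDTH; x++) {
        unsigned sum = 0;
        for (int r = 0; r < TAPS; r++)
            sum += lb->rows[r][x];
        out[x] = (uint8_t)(sum / TAPS);
    }
    return 1;
}

/* Frame-oriented reference: the same filter, but it needs the whole frame
 * (height * WIDTH bytes) resident in memory. Writes height - TAPS + 1 rows. */
void frame_vertical_mean(const uint8_t *frame, int height, uint8_t *out)
{
    for (int y = 0; y + TAPS <= height; y++) {
        for (int x = 0; x < WIDTH; x++) {
            unsigned sum = 0;
            for (int r = 0; r < TAPS; r++)
                sum += frame[(y + r) * WIDTH + x];
            out[y * WIDTH + x] = (uint8_t)(sum / TAPS);
        }
    }
}
```

Feeding the rows of a frame one by one into `lb_push_row` produces the same output rows as `frame_vertical_mean`, while the stage itself holds only `TAPS * WIDTH` bytes. A framework of the kind the paper proposes would additionally have to move such segments between stages, which this sketch leaves out.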

Related Work
Accelerating Image Processing Workloads
FPGA Acceleration
Integration Frameworks
FPGA Overlays
OpenCL’s View on Parallel Computing
Private Memory – only accessible from a single execution device
Streaming Data and OpenCL
OpenCL Data Architecture
Implementation - Hardware
Processing Element
Memory Structure
DMA Unit
Debug Bus
Implementation - Software
Compilation and Operation
Buffer Management
Synchronization and Communication
Application Development
Conclusions