Abstract

Hyperspectral imaging is a technology which, by sensing hundreds of wavelengths per pixel, enables fine-grained studies of the captured objects. This produces large amounts of data that require correspondingly large storage, so compression with algorithms such as the Consultative Committee for Space Data Systems (CCSDS) 1.2.3 standard is a must. However, the speed of this lossless compression algorithm is not sufficient for some real-time scenarios when running on a single-core processor. This is where architectures such as Field Programmable Gate Arrays (FPGAs) and Graphics Processing Units (GPUs) can shine best. In this paper, we present both FPGA and OpenCL implementations of the CCSDS 1.2.3 algorithm. The proposed parallelization method has been implemented on the Virtex-7 XC7VX690T, Virtex-5 XQR5VFX130 and Virtex-4 XC2VFX60 FPGAs, and on the GT440 and GT610 GPUs, and tested using hyperspectral data from NASA's Airborne Visible Infra-Red Imaging Spectrometer (AVIRIS). Both approaches fulfill our real-time requirements. This paper attempts to shed some light on the comparison between both approaches, including other works from the existing literature, explaining the trade-offs of each one.

Highlights

  • Parallelization of different algorithms has been a hot topic in recent years [1]

  • We study the performance, measured in megasamples per second (MSa/s), of both the Field Programmable Gate Array (FPGA) and OpenCL versions of the algorithm across a variety of devices

  • We have gathered synthesis results for the Virtex-5 XQR5VFX130, the latest and most powerful space-qualified FPGA, as it serves as a theoretical limit for what this design could achieve in space

Summary

Introduction

Parallelization of different algorithms has been a hot topic in recent years [1]. As we are already at the limits of single-threaded applications [2], new programming paradigms are necessary to process the copious amounts of data we collect today.

Taking advantage of the FPGA's on-board memory, we use it as a buffer for the sensor's output as well as for storing the variables needed for compression. This way, we perform each calculation only once and are able to achieve real-time compression without the need for external RAM. This imposes limitations on the image frame size that can be processed (see Figure 1), but not on the third dimension, which can grow indefinitely.

We proceed to give a more detailed explanation. Local operations: for this step, we need the current samples being compressed, as well as neighboring values. These have already been processed, since the algorithm works in a raster-scan fashion; a minimal sketch of this neighbor buffering is given below. A special module is responsible for taking all the simultaneously compressed samples and sending them to the output in the same order in which they entered the compressor.
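To make the buffering scheme concrete, the following C sketch shows how a single image line held in on-chip memory is enough to supply the north, north-west, north-east and west neighbors during a raster scan, with no external RAM access per sample. This is an illustration only, not the paper's RTL or OpenCL design: the names predict_band and FRAME_W are hypothetical, and a simple neighbor-mean predictor stands in for the full CCSDS 1.2.3 adaptive predictor.

```c
/*
 * Minimal sketch: raster-scan prediction with a one-line on-chip buffer.
 * Assumptions (not from the paper): frame width FRAME_W, 16-bit samples,
 * and a plain neighbor-mean predictor instead of CCSDS 1.2.3's adaptive one.
 */
#include <stdint.h>

#define FRAME_W 512  /* hypothetical frame width, bounded by on-chip memory */

/* Predict one band of FRAME_W x height samples, writing the residuals.
 * line[] emulates the block-RAM buffer holding the previous image row. */
void predict_band(const uint16_t *in, int height, int32_t *residual)
{
    uint16_t line[FRAME_W] = {0};  /* previous row: the north neighbors */

    for (int y = 0; y < height; y++) {
        uint16_t west = 0, nw = 0;
        for (int x = 0; x < FRAME_W; x++) {
            uint16_t cur = in[y * FRAME_W + x];
            uint16_t n   = line[x];                          /* row y-1, col x   */
            uint16_t ne  = (x < FRAME_W - 1) ? line[x + 1] : n;

            int32_t pred;
            if (y == 0)
                pred = (x == 0) ? 0 : west;      /* first row: only west exists */
            else
                /* all four neighbors were produced earlier in the raster scan */
                pred = ((int32_t)n + nw + ne + ((x > 0) ? west : n)) / 4;

            residual[y * FRAME_W + x] = (int32_t)cur - pred;

            nw   = n;        /* old north becomes north-west of column x+1 */
            west = cur;
            line[x] = cur;   /* current row becomes north of row y+1       */
        }
    }
}
```

Note that each sample is read exactly once and every neighbor comes from registers or the line buffer, which is what allows the design to keep pace with the sensor: the buffer size grows with the frame width but is independent of the number of spectral bands.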

OpenCL Implementation
Experimental Results
FPGA Implementation Results
OpenCL Implementation Results
Comparison
Virtex-7 XC7VX690T
Conclusions