Abstract
Hyperspectral imaging is a technology that, by sensing hundreds of wavelengths per pixel, enables fine-grained studies of the captured objects. This produces great amounts of data that require equally large storage, so compression with algorithms such as the Consultative Committee for Space Data Systems (CCSDS) 1.2.3 standard is a must. However, the speed of this lossless compression algorithm is not sufficient for some real-time scenarios when running on a single-core processor. This is where architectures such as Field Programmable Gate Arrays (FPGAs) and Graphics Processing Units (GPUs) shine. In this paper, we present both FPGA and OpenCL implementations of the CCSDS 1.2.3 algorithm. The proposed parallelization method has been implemented on the Virtex-7 XC7VX690T, Virtex-5 XQR5VFX130 and Virtex-4 XC2VFX60 FPGAs, and on the GT440 and GT610 GPUs, and tested using hyperspectral data from NASA's Airborne Visible Infra-Red Imaging Spectrometer (AVIRIS). Both approaches fulfill our real-time requirements. This paper sheds light on the comparison between the two approaches, including other works from the existing literature, and explains the trade-offs of each.
Highlights
Parallelization of different algorithms has been a hot topic in recent years [1]
We study the performance (measured in MegaSamples per second, MSa/s) of both the FPGA and OpenCL versions of the algorithm using a variety of devices
We have gathered synthesis results for the Virtex-5 XQR5VFX130, the latest and most powerful space-qualified FPGA, as it serves as a theoretical limit for what could be achieved with this design in space
Summary
Parallelization of different algorithms has been a hot topic in recent years [1]. As we are already at the limits of single-threaded applications [2], new programming paradigms are necessary to process the copious amounts of data we collect today. Taking advantage of the FPGA's on-board memory, we use it as a buffer for the sensor's output as well as for storing the variables needed for compression. This way, we only perform each calculation once and are able to achieve real-time compression without the need for external RAM. This imposes limitations on the image frame size that can be processed (see Figure 1), but not on the third dimension, which can grow indefinitely. We proceed to give a more detailed explanation. Local operations: for this step, we need the current samples we are compressing, as well as neighboring values. These have already been processed, since the algorithm works in a raster-scan fashion. A special module is responsible for taking all the simultaneously compressed samples and sending them to the output in the same order in which they entered the compressor.
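The raster-scan property described above can be sketched in C. This is a minimal illustration, not the paper's implementation: the function name `local_sum`, the frame width `FRAME_W`, and the neighbor-oriented local sum (W + NW + N + NE, as used in CCSDS-123-style prediction) are assumptions made for the example. It shows why only a single previous-row buffer plus the last sample are needed, which is what makes on-chip buffering without external RAM feasible.

```c
/* Hedged sketch of raster-scan neighbor access for prediction.
 * Assumes a neighbor-oriented local sum (W + NW + N + NE); the
 * exact edge-case rules of the standard may differ. */

#define FRAME_W 4  /* frame width; kept small for illustration */

/* Local sum for pixel (y, x). Only neighbors already visited in
 * raster order are read: the previous row (N, NW, NE) and the
 * sample just to the left (W). */
static int local_sum(const int *prev_row, int west, int x, int y)
{
    if (y == 0)                        /* first row: only W exists */
        return (x > 0) ? 4 * west : 0;
    int n  = prev_row[x];
    int nw = (x > 0) ? prev_row[x - 1] : n;           /* left edge  */
    int ne = (x < FRAME_W - 1) ? prev_row[x + 1] : n; /* right edge */
    int w  = (x > 0) ? west : n;
    return w + nw + n + ne;
}
```

Because `local_sum` never looks ahead of the raster position, a hardware pipeline can compute it as samples stream in, buffering only one row per band on chip.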