In this article, a new version of the Real-time Image Stream Algorithms (RISA) data processing suite is introduced. It now features online detector data acquisition, high-throughput data dumping and enhanced real-time data processing capabilities. The low latency achieved in real-time data processing extends the application of ultrafast electron beam X-ray computed tomography (UFXCT) scanners to real-time scanner control and process control. We implemented high-performance data packet reception based on the Data Plane Development Kit (DPDK) and high-throughput data storage using both the Hierarchical Data Format version 5 (HDF5) and the Adaptable Input/Output System version 2 (ADIOS2). Furthermore, we extended RISA's underlying pipelining framework to support the fork-join paradigm. This allows for more complex workflows, as required, e.g., for online data processing. In addition, the pipeline configuration has been moved from compile time to runtime, i.e. processing stages and their interconnections can now be set up using a configuration file. In several benchmarks, RISA is profiled with regard to data acquisition performance, data storage throughput and overall processing latency. We found that using direct IO mode significantly improves data writing performance on the local data storage. We further demonstrate that RISA is now capable of concurrently receiving, processing and storing data from up to 768 detector channels (3072 MB/s) at 8000 fps on a single-GPU computer in real time.

Program summary

Program Title: GLADOS/RISA
CPC Library link to program files: https://doi.org/10.17632/65sx747rvm.2
Developer's repository link: https://codebase.helmholtz.cloud/risa
Licensing provisions: Apache-2.0
Programming language: C++
Journal reference of previous version: Comput. Phys. Commun. 219 (2017) 353-360 [1]
Does the new version supersede the previous version?: Yes.
Reasons for the new version: Extended capabilities for real-time operation with latest UFXCT hardware.
Summary of revisions:
(i) Add forking and joining of processing pipeline branches
(ii) Add runtime (re-)configuration of pipeline stages and connections
(iii) Add UDP receiver stage to acquire detector data in real-time
(iv) Add high-throughput data dumping
Nature of problem: Ultrafast electron beam X-ray computed tomography scanners stream multiple gigabytes of raw data per second via Ethernet to a control computer. If these data are received with low latency, real-time image-based control becomes possible. For this, data need to be captured from the network, stored on disk, reconstructed and post-processed concurrently. The current total data rate of up to 3072 MB/s requires high-throughput solutions for each of these tasks.
Solution method: Using a pipeline scheme, RISA processes incoming raw data in distinct stages (sources, processors, sinks). These are implemented in GPU kernels and are executed concurrently to exploit data parallelism as well as task parallelism. To capture detector data, we implemented a UDP packet capturing stage based on DPDK [2] which acts as a source stage. By allowing the pipeline to fork into multiple branches, we concurrently acquire, store and process the data. For storing these data, we use the HDF5 format [3]. We achieve the required data rates by writing in direct IO mode onto an SSD array in RAID 0 configuration.
Additional comments including restrictions and unusual features: RISA provides a set of general-purpose processing stages which are suitable for generic image stream processing.
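
The forking and joining of pipeline branches described in the summary can be illustrated with a minimal CPU-only sketch. It is not RISA's actual framework or API: the names `Channel`, `Frame` and `run_fork_join` are hypothetical, the "store" and "process" branches are trivial stand-ins for data dumping and reconstruction, and plain `std::thread` stages replace the GPU kernels. A source stage duplicates each frame into two branches (fork), the branches run concurrently, and a final stage pairs their outputs frame by frame (join).

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <optional>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical blocking queue that connects two stages (not RISA's API).
template <typename T>
class Channel {
public:
    void push(T value) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            items_.push(std::move(value));
        }
        ready_.notify_one();
    }
    // Signals that no further items will arrive.
    void close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        ready_.notify_all();
    }
    // Blocks until an item is available; returns nullopt once closed and drained.
    std::optional<T> pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return !items_.empty() || closed_; });
        if (items_.empty()) return std::nullopt;
        T value = std::move(items_.front());
        items_.pop();
        return value;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<T> items_;
    bool closed_ = false;
};

struct Frame {
    int index;  // frame number
    int value;  // placeholder for the detector payload
};

// Runs source -> fork -> {store, process} -> join and returns the joined values.
std::vector<int> run_fork_join(int frame_count) {
    assert(frame_count >= 0);
    Channel<Frame> to_store, to_process;  // outputs of the fork
    Channel<int> stored, processed;       // branch outputs feeding the join
    std::vector<int> joined;

    std::thread source([&] {
        for (int i = 0; i < frame_count; ++i) {
            Frame frame{i, 10 * i};
            to_store.push(frame);    // fork: the same frame goes to both branches
            to_process.push(frame);
        }
        to_store.close();
        to_process.close();
    });
    std::thread store([&] {  // stand-in for data dumping: forwards the frame index
        while (auto frame = to_store.pop()) stored.push(frame->index);
        stored.close();
    });
    std::thread process([&] {  // stand-in for reconstruction: doubles the payload
        while (auto frame = to_process.pop()) processed.push(2 * frame->value);
        processed.close();
    });
    std::thread joiner([&] {  // join: pairs the in-order outputs of both branches
        while (true) {
            auto s = stored.pop();
            auto p = processed.pop();
            if (!s || !p) break;
            joined.push_back(*s + *p);
        }
    });

    source.join();
    store.join();
    process.join();
    joiner.join();
    return joined;
}
```

Calling `run_fork_join(4)` in this sketch yields one joined value per frame. Because each branch preserves frame order, the join can pair results by position; a real pipeline with reordering or frame drops would need to match on frame indices instead.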