Abstract

A good tomographic reconstruction algorithm must reduce radiation dosage, reconstruct accurately, enhance detail, and run quickly. Many algorithms address some of these requirements, but none addresses all of them together. The Maximum Likelihood Expectation Maximization (MLEM) algorithm fares well on many of these factors, but its long execution time makes it difficult to apply in real time. Our goal is to reduce the execution time substantially by using hardware accelerators such as Field Programmable Gate Arrays (FPGAs), so that MLEM's advantages can be leveraged. FPGAs are becoming especially popular as hardware accelerators and are well known for their programmability, configurability, and massive parallelism through a large number of Configurable Logic Blocks (CLBs). Although FPGAs are extremely versatile, they must be programmed in complex languages such as Verilog or VHDL, and incorporating design-level changes at a later stage demands increased effort. In this paper we present, for the first time, a parallel structure for hardware acceleration of MLEM on the large Virtex-7 VC709 FPGA. Using available tools, we also present a programming flow for designing the hardware architecture that accelerates the algorithm. The proposed flow does not require prior knowledge of the traditionally cumbersome Hardware Description Languages (HDLs), which makes post-design changes very easy to incorporate and validate. The parallel architecture is implemented on an FPGA operating at 220 MHz and achieves a $288\times$ speedup over an optimized software execution on a 12-core Intel Xeon workstation running at 3.1 GHz with 32 GB of RAM and 12 MB of cache.

Highlights

  • Tomographic image reconstruction techniques such as computerized tomography (CT) and single photon emission computed tomography (SPECT) require high-performance computing solutions to meet the need for rapid access to medical images during critical healthcare procedures [1]

  • Hardware accelerators are used to accelerate reconstruction algorithms that are computationally intensive and time consuming. Algorithms such as Maximum Likelihood Expectation Maximization (MLEM) and Ordered Subset Expectation Maximization (OSEM) produce images of better quality [2], [3], higher spatial resolution [4], and more accurate geometric reconstruction [5], but the Radon Transform-based Filtered Back Projection (FBP) algorithm is widely used in practice

  • The complexity of the MLEM algorithm arises primarily from computing the projections of the raw detector data and re-estimating the maximized likelihood function repeatedly (50 iterations) until it converges, which involves a large number of Multiplication-Accumulation (MAC) operations
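The iteration described in the last highlight can be sketched in software. The following is a minimal, illustrative version of the standard MLEM update, not the paper's FPGA architecture. The names `mlem`, `system_matrix`, and `measured` are hypothetical, the tiny 3-bin, 2-pixel system is an assumed toy example, and only the default of 50 iterations comes from the text. Each iteration performs a forward projection and a back projection, and both are chains of multiply-accumulate operations.

```python
def mlem(system_matrix, measured, iterations=50, eps=1e-12):
    """Toy MLEM reconstruction (illustrative sketch only).

    system_matrix[i][j] is the assumed contribution of pixel j to detector bin i.
    measured[i] is the count recorded in detector bin i.
    """
    n_bins = len(system_matrix)
    n_pixels = len(system_matrix[0])

    # Sensitivity of each pixel: column sums of the system matrix.
    sensitivity = [sum(system_matrix[i][j] for i in range(n_bins))
                   for j in range(n_pixels)]

    # Start from a uniform, strictly positive image.
    estimate = [1.0] * n_pixels

    for _ in range(iterations):
        # Forward projection: one multiply-accumulate chain per detector bin.
        projected = [sum(a * x for a, x in zip(row, estimate))
                     for row in system_matrix]

        # Ratio of measured to predicted counts (guarded against division by zero).
        ratio = [y / max(p, eps) for y, p in zip(measured, projected)]

        # Back projection: one multiply-accumulate chain per pixel.
        back = [sum(system_matrix[i][j] * ratio[i] for i in range(n_bins))
                for j in range(n_pixels)]

        # Multiplicative update, normalized by the pixel sensitivity.
        estimate = [x * b / max(s, eps)
                    for x, b, s in zip(estimate, back, sensitivity)]

    return estimate


# Assumed toy system: bin 0 sees both pixels, bins 1 and 2 see one pixel each.
toy_matrix = [[1.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
toy_measured = [5.0, 2.0, 3.0]  # noise-free projections of the image [2, 3]
toy_image = mlem(toy_matrix, toy_measured)
```

For this noise-free toy system, `toy_image` approaches the image `[2, 3]` that generated the data. The two inner sums dominate the cost and are independent across bins and across pixels, which is what makes the update a natural candidate for parallel hardware.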


Summary

Introduction

M. Ravi et al.: FPGA as a Hardware Accelerator for Computation Intensive Maximum Likelihood Expectation

Tomographic image reconstruction techniques such as computerized tomography (CT) and single photon emission computed tomography (SPECT) require high-performance computing solutions to meet the need for rapid access to medical images during critical healthcare procedures [1]. Hardware accelerators are used to accelerate reconstruction algorithms that are computationally intensive and time consuming. Although algorithms such as Maximum Likelihood Expectation Maximization (MLEM) and Ordered Subset Expectation Maximization (OSEM) produce images of better quality [2], [3], higher spatial resolution [4], and more accurate geometric reconstruction [5], the Radon Transform-based Filtered Back Projection (FBP) algorithm is widely used in practice, as it requires less computing power and less execution time in tomographic imaging [6]. Some groups have tried to accelerate reconstruction by efficiently distributing the work among many parallel cores [20]–[22]. Others have implemented thread divergence strategies [23], [24], prefetching of data [18], reduction of data movement [25], and half-precision floating point to reduce the data transfer rate [26].
