Abstract

Statistical iterative reconstruction (SIR) algorithms have shown great potential for improving image quality in reduced- and low-dose X-ray computed tomography (CT). However, high computational cost and long reconstruction times have so far prevented the use of SIR in practical applications. Some optimization algorithms have been proposed to make SIR parallelizable for execution on multicore computational platforms, whereas others have sought to improve its convergence rate. Parallelizing over a set of decoupled voxels within an iterative coordinate descent (ICD) optimization framework has shown promise for achieving both of these goals. So far, however, such frameworks either introduce additional complexity or are limited to parallel-beam geometry. We improve on this prior research and present a framework that also achieves parallelism by processing sets of independent voxels, but introduces no additional complexity and places no restrictions on beam geometry. Our method uses a novel multivoxel update (MVU) scheme within a general ICD framework fully optimized for acceleration on commodity GPUs. We also investigate different GPU memory access patterns that increase cache hit rates and thereby improve run time in the ICD framework. Experiments demonstrate speedups of two orders of magnitude for clinical datasets in cone-beam CT geometry, compared to the single-voxel update scheme native to conventional ICD-based SIR. Finally, since our MVU scheme operates on fully independent voxels, it maintains the fast convergence properties of ICD-based SIR. Consequently, the speedups achieved by parallel computing are not diminished by slower convergence of the iterative updates or by any additional overhead to decouple conflicting voxels.
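The idea of updating independent voxels together can be illustrated with a toy example. The sketch below is not the paper's GPU implementation: it assumes an unweighted least-squares cost with no regularizer, a tiny dense system matrix, and a simple independence test (two voxels are treated as independent when no ray intersects both). The names `icd_step`, `mvu_step`, and `independent` are hypothetical and chosen only for illustration. Under these assumptions, updating a set of independent voxels from one shared residual snapshot gives the same result as updating them one after another, which is why such a scheme would not be expected to slow ICD convergence.

```python
def column(A, j):
    """Return column j of the system matrix A (a list of rows)."""
    return [row[j] for row in A]


def independent(A, j, k):
    """Assumed test: voxels j and k are independent if no ray hits both."""
    return all(row[j] == 0 or row[k] == 0 for row in A)


def icd_step(A, x, r, j):
    """Single-voxel ICD update for the cost 0.5 * ||y - A x||^2.

    r is the current residual y - A x; x and r are modified in place.
    """
    a = column(A, j)
    d = sum(v * ri for v, ri in zip(a, r)) / sum(v * v for v in a)
    x[j] += d
    for i, v in enumerate(a):
        r[i] -= v * d


def mvu_step(A, x, r, voxels):
    """Multivoxel update: every voxel reads the same residual snapshot.

    Each loop iteration depends only on r0, so the iterations could run
    in parallel (for example, one GPU thread per voxel).
    """
    r0 = list(r)  # snapshot shared by all voxels in the set
    for j in voxels:
        a = column(A, j)
        d = sum(v * ri for v, ri in zip(a, r0)) / sum(v * v for v in a)
        x[j] += d
        for i, v in enumerate(a):
            r[i] -= v * d


# Toy problem: 4 rays (rows), 3 voxels (columns).
# Voxels 0 and 1 share no ray; voxel 2 shares rays with both.
A = [[1, 0, 1],
     [2, 0, 0],
     [0, 1, 1],
     [0, 3, 0]]
y = [3.0, 2.0, 4.0, 6.0]

# Sequential single-voxel updates of voxels 0 and 1.
x_seq, r_seq = [0.0, 0.0, 0.0], list(y)
icd_step(A, x_seq, r_seq, 0)
icd_step(A, x_seq, r_seq, 1)

# One multivoxel update of the independent set {0, 1}.
x_mvu, r_mvu = [0.0, 0.0, 0.0], list(y)
mvu_step(A, x_mvu, r_mvu, [0, 1])
```

In this toy setting `x_seq` and `x_mvu` agree, whereas applying `mvu_step` to the coupled pair `[0, 2]` would generally differ from the sequential result, which is the conflict that independent-voxel selection avoids.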

