Efficient methods for implementation of multi-level nonrigid mass-preserving image registration on GPUs and multi-threaded CPUs

Nathan D Ellingwood, Youbing Yin, Matthew Smith, Ching-Long Lin

doi:10.1016/j.cmpb.2015.12.018

Abstract

Background and objective: Faster and more accurate image registration methods are important both for population-based studies that rely on medical imaging and for clinical applications. We present a novel computation- and memory-efficient multi-level method on graphics processing units (GPUs) for registering two volumetric computed tomography (CT) lung images.

Methods: We developed a computation- and memory-efficient Diffeomorphic Multi-level B-Spline Transform Composite (DMTC) method to implement nonrigid mass-preserving registration of two CT lung images on a GPU. The framework consists of a hierarchy of B-Spline control grids of increasing resolution. A similarity criterion known as the sum of squared tissue volume difference (SSTVD) was adopted to preserve lung tissue mass. Evaluating SSTVD requires the tissue volume, the Jacobian, and their derivatives, which makes a GPU implementation challenging because of memory constraints. Because the DMTC method allows values to be pre-computed, it reduces both computation and the memory needed to store variables, and it keeps communication between the GPU and the central processing unit (CPU) to a minimum. The method was assessed on six healthy human subjects.

Results: The GPU-generated displacement fields were compared against the previously validated CPU counterpart fields and showed good agreement, with an average normalized root mean square error (nRMS) of 0.044 ± 0.015. Runtime and speedup were compared between the single-threaded CPU, multi-threaded CPU, and GPU algorithms. The best speedup occurs at the highest resolution in the GPU implementation for the SSTVD cost and cost gradient computations: in average time per iteration on an NVIDIA Tesla K20X GPU, it is 112 times faster than the single-threaded CPU version and 11 times faster than the twelve-threaded version.

Conclusions: The proposed GPU-based DMTC method outperforms its multi-threaded CPU version in terms of runtime. Total registration time was reduced to 2.9 min on the GPU, compared with 12.8 min on the twelve-threaded CPU version and 112.5 min on a single-threaded CPU. Furthermore, the GPU implementation discussed in this work can be adapted to other cost functions that require calculation of first derivatives.
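As a rough illustration of what a mass-preserving criterion such as SSTVD involves, the following is a minimal one-dimensional Python sketch, not the authors' GPU implementation. It assumes that the tissue fraction of a voxel is estimated linearly from its Hounsfield unit (HU) value between assumed air and tissue reference values (the constants `HU_AIR` and `HU_TISSUE` are assumptions, not taken from the abstract), that the warped voxel volume is the reference voxel volume scaled by the Jacobian of the transform, and that the Jacobian can be approximated by a central difference. All function names are hypothetical.

```python
# Illustrative 1-D sketch of an SSTVD-style cost; not the authors' GPU code.
# Assumed constants (not stated in the abstract): reference HU values.
HU_AIR = -1000.0
HU_TISSUE = 55.0


def tissue_fraction(hu):
    """Fraction of a voxel occupied by tissue, clamped to [0, 1]."""
    frac = (hu - HU_AIR) / (HU_TISSUE - HU_AIR)
    return min(1.0, max(0.0, frac))


def interp(signal, x):
    """Linear interpolation of a 1-D signal at x, clamped at both ends."""
    x = min(max(x, 0.0), len(signal) - 1.0)
    i = min(int(x), len(signal) - 2)
    t = x - i
    return (1.0 - t) * signal[i] + t * signal[i + 1]


def sstvd(reference, floating, transform, voxel_volume=1.0, h=1e-3):
    """Sum of squared differences between reference and warped tissue volumes."""
    total = 0.0
    for i, hu_ref in enumerate(reference):
        x = float(i)
        # Jacobian (local volume change) of the transform, by central difference.
        jac = (transform(x + h) - transform(x - h)) / (2.0 * h)
        v_ref = voxel_volume * tissue_fraction(hu_ref)
        hu_flo = interp(floating, transform(x))
        v_flo = jac * voxel_volume * tissue_fraction(hu_flo)
        total += (v_ref - v_flo) ** 2
    return total


# Reference lung at 40% tissue; floating lung inflated to 20% tissue.
reference = [-578.0] * 4   # (-578 + 1000) / 1055 = 0.4
floating = [-789.0] * 8    # (-789 + 1000) / 1055 = 0.2

cost_stretch = sstvd(reference, floating, lambda x: 2.0 * x)
cost_identity = sstvd(reference, floating, lambda x: x)
```

In this toy case the transform that doubles every length has a Jacobian of 2, which exactly compensates for the halved tissue fraction, so `cost_stretch` is zero up to rounding while `cost_identity` is not. This is the sense in which the criterion preserves tissue mass when the lung inflates; the need to evaluate the Jacobian and the derivatives of both terms at every voxel is what the abstract identifies as the memory challenge on the GPU.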


Journal: Computer Methods and Programs in Biomedicine
Publication Date: Jan 6, 2016
Citations: 65

