Abstract

This paper addresses the parallelization and implementation of partial differential equation (PDE)-based image processing models on large cluster environments with distributed memory. As an example, we focus on nonlinear diffusion filtering, which we discretize by means of an additive operator splitting (AOS) scheme. We start by decomposing the algorithm into small modules that are parallelized separately. For this purpose, image partitioning strategies are discussed and their impact on the communication pattern and volume is analyzed. Based on the results, we develop an algorithmic implementation with excellent scaling properties on massively connected low-latency networks. Test runs on two different high-end Myrinet clusters yield almost linear speedup factors of up to 209 on 256 processors. This results in typical denoising times of 0.4 s for five iterations on a 256×256×128 data cube.
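To make the AOS idea concrete, the following is a minimal serial sketch of one AOS step in Python with NumPy. It is not the implementation described in the paper. The function names (`solve_tridiagonal`, `aos_step`), the diffusivity g = 1 / (1 + |∇u|²/λ²), the unit grid spacing, the reflecting boundaries, and the omission of Gaussian presmoothing are all assumptions made for illustration. The sketch uses the standard AOS form, in which each of the m coordinate directions contributes the solution of tridiagonal systems (I − mτA_l) x = u and the m results are averaged.

```python
import numpy as np


def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm along axis 0, vectorized over all remaining axes.

    lower[i] multiplies x[i-1], upper[i] multiplies x[i+1];
    lower[0] and upper[-1] are ignored by construction (they are zero).
    """
    n = rhs.shape[0]
    c = np.empty_like(rhs)
    d = np.empty_like(rhs)
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    # Forward elimination.
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    # Back substitution.
    x = np.empty_like(rhs)
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def aos_step(u, tau, lam):
    """One AOS step for an image or volume u (2-D or 3-D array).

    tau is the time step size, lam the contrast parameter of the
    assumed diffusivity. Both names are illustrative.
    """
    u = np.asarray(u, dtype=float)
    m = u.ndim
    # Assumed diffusivity, evaluated once from the current iterate.
    grads = np.gradient(u)
    g = 1.0 / (1.0 + sum(gi ** 2 for gi in grads) / lam ** 2)

    result = np.zeros_like(u)
    for axis in range(m):
        # Bring the current direction to the front; every line along it
        # is an independent tridiagonal system.
        ua = np.moveaxis(u, axis, 0)
        ga = np.moveaxis(g, axis, 0)
        w = 0.5 * (ga[:-1] + ga[1:])  # diffusivity between neighbours
        upper = np.zeros_like(ua)
        lower = np.zeros_like(ua)
        upper[:-1] = -m * tau * w
        lower[1:] = -m * tau * w
        diag = 1.0 - upper - lower  # rows of (I - m*tau*A_l) sum to 1
        x = solve_tridiagonal(lower, diag, upper, ua)
        result += np.moveaxis(x, 0, axis)
    return result / m


# Example call on a small synthetic noisy volume (sizes chosen arbitrarily).
rng = np.random.default_rng(0)
volume = rng.random((8, 8, 4))
filtered = aos_step(volume, tau=2.0, lam=0.5)
```

In this sketch the tridiagonal solves along one direction are independent of each other but sequential within each line. This suggests why the choice of image partitioning affects communication: a line that crosses a partition boundary cannot be solved locally without exchanging data.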
