Parallel implementations of two numerical tools popular in optical studies of biological materials, the Inverse Adding-Doubling (IAD) program and the Monte Carlo Multi-Layered (MCML) program, were developed and tested in this study. The implementations are based on the Message Passing Interface (MPI) and standard C. The parallel versions of IAD and MCML were compared to their sequential counterparts in validation and performance tests. Additionally, the portability of the programs was tested on a local high-performance computing (HPC) cluster, the Penguin-On-Demand HPC cluster, and an Amazon EC2 cluster. Parallel IAD was tested with up to 150 parallel cores using 1223 input datasets. It demonstrated linear scalability, with speedup proportional to the number of parallel cores (up to 150x). Parallel MCML was tested with up to 1001 parallel cores using problem sizes of 10^4–10^9 photon packets. It demonstrated classical performance curves featuring communication overhead and a performance saturation point. An optimal performance curve was derived for parallel MCML as a function of problem size. The typical speedup achieved for parallel MCML (up to 326x) increased linearly with problem size. The precision of MCML results was estimated in a series of tests: a problem size of 10^6 photon packets was found optimal for calculations of total optical response, and 10^8 photon packets for spatially resolved results. The presented parallel versions of MCML and IAD are portable across multiple computing platforms. The parallel programs could significantly speed up simulations for scientists and can be used to their full potential on readily available computing systems at no additional cost.

Program summary
Program title: MCMLMPI, IADMPI
Catalogue identifier: AEWF_v1_0
Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEWF_v1_0.html
Program obtainable from: CPC Program Library, Queen’s University, Belfast, N. Ireland
Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html
No. of lines in distributed program, including test data, etc.: 53638
No. of bytes in distributed program, including test data, etc.: 731168
Distribution format: tar.gz
Programming language: C.
Computer: Up to and including HPC/Cloud CPU-based clusters.
Operating system: Windows, Linux, Unix, MacOS (requires an ANSI C-compatible compiler).
Has the code been vectorized or parallelized?: Yes, using MPI directives.
RAM: From megabytes to gigabytes (MCMLMPI), kilobytes to megabytes (IADMPI)
Classification: 2.2, 2.5, 18.
External routines: dcmt-library (MCMLMPI), cweb-package (IADMPI)
Nature of problem: Photon transport in multilayered semi-transparent material, estimation of optical properties (IADMPI) and optical response (MCMLMPI) of multilayered material samples.
Solution method: Massively-parallel Monte-Carlo method (MCMLMPI), Inverse Adding-Doubling method (IADMPI)
Unusual features: Tracking and analysis of photon packets in turbid media (MCMLMPI)
Running time: Many small problems can be solved within seconds, large problems might take hours even on HPC clusters (MCMLMPI); hours on single computer and seconds on HPC cluster (IADMPI)