Graphics Processing Units Optimization Research Articles

We are developing a new technique for monitoring portal hypertension through the pressure gradient between the portal vein and the inferior vena cava (PPG), based on non-invasive measurements (MRI images). Investigating numerically the underlying relationship between porosity and the stages of liver cirrhosis requires massive parametrization and classification, and the hepatic-portal venous system is a multi-scale system; both entail high computational cost. The suitability of the lattice Boltzmann method for GPU (Graphics Processing Unit) parallel computation provides an opportunity to overcome this cost. In this paper, we perform GPU parallelization and optimization of the volumetric lattice Boltzmann method for arbitrary image-based geometries. Three application cases (pipe flow, hemodynamics in the portal venous system, and hemodynamics in a simple hepatic-portal venous system) are used to show, in terms of accuracy and efficiency, that the method can be applied to the hepatic-portal venous system. The reliability of the model is qualitatively validated against the analytical solution for the velocity and pressure difference distributions in pipe flow, and quantitatively confirmed by the negligible pulsatility of velocity and pressure difference in the portal venous system. Performance is examined on an Intel Broadwell E5-2683 v3 @ 2.30 GHz (CPU) and an NVIDIA Tesla V100 16GB (GPU). The GPU algorithm for sparse geometry (SPARSE) runs at a speed similar to the regular GPU algorithm for dense geometry (DENSE) when the fluid volume fraction (q) is close to 1, and is up to 2.2 times faster than DENSE when q is in the range 0.19∼0.27. The memory saving ratio also depends on q. For a numerical case in the hepatic-portal venous system, i.e., a large-scale system, the parallel computation converges in around half an hour with SPARSE, whereas DENSE exceeds the memory limit of a single GPU. Hence, a multi-GPU implementation is applied to lift this limitation, and performance improves as the number of GPU cards increases. In summary, the method presented in the paper can feasibly be applied to the hepatic-portal venous system, laying the foundation for the development of the new technique.
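The abstract's statement that the memory saving depends on q can be illustrated with a rough estimate. The sketch below is not the paper's storage scheme: it assumes a hypothetical layout in which DENSE stores two copies of the distribution functions for every cell in the bounding box, while SPARSE stores them only for fluid cells plus a per-direction neighbor index table. The 19 directions, 4-byte values, 4-byte indices, and all function names are assumptions chosen for illustration.

```python
def build_compact_index(mask):
    """Map each cell to a compact fluid-node id, or -1 for non-fluid cells."""
    index, next_id = [], 0
    for is_fluid in mask:
        if is_fluid:
            index.append(next_id)
            next_id += 1
        else:
            index.append(-1)
    return index


def dense_bytes(n_cells, n_dirs=19, bytes_per_value=4, copies=2):
    """Assumed DENSE cost: distributions stored for every cell in the box."""
    return n_cells * n_dirs * bytes_per_value * copies


def sparse_bytes(n_fluid, n_dirs=19, bytes_per_value=4, copies=2,
                 bytes_per_index=4):
    """Assumed SPARSE cost: distributions for fluid cells plus neighbor table."""
    distributions = n_fluid * n_dirs * bytes_per_value * copies
    neighbor_table = n_fluid * n_dirs * bytes_per_index
    return distributions + neighbor_table


def memory_saving_ratio(mask):
    """Fraction of DENSE memory saved by SPARSE (negative if SPARSE costs more)."""
    cells = list(mask)
    n_fluid = sum(1 for is_fluid in cells if is_fluid)
    return 1.0 - sparse_bytes(n_fluid) / dense_bytes(len(cells))


if __name__ == "__main__":
    # Hypothetical geometry: 1000 cells, 200 of them fluid (q = 0.2).
    mask = [True] * 200 + [False] * 800
    print(f"saving ratio at q = 0.2: {memory_saving_ratio(mask):.2f}")
```

Under these assumed sizes the saving ratio works out to 1 - 1.5q, so the sparse layout only pays off below q = 2/3; a different index layout or precision would shift that break-even point.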

We present Sailfish, an open source fluid simulation package implementing the lattice Boltzmann method (LBM) on modern Graphics Processing Units (GPUs) using CUDA/OpenCL. We take a novel approach to GPU code implementation and use run-time code generation techniques and a high level programming language (Python) to achieve state of the art performance, while allowing easy experimentation with different LBM models and tuning for various types of hardware. We discuss the general design principles of the code, scaling to multiple GPUs in a distributed environment, as well as the GPU implementation and optimization of many different LBM models, both single component (BGK, MRT, ELBM) and multicomponent (Shan–Chen, free energy). The paper also presents results of performance benchmarks spanning the last three NVIDIA GPU generations (Tesla, Fermi, Kepler), which we hope will be useful for researchers working with this type of hardware and similar codes.

Program Summary

Program title: Sailfish
Catalogue identifier: AETA_v1_0
Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AETA_v1_0.html
Program obtainable from: CPC Program Library, Queen’s University, Belfast, N. Ireland
Licensing provisions: GNU Lesser General Public License, version 3
No. of lines in distributed program, including test data, etc.: 225864
No. of bytes in distributed program, including test data, etc.: 46861049
Distribution format: tar.gz
Programming language: Python, CUDA C, OpenCL.
Computer: Any with an OpenCL or CUDA-compliant GPU.
Operating system: No limits (tested on Linux and Mac OS X).
RAM: Hundreds of megabytes to tens of gigabytes for typical cases.
Classification: 12, 6.5.
External routines: PyCUDA/PyOpenCL, Numpy, Mako, ZeroMQ (for multi-GPU simulations), scipy, sympy
Nature of problem: GPU-accelerated simulation of single- and multi-component fluid flows.
Solution method: A wide range of relaxation models (LBGK, MRT, regularized LB, ELBM, Shan–Chen, free energy, free surface) and boundary conditions within the lattice Boltzmann method framework. Simulations can be run in single or double precision using one or more GPUs.
Restrictions: The lattice Boltzmann method works for low Mach number flows only.
Unusual features: The actual numerical calculations run exclusively on GPUs. The numerical code is built dynamically at run-time in CUDA C or OpenCL, using templates and symbolic formulas. The high-level control of the simulation is maintained by a Python process.
Additional comments: The distribution file for this program is over 45 Mbytes and therefore is not delivered directly when Download or Email is requested. Instead, an HTML file giving details of how the program can be obtained is sent.
Running time: Problem-dependent, typically minutes (for small cases or short simulations) to hours (large cases or long simulations).
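The idea of building numerical code at run-time from templates can be sketched in a few lines of Python. This is an illustration only and not Sailfish code: it uses the standard library's `string.Template` as a stand-in for a real template engine, and the kernel name `collide_bgk`, the function `generate_bgk_kernel`, and the memory layout are hypothetical. It only produces CUDA C source text for a BGK relaxation step, with the number of lattice directions, the relaxation time, and the precision fixed at generation time; compiling and launching the kernel is left out.

```python
from string import Template

# Hypothetical kernel skeleton; $real and $body are filled in at run-time.
KERNEL_TEMPLATE = Template("""\
__global__ void collide_bgk($real *f, const $real *feq, int n_nodes)
{
    int node = blockIdx.x * blockDim.x + threadIdx.x;
    if (node >= n_nodes) return;
$body
}
""")


def generate_bgk_kernel(n_dirs, tau, double_precision=False):
    """Return CUDA C source for a BGK relaxation specialized to the given model.

    n_dirs: number of lattice directions (e.g. 9 for D2Q9, 19 for D3Q19).
    tau: relaxation time, baked into the source as the constant 1/tau.
    """
    real = "double" if double_precision else "float"
    suffix = "" if double_precision else "f"
    omega = 1.0 / tau
    lines = []
    for i in range(n_dirs):
        # Assumed layout: direction-major arrays of length n_dirs * n_nodes.
        idx = f"{i} * n_nodes + node"
        lines.append(
            f"    f[{idx}] -= {omega!r}{suffix} * (f[{idx}] - feq[{idx}]);"
        )
    return KERNEL_TEMPLATE.substitute(real=real, body="\n".join(lines))


if __name__ == "__main__":
    print(generate_bgk_kernel(n_dirs=9, tau=0.8))
```

Because the loop over directions is unrolled and the constants are written into the source, switching lattice or precision means regenerating the string rather than editing the kernel by hand, which is the kind of flexibility the abstract attributes to run-time code generation.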

Articles published on Graphics Processing Units Optimization

GPU optimization techniques to accelerate optiGAN—a particle simulation GAN

GPU accelerated volumetric lattice Boltzmann model for image-based hemodynamics in portal hypertension

Mining Cryptocurrency-Based Security Using Renewable Energy as Source

GPU-DAEMON: GPU algorithm design, data management & optimization template for array based big omics data

Computational Benefit of GPU Optimization for the Atmospheric Chemistry Modeling

GPU Accelerated Multilevel Lagrangian Carotid Strain Imaging.

Caffe CNN-based classification of hyperspectral images on GPU

A Fast Discrete Wavelet Transform Using Hybrid Parallelism on GPUs

Solving optimization problems using a hybrid systolic search on GPU plus CPU

High performance GPU based optimized feature matching for computer vision applications

Parallel Implementation of Sparse Representation Classifiers for Hyperspectral Imagery on GPUs

GPU Optimized Stereo Image Matching Technique for Computer Vision Applications

Efficient Parallel GPU Design on WRF Five-Layer Thermal Diffusion Scheme

GPU parallelization of unstructured/hybrid grid ALE multigrid unsteady solver for moving body problems

Sailfish: A flexible multi-GPU implementation of the lattice Boltzmann method

Multi-dimensional characterization of electrostatic surface potential computation on graphics processors.

Robust Parallel Preconditioned Power Grid Simulation on GPU With Adaptive Runtime Performance Modeling and Optimization

Fast convolution‐superposition dose calculation on graphics hardware
