The characterization of porous media via digital testing usually relies on intensive numerical computations that can be parallelized on GPUs. For absolute permeability estimation, Stokes flow simulations are carried out on the micro-structure to recover velocity fields, which are then upscaled with Darcy's law. Digital models of samples can be obtained via micro-computed tomography (μCT) scans. As μCT data is three-dimensional, meshes grow cubically with image dimensions, causing the numerical problem at hand to become compute- and memory-bound as either resolution improves or larger fields-of-view are considered. While the usual focus is on accelerating solvers, memory usage remains a significant limitation for analyses of representative volumes on relatively accessible hardware. In this work, we explore GPU implementations of MINRES solvers that favor a reduction in memory allocation. These solvers are applied to matrix-free FEM-based permeability characterization of μCT images. Our goal is to enable the study of 1000³-voxel images on single-GPU machines. Implementations that require only five, three, or two n-sized vectors of variables are presented, with n being the number of unknowns. Further, we employ a mesh numbering strategy that enables node-by-node massively parallel operations within a non-monolithic voxel-based pore space without storing connectivity tables. The proposed solvers, available through the open-source chfem software, are verified against analytical models for simple three-dimensional micro-structures, then validated against numerical Digital Petrophysics benchmarks. A consumer-grade graphics card with 12 GB of RAM is employed for the characterization of images with up to roughly 540 million voxels in a matter of tens of minutes. Stokes flow FEM-based simulations on meshes with 449 million degrees-of-freedom (DOFs) are carried out in 9 to 15 min, allocating less than 10 GB in global memory. Finally, simulations on three 1000³ carbon fiber domains, amounting to more than 3.7 billion DOFs, were run on a high-end GPU with 80 GB of RAM in under 2.5 h, achieving very close agreement with flow-tube permeability experiments.
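To make the upscaling step concrete, the standard pairing of the pore-scale Stokes problem with Darcy's law reads as follows. This is a generic sketch of the relation named in the abstract; the paper's exact boundary conditions and normalization may differ:

```latex
% Pore-scale Stokes flow in the fluid domain \Omega_f, followed by
% Darcy upscaling: each simulation with a prescribed macroscopic
% pressure gradient along axis j recovers one column of the
% permeability tensor K.
\begin{align*}
  \mu \nabla^{2}\mathbf{u} - \nabla p &= \mathbf{0}, &
  \nabla \cdot \mathbf{u} &= 0 \quad \text{in } \Omega_f, \\
  \langle u_i \rangle &= -\frac{K_{ij}}{\mu}\,\partial_j P
  && \Rightarrow\quad
  K_{ij} = -\,\mu\,\frac{\langle u_i \rangle}{\partial_j P}.
\end{align*}
```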
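The connectivity-free numbering strategy can also be illustrated with a minimal CUDA sketch: on a structured voxel grid, a node's neighbors follow from index arithmetic alone, so no connectivity table needs to be stored. The kernel below uses a 7-point Laplacian with periodic wrap-around as a stand-in for the actual matrix-free FEM Stokes operator; all names are illustrative assumptions, not chfem's API.

```cuda
// Maps (i, j, k) grid coordinates to a linear node number on an
// nx * ny * nz voxel grid. 64-bit indices cover meshes beyond 2^31 DOFs.
__device__ __forceinline__ unsigned long long idx3(int i, int j, int k,
                                                   int nx, int ny)
{
    return (unsigned long long)i
         + (unsigned long long)nx * (j + (unsigned long long)ny * k);
}

// Node-by-node matrix-free operator application: each thread handles
// one node and computes its neighbors on the fly.
__global__ void applyOperator(const double *x, double *y,
                              int nx, int ny, int nz)
{
    unsigned long long n =
        blockIdx.x * (unsigned long long)blockDim.x + threadIdx.x;
    if (n >= (unsigned long long)nx * ny * nz) return;

    // Recover (i, j, k) grid coordinates from the linear node number.
    int i = (int)(n % nx);
    int j = (int)((n / nx) % ny);
    int k = (int)(n / ((unsigned long long)nx * ny));

    // Periodic wrap-around: neighbor indices come from arithmetic,
    // not from a stored connectivity table.
    int ip = (i + 1) % nx, im = (i + nx - 1) % nx;
    int jp = (j + 1) % ny, jm = (j + ny - 1) % ny;
    int kp = (k + 1) % nz, km = (k + nz - 1) % nz;

    // 7-point stencil as a stand-in for the FEM stiffness action.
    y[n] = 6.0 * x[n]
         - x[idx3(ip, j, k, nx, ny)] - x[idx3(im, j, k, nx, ny)]
         - x[idx3(i, jp, k, nx, ny)] - x[idx3(i, jm, k, nx, ny)]
         - x[idx3(i, j, kp, nx, ny)] - x[idx3(i, j, km, nx, ny)];
}
```

In the non-monolithic pore-space setting described in the abstract, solid voxels would additionally be masked out of this index map; the sketch omits that step for brevity.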