Multiple Graphics Processing Units Research Articles

Continuous advancements in scientific and engineering understanding of earthquake phenomena, combined with the associated development of representative physics-based models, is providing a foundation for high-performance, fault-to-structure earthquake simulations. However, regional-scale applications of high-performance models have been challenged by the computational requirements at the resolutions required for engineering risk assessments. The EarthQuake SIMulation (EQSIM) framework, a software application development under the US Department of Energy (DOE) Exascale Computing Project, is focused on overcoming the existing computational barriers and enabling routine regional-scale simulations at resolutions relevant to a breadth of engineered systems. This multidisciplinary software development—drawing upon expertise in geophysics, engineering, applied math and computer science—is preparing the advanced computational workflow necessary to fully exploit the DOE’s exaflop computer platforms coming online in the 2023 to 2024 timeframe. Achievement of the computational performance required for high-resolution regional models containing upward of hundreds of billions to trillions of model grid points requires numerical efficiency in every phase of a regional simulation. This includes run time start-up and regional model generation, effective distribution of the computational workload across thousands of computer nodes, efficient coupling of regional geophysics and local engineering models, and application-tailored highly efficient transfer, storage, and interrogation of very large volumes of simulation data. This article summarizes the most recent advancements and refinements incorporated in the workflow design for the EQSIM integrated fault-to-structure framework, which are based on extensive numerical testing across multiple graphics processing unit (GPU)-accelerated platforms, and demonstrates the computational performance achieved on the world’s first exaflop computer platform through representative regional-scale earthquake simulations for the San Francisco Bay Area in California, USA.

Context. Autonomous vehicles are becoming increasingly popular, and one of the important modern challenges in their development is ensuring their effective navigation in space and movement within designated lanes. This paper examines a method of spatial orientation for vehicles using computer vision and artificial neural networks. The research focused on the navigation system of an autonomous vehicle, which incorporates the use of modern distributed and parallel computing technologies. Objective. The aim of this work is to enhance modern autonomous vehicle navigation algorithms through parallel training of artificial neural networks and to determine the optimal combination of technologies and nodes of devices to increase speed and enable real-time decision-making capabilities in spatial navigation for autonomous vehicles. Method. The research establishes that the utilization of computer vision and neural networks for road lane segmentation proves to be an effective method for spatial orientation of autonomous vehicles. For multi-core computing systems, the application of parallel programming technology, OpenMP, for neural network training on processors with varying numbers of parallel threads increases the algorithm’s execution speed. However, the use of CUDA technology for neural network training on a graphics processing unit significantly enhances prediction speeds compared to OpenMP. Additionally, the feasibility of employing PyTorch Distributed Data Parallel (DDP) technology for training the neural network across multiple graphics processing units (nodes) simultaneously was explored. This approach further improved prediction execution times compared to using a single graphics processing unit. Results. An algorithm for training and prediction of an artificial neural network was developed using two independent nodes, each equipped with separate graphics processing units, and their synchronization for exchanging training results after each epoch, employing PyTorch Distributed Data Parallel (DDP) technology. This approach allows for scalable computations across a higher number of resources, significantly expediting the model training process. Conclusions. The conducted experiments have affirmed the effectiveness of the proposed algorithm, warranting the recommendation of this research for further advancement in autonomous vehicles and enhancement of their navigational capabilities. Notably, the research outcomes can find applications in various domains, encompassing automotive manufacturing, logistics, and urban transportation infrastructure. The obtained results are expected to assist future researchers in understanding the most efficient hardware and software resources to employ for implementing AI-based navigation systems in autonomous vehicles. Prospects for future investigations may encompass refining the accuracy of the proposed parallel algorithm without compromising its efficiency metrics. Furthermore, there is potential for experimental exploration of the proposed algorithm in more intricate practical scenarios of diverse nature and dimensions.

Multiple Graphics Processing Units Research Articles

Related Topics

Articles published on Multiple Graphics Processing Units

Multi-GPU RI-HF Energies and Analytic Gradients─Toward High-Throughput Ab Initio Molecular Dynamics.

Multi-GPU 3D k-nearest neighbors computation with application to ICP, point cloud smoothing and normals computation

Mother-leaf-method accelerated parallel-GPU AMR phase-field simulations of dendrite growth

Regional-scale fault-to-structure earthquake simulations with the EQSIM framework: Workflow maturation and computational performance on GPU-accelerated exascale platforms

Multi-phase-field lattice Boltzmann simulations of semi-solid simple shear deformation in thin film

Localization of overdamped bosonic modes and transport in strange metals

Iterative Reconstruction of Micro Computed Tomography Scans Using Multiple Heterogeneous GPUs.

High-Performance Multi-GPU Analytic RI-MP2 Energy Gradients.

Multi-Device Parallel MRI Reconstruction: Efficient Partitioning for Undersampled 5D Cardiac CINE.

PARALLEL AND DISTRIBUTED COMPUTING TECHNOLOGIES FOR AUTONOMOUS VEHICLE NAVIGATION

GPU-based transient analysis of modern grids deploying a hybrid DDM algorithm

Accelerating micromagnetic and atomistic simulations using multiple GPUs

H-Analysis and data-parallel physics-informed neural networks

Vectorized Matrix Formulation for Sound Power Determination Through Inverse Problems and Regularization Techniques

The implementation of the three-dimensional unified gas-kinetic wave-particle method on multiple graphics processing units

A Multi-GPU Aggregation-Based AMG Preconditioner for Iterative Linear Solvers

Adopting GPU computing to support DL-based Earth science applications

Accelerating 4D image reconstruction for magnetic resonance-guided radiotherapy.

Phase-field lattice Boltzmann simulation of three-dimensional settling dendrite with natural convection during nonisothermal solidification of binary alloy

RTGPU: Real-Time GPU Scheduling of Hard Deadline Parallel Tasks With Fine-Grain Utilization

Lead the way for us

Editage

Paperpal

R Discovery

Mind the Graph

Multiple Graphics Processing Units Research Articles

Related Topics

Articles published on Multiple Graphics Processing Units

Multi-GPU RI-HF Energies and Analytic Gradients─Toward High-Throughput Ab Initio Molecular Dynamics.

Multi-GPU 3D k-nearest neighbors computation with application to ICP, point cloud smoothing and normals computation

Mother-leaf-method accelerated parallel-GPU AMR phase-field simulations of dendrite growth

Regional-scale fault-to-structure earthquake simulations with the EQSIM framework: Workflow maturation and computational performance on GPU-accelerated exascale platforms

Multi-phase-field lattice Boltzmann simulations of semi-solid simple shear deformation in thin film

Localization of overdamped bosonic modes and transport in strange metals

Iterative Reconstruction of Micro Computed Tomography Scans Using Multiple Heterogeneous GPUs.

High-Performance Multi-GPU Analytic RI-MP2 Energy Gradients.

Multi-Device Parallel MRI Reconstruction: Efficient Partitioning for Undersampled 5D Cardiac CINE.

PARALLEL AND DISTRIBUTED COMPUTING TECHNOLOGIES FOR AUTONOMOUS VEHICLE NAVIGATION

GPU-based transient analysis of modern grids deploying a hybrid DDM algorithm

Accelerating micromagnetic and atomistic simulations using multiple GPUs

H-Analysis and data-parallel physics-informed neural networks

Vectorized Matrix Formulation for Sound Power Determination Through Inverse Problems and Regularization Techniques

The implementation of the three-dimensional unified gas-kinetic wave-particle method on multiple graphics processing units

A Multi-GPU Aggregation-Based AMG Preconditioner for Iterative Linear Solvers

Adopting GPU computing to support DL-based Earth science applications

Accelerating 4D image reconstruction for magnetic resonance-guided radiotherapy.

Phase-field lattice Boltzmann simulation of three-dimensional settling dendrite with natural convection during nonisothermal solidification of binary alloy

RTGPU: Real-Time GPU Scheduling of Hard Deadline Parallel Tasks With Fine-Grain Utilization