High Performance Computing Platforms Research Articles

We propose a new data partitioning approach to improve the performance of heterogeneous parallel applications on modern high-performance computing (HPC) systems. Existing approaches overlook an aspect that critically affects the performance of parallel applications: how partitions are assigned to processors so as to minimize communication cost and hence data movement, which dominates energy and performance cost. Managing data locality in this way matters for a wide range of applications, so we propose a distribution method that takes it into account. Our algorithm seeks to minimize execution time by using two models. The first is a fine-grained computational model of heterogeneous processors, accurate enough to guarantee efficient partitioning results that maximize utilization. The second is a communication model of heterogeneous processors, used to minimize data movement and hide communication overheads. The correctness of the algorithm was analyzed and validated. Its complexity is approximately O(p × log s + p × s^2), where p is the number of heterogeneous processors and s is problem size/steps (steps being the step size between data points in the computational model of each processor). The experiments were performed on the AZIZ supercomputer using two types of applications: one with no dependency between its partitions (matrix multiplication) and one with high dependency between its partitions (the Jacobi method). The results show that the algorithm improves performance.
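For orientation, the simplest form of heterogeneity-aware partitioning gives each processor a share of the data proportional to its speed. The sketch below shows only that baseline, assuming each processor is described by a single constant speed; it is not the algorithm from the abstract, which uses a fine-grained computational model and a communication model that are not reproduced here. The function name `partition_proportional` and the speed values are hypothetical.

```python
from __future__ import annotations


def partition_proportional(n: int, speeds: list[float]) -> list[int]:
    """Split n work items across processors in proportion to their speeds.

    Assumption (not from the abstract): each processor has one constant,
    positive relative speed. Communication cost is ignored entirely.
    """
    if n < 0 or not speeds or any(s <= 0 for s in speeds):
        raise ValueError("n must be >= 0 and all speeds must be positive")

    total = sum(speeds)
    # Ideal (fractional) share of each processor.
    shares = [n * s / total for s in speeds]
    # Start from the rounded-down shares.
    counts = [int(x) for x in shares]
    # Hand out the remaining items to the largest fractional remainders.
    leftover = max(0, n - sum(counts))
    order = sorted(
        range(len(speeds)),
        key=lambda i: shares[i] - counts[i],
        reverse=True,
    )
    for i in order[:leftover]:
        counts[i] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical example: one processor three times faster than the other.
    print(partition_proportional(100, [1.0, 3.0]))
```

With equal per-item cost, this equalizes estimated compute time across processors; a locality-aware method such as the one described above would additionally decide which partition goes to which processor.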


Structurally similar analogues of given query compounds can be rapidly retrieved from chemical databases by molecular similarity search approaches. However, the computational cost of an exhaustive similarity search over a large compound database is high. Although the latest indexing algorithms can greatly speed up the search process, they are not readily applicable to molecular similarity search problems because they lack an implementation of the Tanimoto similarity metric. In this paper, we first implement Python or C++ code to enable Tanimoto similarity search via several recent indexing algorithms, such as Hnsw and Onng. Moreover, there is increasing interest in computational communities in developing robust benchmarking systems to assess the performance of various computational algorithms. Here, we provide a benchmark to evaluate the molecular similarity searching performance of these recent indexing algorithms. To avoid potential package dependency issues, two separate benchmarks are built on currently popular container technologies, Docker and Singularity. Singularity is a rather new container framework specifically designed for high-performance computing (HPC) platforms; it needs neither privileged permissions nor a separate daemon process. Both benchmarking methods are extensible to incorporate other new indexing algorithms, benchmarking data sets, and customized parameter settings. Our results demonstrate that graph-based methods, such as Hnsw and Onng, consistently achieve the best trade-off between searching effectiveness and searching efficiency. The source code of the entire benchmark system can be downloaded from https://github.uconn.edu/mldrugdiscovery/MssBenchmark.
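To make the metric concrete, the Tanimoto similarity of two binary fingerprints is the number of bits set in both divided by the number of bits set in either. The sketch below is a minimal illustration of that formula and of the exhaustive search that indexing algorithms aim to avoid; it is not code from the benchmark repository. The function names, the set-of-bit-indices representation, and the toy fingerprints are all assumptions made for the example.

```python
from __future__ import annotations


def tanimoto(a: set[int], b: set[int]) -> float:
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        # Convention assumed here: two empty fingerprints count as identical.
        return 1.0
    common = len(a & b)
    return common / (len(a) + len(b) - common)


def exhaustive_top_k(
    query: set[int], database: dict[str, set[int]], k: int
) -> list[tuple[float, str]]:
    """Brute-force search: score every compound, return the k most similar."""
    scored = [(tanimoto(query, fp), name) for name, fp in database.items()]
    # Highest similarity first; ties broken by name for a stable order.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return scored[:k]


if __name__ == "__main__":
    # Toy fingerprints (hypothetical data, not real molecules).
    db = {
        "cmpd_a": {1, 2, 3, 4},
        "cmpd_b": {2, 3, 4, 5},
        "cmpd_c": {7, 8, 9},
    }
    for score, name in exhaustive_top_k({1, 2, 3}, db, k=2):
        print(f"{name}: {score:.2f}")
```

The brute-force loop costs one comparison per database entry for every query, which is the cost that approximate indexes trade a small loss in recall to reduce.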


Articles published on High Performance Computing Platforms

High-performance computing for SARS-CoV-2 RNAs clustering: a data science‒based genomics approach.

GShare: A centralized GPU memory management framework to enable GPU memory sharing for containers

FSEI-GPU: GPU accelerated simulations of the fluid–structure–electrophysiology interaction in the left heart

It’s Time to Talk About HPC Storage: Perspectives on the Past and Future

Energy-efficient algebra kernels in FPGA for High Performance Computing

Fifteen quick tips for success with HPC, i.e., responsibly BASHing that Linux cluster.

Extension of the NEAMS workbench to parallel sensitivity and uncertainty analysis of thermal hydraulic parameters using Dakota and Nek5000

High Performance Computing in Parallel Electromagnetics Simulation Code suite ACE3P

Dynamic Workflow Engine of Atmospheric Big Remote Sensing Data Processing Powered by Heterogenous Platform for High Performance Computing

Resilient Scheduling Heuristics for Rigid Parallel Jobs

A Proposed Data Partitioning Approach on Heterogeneous HPC Platforms: Data Locality Perspective

Rare-Earth Metal Boroxide with Formal Triple Metal–Oxygen Orbital Interaction: Synthesis from B(C6F5)3·H2O and Radical-Anion Ligated Rare-Earth Metal Amides

Many-Body Quantum Chemistry on Massively Parallel Computers.

EQSIM—A multidisciplinary framework for fault-to-structure earthquake simulations on exascale computers part I: Computational models and workflow

Calculation of Feynman loop integration and phase-space integration via auxiliary mass flow (supported in part by the National Natural Science Foundation of China (11875071, 11975029) and the High-performance Computing Platform of Peking University)

Extensible and Scalable Adaptive Sampling on Supercomputers.

Improved probabilistic I/O scheduling for limited-size Burst-Buffers deployed HPC

Benchmark on Indexing Algorithms for Accelerating Molecular Similarity Search.

Convergence of artificial intelligence and high performance computing on NSF-supported cyberinfrastructure
