Sunway TaihuLight Research Articles

Biological database search plays a very important role in computational biology. Since the COVID-19 outbreak, it has provided significant help in identifying common characteristics of viruses and in developing vaccines and drugs. Sequence alignment, a method for finding similarity, homology, and other information between gene or protein sequences, is the usual tool for database search. With the explosive growth of biological databases, the search process has become extremely time-consuming. However, existing parallel sequence alignment algorithms cannot deliver efficient database search because of low utilization of resources such as cache memory and performance issues such as load imbalance and high communication overhead. In this paper, we propose ESA, an efficient sequence alignment algorithm on Sunway TaihuLight for biological database search. ESA adopts a novel hybrid alignment algorithm combining local and global alignments, which has higher accuracy than other sequence alignment algorithms. ESA also includes several optimizations: cache-aware sequence alignment, capacity-aware load balancing, and bandwidth-aware data transfer. They are implemented on the heterogeneous SW26010 processor used in Sunway TaihuLight, the world’s 6th fastest supercomputer. The implementation of ESA is evaluated with the Swiss-Prot database on Sunway TaihuLight and other platforms. Our experimental results show that ESA achieves a speedup of 34.5 on a single core group (with 65 cores) of Sunway TaihuLight. The strong and weak scalability of ESA is tested with 1 to 1024 core groups of Sunway TaihuLight. The results show that ESA has linear weak scalability and very impressive strong scalability. For strong scalability, ESA achieves a speedup of 338.04 with 1024 core groups compared with a single core group. We also show that our proposed optimizations are applicable to GPUs, Intel multicore processors, and heterogeneous computing platforms.
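To make the idea of alignment-based database search concrete, here is a minimal sketch of classic Smith-Waterman local alignment scoring used to rank database sequences against a query. It is not ESA's hybrid local/global algorithm, which the abstract does not specify. The function names, the scoring values (`match`, `mismatch`, `gap`), and the toy database are assumptions chosen only for illustration.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best Smith-Waterman local alignment score with a linear gap penalty.

    The scoring values are illustrative assumptions, not taken from ESA.
    Only two rows of the dynamic-programming table are kept in memory.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for ca in a:
        curr = [0]  # column 0 is always 0 for local alignment
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            # A local alignment may restart anywhere, so scores never go below 0.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def search(query, database, top=3):
    """Rank database entries (name -> sequence) by local alignment score."""
    scored = [(local_alignment_score(query, seq), name)
              for name, seq in database.items()]
    return sorted(scored, reverse=True)[:top]


if __name__ == "__main__":
    # Hypothetical toy database; a real search would read Swiss-Prot entries.
    toy_db = {"p1": "MKTAYIAK", "p2": "GGGG", "p3": "MKTA"}
    print(search("MKTA", toy_db))
```

Each query-to-entry comparison is independent of the others, which is why database search of this kind parallelizes across cores and why load balancing between sequences of different lengths matters.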


Molecular docking is the process of posing, scoring, and ranking small molecules at the binding sites of proteins to prioritize compounds for experimental testing. It is a widely used computational method in the drug discovery process. However, it is a highly time-consuming procedure, since favorable ligand orientations for a receptor may have to be sought among billions of ligands. UCSF DOCK3.7 is one of the most widely used molecular docking applications. In this paper, we port and optimize UCSF DOCK3.7 on the Sunway TaihuLight supercomputer. To avoid the impact of load imbalance, we employ a producer-consumer strategy that overlaps I/O and computation to achieve high performance. Furthermore, we present a new binary file format to replace the mol2db2 file format for ligand storage and adopt xzip rather than gzip to compress ligand files. We show that our file format reduces I/O time significantly, while xzip saves significant storage. For the routines that determine the orientation of a ligand relative to the receptor, we present an improved algorithm to discard geometrically similar orientations. We also fuse loops and compress memory usage so that data can be stored in fast Local Device Memory (LDM), which lets ligand orientations be scored with high efficiency. In addition, we propose a number of architecture-specific optimizations: asynchronous data transfer and vectorization of computation are implemented to take full advantage of the SW26010 processor. Our experiments show that a speedup of 167 can be achieved by using the proposed strategies. Compared to a core of an Intel(R) Core(TM) i9-10900K CPU, our approach achieves a speedup of 15 on a SW26010 core group. Furthermore, our implementation achieves strong scalability to hundreds of thousands of heterogeneous cores on the next-generation Sunway supercomputer.
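The producer-consumer strategy mentioned above can be sketched generically as follows. This is a minimal illustration using Python threads and a bounded queue, not the DOCK3.7 port itself. The names `run_pipeline`, `load`, and `score`, and the `workers` and `depth` parameters, are hypothetical. One producer thread performs the I/O step while several consumers score whatever is already loaded, so I/O and computation overlap and faster workers naturally pick up more batches.

```python
import queue
import threading


def run_pipeline(ligand_batches, load, score, workers=4, depth=8):
    """Overlap loading (producer) with scoring (consumers).

    `load` and `score` are caller-supplied callables (hypothetical here);
    `load` must not return None, which is reserved as the stop sentinel.
    """
    tasks = queue.Queue(maxsize=depth)  # bounded so the producer cannot run far ahead
    results = []
    lock = threading.Lock()

    def producer():
        for batch in ligand_batches:
            tasks.put(load(batch))      # I/O step; blocks while the queue is full
        for _ in range(workers):
            tasks.put(None)             # one stop sentinel per consumer

    def consumer():
        while True:
            item = tasks.get()
            if item is None:
                break
            value = score(item)         # compute step
            with lock:
                results.append(value)

    threads = [threading.Thread(target=producer)]
    threads += [threading.Thread(target=consumer) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    # Toy stand-ins: "loading" builds a list, "scoring" sums it.
    out = run_pipeline(range(10), load=lambda i: [i] * 3, score=sum)
    print(sorted(out))
```

Because consumers pull work from a shared queue instead of receiving a fixed share up front, a slow batch delays only the worker that holds it. Results arrive in completion order, so the caller sorts or tags them if order matters.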


Articles published on Sunway TaihuLight

Implementation and optimisation of the cdugksFoam solver on the Sunway TaihuLight supercomputer

Massively parallel simulations of multi-stage compressors on Sunway TaihuLight

ESA: An efficient sequence alignment algorithm for biological database search on Sunway TaihuLight

Direct simulation of flow field around SUBOFF in grid-generated turbulence with SWLBM

Bio-ESMD: A Data Centric Implementation for Large-Scale Biological System Simulation on Sunway TaihuLight Supercomputer

Large-Scale Simulation of Full Three-Dimensional Flow and Combustion of an Aero-Turbofan Engine on Sunway TaihuLight Supercomputer

AGCM-3DLF: Accelerating Atmospheric General Circulation Model via 3-D Parallelization and Leap-Format

SwPHoToNs: Toward trillion-body-scale cosmological N-body simulations on Sunway TaihuLight supercomputer

End-to-end I/O Monitoring on Leading Supercomputers

Redesigning and Optimizing UCSF DOCK3.7 on Sunway TaihuLight

SwSpAMM: optimizing large-scale sparse approximate matrix multiplication on Sunway Taihulight

Benchmarking 50-Photon Gaussian Boson Sampling on the Sunway TaihuLight

FgSpMSpV: A Fine-grained Parallel SpMSpV Framework on HPC Platforms

Optimization of Reactive Force Field Simulation: Refactor, Parallelization, and Vectorization for Interactions

OpenACC + Athread collaborative optimization of Silicon-Crystal application on Sunway TaihuLight

The Exascale Era is Upon Us: The Frontier supercomputer may be the first to reach 1,000,000,000,000,000,000 operations per second

What Factors Affect the Performance of Software after Migration: A Case Study on Sunway TaihuLight Supercomputer

Parallel finite volume simulation of the spherical shell dynamo with pseudo-vacuum magnetic boundary conditions

FMapper: Scalable read mapper based on succinct hash index on SunWay TaihuLight

Design and Optimization of Parallel Algorithm for Kalman Filter on SW26010 Many-Core Processors
