Abstract

Most known methods for determining the structure of macromolecular complexes are limited at some point by their computational demands. Recent developments in information technology, such as multi-core, parallel and GPU processing, can be used to overcome these limitations. In particular, graphics processing units (GPUs), which were originally developed for rendering real-time effects in computer games, are now ubiquitous and provide unprecedented computational power for scientific applications. Each parallel-processing paradigm alone can improve overall performance; combining all paradigms unleashes the full power of today's technology and makes certain applications feasible that were previously virtually impossible. In this article, state-of-the-art parallel-processing paradigms are introduced, the tools and infrastructure needed to apply them are presented, and a solution strategy for moving scientific applications to the next generation of computer hardware is outlined.

Highlights

  • In contrast to even the most sophisticated central processing unit (CPU) threading models, compute unified device architecture (CUDA) threads are extremely lightweight. This property means that the overhead for the creation of threads and for context switching between threads is reduced to virtually one graphics processing unit (GPU) clock cycle, in comparison to several hundred CPU clock cycles for CPU threads such as Boost threads or POSIX threads (pthreads) (Nickolls et al., 2008)

  • For GPU computing, we focused on NVIDIA as a hardware manufacturer and NVIDIA CUDA as a programming language

  • Commodity graphics hardware is evolving to become a new generation of massively parallel streaming processors for general-purpose computing


Summary

Need for speed

Owing to the ever-increasing speed of data collection in science, computational performance plays a central role in various disciplines in biology and physics. Computational demands in the field of structural biology are especially high for high-resolution structure determination by single-particle electron cryomicroscopy (cryo-EM), because an ever-larger number of images is currently being used to overcome the resolution limits of this technique. Some linear speed increase in central processing unit (CPU) technology can certainly be expected in the future, although most of today's speed increase already comes from multi-core CPU architectures. Certain applications, such as the alignment of large numbers of single-particle cryo-EM images, will require significantly more computational power than improvements in CPU technology alone can offer. In some areas, future applications will only be possible if computational power can be increased by at least two orders of magnitude. Such an increase is essential to keep up with modern scientific technologies.

How can computational speed be improved?
Divide et impera
Hardware architectures and their implications for parallel processing
If you cannot have a faster CPU, then use more of them
Software standards for shared-memory and distributed computing
Software standards for GPU programming
Historical limitations alleged by Amdahl’s law
Transition to parallel computing
Data-parallel programming
The SmartTray
Getting started with data-parallel processing
Discussion and outlook