High Performance Biological Pairwise Sequence Alignment: FPGA versus GPU versus Cell BE versus GPP

Khaled Benkrid,Cheng Ling,Xiang Tian,Yang Song,Ying Liu,Ali Akoglu

doi:10.1155/2012/752910

Abstract

This paper explores the pros and cons of reconfigurable computing in the form of FPGAs for high performance efficient computing. In particular, the paper presents the results of a comparative study between three different acceleration technologies, namely, Field Programmable Gate Arrays (FPGAs), Graphics Processor Units (GPUs), and IBM’s Cell Broadband Engine (Cell BE), in the design and implementation of the widely-used Smith-Waterman pairwise sequence alignment algorithm, with general purpose processors as a base reference implementation. Comparison criteria include speed, energy consumption, and purchase and development costs. The study shows that FPGAs largely outperform all other implementation platforms on performance per watt criterion and perform better than all other platforms on performance per dollar criterion, although by a much smaller margin. Cell BE and GPU come second and third, respectively, on both performance per watt and performance per dollar criteria. In general, in order to outperform other technologies on performance per dollar criterion (using currently available hardware and development tools), FPGAs need to achieve at least two orders of magnitude speed-up compared to general-purpose processors and one order of magnitude speed-up compared to domain-specific technologies such as GPUs.

Highlights

Since it was first announced in 1965, Moore’s law has stood up the test of time, providing exponential increases in computing power for science and engineering problems over time
In order to keep Moore’s law going, general-purpose processor manufacturers, for example, Intel and AMD, are relying on multicore chip technology in which multiple cores run simultaneously on the same chip at capped clock frequencies to limit power consumption. While this has the potential to provide considerable speed-up for science and engineering applications, it is creating a semantic gap between applications, traditionally written in sequential code, and hardware, as multicore technologies need to be programmed in parallel to take advantage of their performance potential. This problem is opening a window of opportunity for hitherto niche parallel computer technologies such as Field Programmable Gate Arrays (FPGAs) and Graphics Processor Units (GPUs) since the problem of parallel programming has to be tackled for general-purpose processors anyway
This paper presents a comparative study between three different acceleration technologies, namely, Field Programmable Gate Arrays (FPGAs), Graphics Processor Units (GPUs), and IBM’s Cell Broadband Engine (Cell BE), in the design and implementation of the widely-used SmithWaterman pairwise sequence alignment algorithm, with general purpose processors as a base reference implementation

Summary

Introduction

Since it was first announced in 1965, Moore’s law has stood up the test of time, providing exponential increases in computing power for science and engineering problems over time. In order to keep Moore’s law going, general-purpose processor manufacturers, for example, Intel and AMD, are relying on multicore chip technology in which multiple cores run simultaneously on the same chip at capped clock frequencies to limit power consumption. While this has the potential to provide considerable speed-up for science and engineering applications, it is creating a semantic gap between applications, traditionally written in sequential code, and hardware, as multicore technologies need to be programmed in parallel to take advantage of their performance potential. Given that a query sequence is often aligned to a whole database of sequences in order to find the closest matching sequence (see Figure 1) and given the annual increase in the size of biological databases, there is a need for a matching increase in computing power at reasonable cost [5]

Objectives

Results

Conclusion