Pawar Shrikant, Zhu Ying, Stanam Aditya

doi:10.5897/jbsa2018.0109

Abstract

Bioinformatics is an emerging field in which information technology can significantly accelerate life science research. Because the field is relatively new, the scope for exploring new tools and techniques is immense. One major area where bioinformatics plays an important role is next-generation sequencing (NGS) analysis, in which an unknown genome is shattered into pieces that are then aligned to a known reference genome to decipher its functions through sequence comparison. The first well-known application of this technology is the Human Genome Project, which took nearly 10 years to finish. With advancements in central processing units (CPUs), alignment time has improved but is still not optimal. The constant need to reduce computing time has created scope for using graphics processing units (GPUs) and parallel programming in place of CPUs. With access to high-performance multi-threaded, multi-core parallel supercomputers, several GPU-based sequence alignment tools have been published recently. Some of the major tools are BarraCUDA, CUSHAW, GPU-BWT, SOAP3, and SARUMAN, which claim to speed up alignment by 2x to 10x. Most of these tools can be compiled with GCC 4.3 and CUDA. This paper focuses on compiling the current GPU-based alignment tools and running them on 70.7 million read pairs (Illumina HiSeq 2000), aligning the reads to a human genome and comparing their efficiency (time sensitivity and alignment specificity) with that of a traditional CPU-based alignment tool (Bowtie). The resulting observations should help researchers choose the GPU alignment tool that suits their computing needs.

Key words: CUDA, sequencing, alignment, graphics processing units (GPUs), central processing units (CPUs).

Full Text