Abstract

Due to exponential growth in the size of genomic databases, traditional sequence search techniques have proved to be slow. To address this problem, an open-source, parallel version of BLAST called mpiBLAST was developed. In mpiBLAST, the master process distributes database fragments among worker nodes so that the sequence search is computed in parallel. Because the master process merges and writes the results sequentially, it becomes a performance bottleneck as the number of processors increases and the database size varies. mpiBLAST-PIO was introduced to handle this high non-search overhead. This paper describes mpiBLAST-PIO, an optimized and extended version of mpiBLAST. The goal of this research was to investigate the performance of a parallel implementation of BLAST in comparison to sequential NCBI-BLAST by measuring speedup and efficiency on an HPC platform using InfiniBand. Different options of mpiBLAST-PIO were activated, which helped in identifying the optimal parameters for achieving a highly scalable parallel BLAST implementation. The results show that parallel writing of the results can be an efficient solution when a high-performance parallel file system is available.
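
The abstract does not say how speedup and efficiency are computed. A minimal sketch, assuming the conventional definitions (speedup as sequential run time divided by parallel run time, efficiency as speedup divided by the number of processes), could look like the following. The function names and the timings in the usage note are hypothetical illustrations, not measurements from the paper.

```python
def speedup(t_sequential: float, t_parallel: float) -> float:
    """Conventional speedup: sequential wall-clock time over parallel wall-clock time."""
    if t_sequential <= 0 or t_parallel <= 0:
        raise ValueError("run times must be positive")
    return t_sequential / t_parallel


def efficiency(t_sequential: float, t_parallel: float, n_procs: int) -> float:
    """Conventional parallel efficiency: speedup divided by the process count."""
    if n_procs < 1:
        raise ValueError("n_procs must be at least 1")
    return speedup(t_sequential, t_parallel) / n_procs


if __name__ == "__main__":
    # Hypothetical timings in seconds, chosen only to illustrate the arithmetic.
    t_seq, t_par, procs = 120.0, 15.0, 16
    print(f"speedup    = {speedup(t_seq, t_par):.2f}")
    print(f"efficiency = {efficiency(t_seq, t_par, procs):.2f}")
```

Under these definitions, an efficiency well below 1 at higher process counts would be consistent with the non-search overhead (sequential result merging and writing) that the abstract describes.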
