Abstract

Sequence comparison is one of the most fundamental computational problems in bioinformatics. Pairwise sequence alignment methods align two sequences using a substitution matrix, such as BLOSUM62, that holds the score for aligning each pair of residues, and they report an alignment score for the given sequence pair. This work addresses the problem of accurately estimating the statistical significance of a pairwise alignment, for the purpose of identifying related sequences, by making the sequence comparison process more sequence-specific. Specifically, we develop algorithms for sequence-specific strategies that accelerate pairwise sequence alignment in hardware, in conjunction with statistical significance estimation. Pairwise statistical significance has been shown to give better retrieval accuracy than the database statistical significance reported by popular database search programmes such as BLAST and PSI-BLAST. We present a 'flexible array' hardware architecture: a scalable systolic array suitable for both long and short sequences. Results on the Xtremedata XD1000 FPGA platform show a speed-up of more than 200-fold in the best case.
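To make the notion of an alignment score concrete, the following is a minimal software sketch rather than the hardware design described above. It assumes local alignment (Smith-Waterman) with a linear gap penalty, neither of which the abstract specifies, and it uses a hypothetical toy scoring function, `toy_score`, in place of a real substitution matrix such as BLOSUM62.

```python
def local_alignment_score(a, b, score, gap=-4):
    """Return the best local alignment score of sequences a and b.

    Assumptions (not taken from the abstract): Smith-Waterman recurrence,
    linear gap penalty `gap`, and `score(x, y)` standing in for a
    substitution-matrix lookup.
    """
    prev = [0] * (len(b) + 1)  # previous row of the dynamic-programming table
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            curr[j] = max(
                0,                                        # start a new local alignment
                prev[j - 1] + score(a[i - 1], b[j - 1]),  # align a[i-1] with b[j-1]
                prev[j] + gap,                            # gap in b
                curr[j - 1] + gap,                        # gap in a
            )
            best = max(best, curr[j])
        prev = curr
    return best


def toy_score(x, y):
    """Hypothetical scoring: +2 for a match, -1 for a mismatch."""
    return 2 if x == y else -1
```

Each cell depends only on its left, upper, and upper-left neighbours, so all cells on one anti-diagonal can be computed at the same time. That independence is the property a systolic array exploits, with each processing element typically handling one column or row of the table.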
