Hirschberg's Algorithm Research Articles

Summary: Finding the longest common subsequence between two strings in acceptable time frames is crucial to solving various problems in different fields of study. To ensure the optimal solution is found, algorithms based on dynamic programming are employed almost exclusively. While the most commonly adopted algorithm, proposed by Needleman and Wunsch, has quadratic time and space complexity, the linear space complexity of Hirschberg's algorithm favors the comparisons of longer sequences. However, it too has a quadratic time complexity and therefore the effective exploitation of parallelism has become essential. This paper focuses on improving the execution efficiency of Hirschberg's algorithm on multi‐core and many‐core systems. To achieve this goal, first, enhancements to the sequential version are proposed to take advantage of SIMD instructions available on modern processors. Second, the impact on the performance of different parallelization strategies is investigated and evaluated. Results show that combining these two aspects can greatly improve the performance of Hirschberg's algorithm on these architectures. In relation to the original version, speedups of over 46 were achieved on a dual 18‐core server for sequences of 1.6 million characters. Furthermore, experiments with a 68‐core Intel Xeon Phi (many‐core) system obtained speedups of up to 105 for the same sequence size.
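For readers unfamiliar with the technique, the following is a minimal sequential sketch of Hirschberg's divide-and-conquer idea applied to the longest common subsequence problem. It is not the optimized SIMD or parallel implementation evaluated in the paper; the function names `lcs_last_row` and `hirschberg_lcs` are hypothetical, chosen for illustration. The score rows use memory linear in the length of the second string; the string slicing is kept for readability, and an index-based version would avoid the extra copies.

```python
def lcs_last_row(a, b):
    """Return the last row of the LCS-length table for a vs. b.

    Only two rows are kept at any time, so the row storage is O(len(b)).
    """
    prev = [0] * (len(b) + 1)
    for ch in a:
        curr = [0]
        for j, bj in enumerate(b, start=1):
            if ch == bj:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev


def hirschberg_lcs(a, b):
    """Return one longest common subsequence of strings a and b (illustrative sketch)."""
    if not a or not b:
        return ""
    if len(a) == 1:
        # Base case: a single character is the LCS if it occurs in b.
        return a if a in b else ""
    mid = len(a) // 2
    # Forward scores for the first half of a, backward scores for the second half.
    left = lcs_last_row(a[:mid], b)
    right = lcs_last_row(a[mid:][::-1], b[::-1])
    # Choose the split point of b that maximizes the combined LCS length.
    n = len(b)
    split = max(range(n + 1), key=lambda j: left[j] + right[n - j])
    # Solve the two independent subproblems and concatenate the results.
    return hirschberg_lcs(a[:mid], b[:split]) + hirschberg_lcs(a[mid:], b[split:])
```

The two recursive calls at the end are independent of each other, which is one natural place where the parallelization strategies mentioned in the abstract could be applied.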

Multiple sequence alignment (MSA) is a ubiquitous problem in computational biology. Although it is NP-hard to find an optimal solution for an arbitrary number of sequences, due to the importance of this problem researchers are trying to push the limits of exact algorithms further. Since MSA can be cast as a classical path finding problem, it is attracting a growing number of AI researchers interested in heuristic search algorithms as a challenge with actual practical relevance. In this paper, we first review two previous, complementary lines of research. Based on Hirschberg's algorithm, Dynamic Programming needs O(kN^(k-1)) space to store both the search frontier and the nodes needed to reconstruct the solution path, for k sequences of length N. Best-first search, on the other hand, has the advantage of bounding the search space that has to be explored using a heuristic. However, it is necessary to maintain all explored nodes up to the final solution in order to prevent the search from re-expanding them at higher cost. Earlier approaches to reduce the Closed list are either incompatible with pruning methods for the Open list, or must retain at least the boundary of the Closed list. In this article, we present an algorithm that attempts to combine their respective advantages; like A* it uses a heuristic for pruning the search space, but reduces both the maximum Open and Closed size to O(kN^(k-1)), as in Dynamic Programming. The underlying idea is to conduct a series of searches with successively increasing upper bounds, but using the DP ordering as the key for the Open priority queue. With a suitable choice of thresholds, in practice, a running time below four times that of A* can be expected. In our experiments we show that our algorithm outperforms one of the currently most successful algorithms for optimal multiple sequence alignments, Partial Expansion A*, both in time and memory.
Moreover, we apply a refined heuristic based on optimal alignments not only of pairs of sequences, but of larger subsets. This idea is not new; however, to make it practically relevant we show that it is equally important to bound the heuristic computation appropriately, or the overhead can obliterate any possible gain. Furthermore, we discuss a number of improvements in time and space efficiency with regard to practical implementations. Our algorithm, used in conjunction with higher-dimensional heuristics, is able to calculate for the first time the optimal alignment for almost all of the problems in Reference 1 of the benchmark database BAliBASE.

Articles published on Hirschberg's Algorithm

AFMC: An alignment framework for multiple computing services and providers

LongAGE: defining breakpoints of genomic structural variants through optimal and memory efficient alignments of long reads.

Foreword to the special issue of the workshop on high performance computing systems (XVIII Simpósio em Sistemas Computacionais de Alto Desempenho, WSCAD 2017)

On the parallelization of Hirschberg's algorithm for multi‐core and many‐core systems

A Combined Two Step Approach for Detecting Input Validation Attacks Against Web Applications

Mitigation of Web Based attacks using Mobile Agents in client side

HMMConverter 1.0: a toolbox for hidden Markov models

FastLSA: A Fast, Linear-Space, Parallel and Sequential Algorithm for Sequence Alignment

An Improved Search Algorithm for Optimal Multiple-Sequence Alignment

HIRSCHBERG’S ALGORITHM FOR APPROXIMATE MATCHING

Using Hirschberg's algorithm to generate random alignments of strings
