Common Subsequence Problem Research Articles

The Longest Common Subsequence (LCS) problem asks for a subsequence that is common to every string in a given set and is as long as possible. LCS has applications in computational biology and text editing, among many others. Because the general longest common subsequence problem is NP-hard, numerous heuristic algorithms and solvers have been proposed to find the best possible solution for different sets of strings. None of them performs best on all types of sets. In addition, there is no method to specify the type of a given set of strings. Moreover, the available hyper-heuristic is neither efficient nor fast enough to solve this problem in real-world applications. This paper proposes a novel hyper-heuristic to solve the longest common subsequence problem using a new criterion to classify a set of strings based on their similarity. To do this, we offer a general stochastic framework to identify the type of a given set of strings. Following that, we introduce the set similarity dichotomizer (S2D) algorithm, based on the framework, which divides sets into two types. This algorithm is introduced for the first time in this paper and opens a new way to go beyond the current LCS solvers. Then, we present our proposed hyper-heuristic, which exploits S2D and one of the internal properties of the given strings to choose the best-matching heuristic among a set of heuristics. We compare the results on benchmark datasets with the best heuristics and hyper-heuristics. The results show that our proposed dichotomizer (i.e., S2D) can classify datasets with 98% accuracy. Also, our proposed hyper-heuristic obtains competitive performance in comparison with the best methods and outperforms the best hyper-heuristics for uncorrelated datasets in terms of both solution quality and run time.
All supplementary files, including the source code and datasets, are publicly available on GitHub: https://github.com/BioinformaticsIASBS/LCS-DSclassification
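To make the problem statement above concrete, here is a minimal sketch of the textbook dynamic program for the two-string special case, which is solvable in polynomial time. It is not the hyper-heuristic or the S2D algorithm described in the abstract, and it does not handle the NP-hard general case of an arbitrary set of strings; the function name `lcs` is chosen here for illustration only.

```python
def lcs(a: str, b: str) -> str:
    """Return one longest common subsequence of a and b (classical DP)."""
    m, n = len(a), len(b)
    # table[i][j] holds the LCS length of the prefixes a[:i] and b[:j].
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])

    # Walk back from the bottom-right corner to recover one optimal subsequence.
    out = []
    i, j = m, n
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))


print(lcs("AGGTAB", "GXTXAYB"))  # GTAB
```

The table costs O(mn) time and space for strings of lengths m and n. Extending the same table to k strings multiplies the dimensions, which is one reason heuristics are used for larger sets.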


The development of a satisfying and rigorous mathematical understanding of the performance of neural networks is a major challenge in artificial intelligence. Against this background, we study the expressive power of neural networks through the example of the classical NP-hard knapsack problem. Our main contribution is a class of recurrent neural networks (RNNs) with rectified linear units that are iteratively applied to each item of a knapsack instance and thereby compute optimal or provably good solution values. We show that an RNN of depth four and width depending quadratically on the profit of an optimum knapsack solution is sufficient to find optimum knapsack solutions. We also prove the following tradeoff between the size of an RNN and the quality of the computed knapsack solution: for knapsack instances consisting of n items, an RNN of depth five and width w computes a solution of value at least [Formula: see text] times the optimum solution value. Our results build on a classical dynamic programming formulation of the knapsack problem and a careful rounding of profit values that are also at the core of the well-known fully polynomial-time approximation scheme for the knapsack problem. A carefully conducted computational study qualitatively supports our theoretical size bounds. Finally, we point out that our results can be generalized to many other combinatorial optimization problems that admit dynamic programming solution methods, such as various shortest path problems, the longest common subsequence problem, and the traveling salesperson problem. History: Andrea Lodi, Area Editor for Design & Analysis of Algorithms–Discrete. An extended abstract of this article, including Figures 1–7, appeared in the Proceedings of the AAAI Conference on Artificial Intelligence, vol. 35, 7685–7693 (Hertrich and Skutella 2021); see https://ojs.aaai.org/index.php/AAAI/article/view/16939; copyright © 2021, Association for the Advancement of Artificial Intelligence.
Funding: This work was supported by the Deutsche Forschungsgemeinschaft [Grants DFG-GRK 2434 and EXC-2046/1, Project 390685689] and the H2020 European Research Council [ScaleOpt-757481].
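The abstract above refers to a classical dynamic programming formulation of the knapsack problem combined with rounding of profit values. As a rough illustration of that textbook idea only (not of the RNN construction in the article), the sketch below indexes the table by profit and, when `eps` is given, rounds profits down first in the style of the standard approximation scheme. The function name `knapsack_value` and its parameters are hypothetical, and the exact mode assumes integer profits.

```python
import math


def knapsack_value(profits, weights, capacity, eps=None):
    """Profit-indexed 0/1 knapsack DP; rounds profits down first if eps is given."""
    # Items that cannot fit or carry no profit never help, so drop them.
    items = [(p, w) for p, w in zip(profits, weights) if w <= capacity and p > 0]
    if not items:
        return 0
    if eps is None:
        scaled = [p for p, _ in items]  # exact mode: assumes integer profits
    else:
        scale = eps * max(p for p, _ in items) / len(items)
        scaled = [int(p / scale) for p, _ in items]  # rounded-down profits
    total = sum(scaled)

    # min_weight[q]: least weight of a subset whose scaled profit is exactly q.
    # true_profit[q]: unrounded profit of that same subset.
    min_weight = [0] + [math.inf] * total
    true_profit = [0] * (total + 1)
    for (p, w), s in zip(items, scaled):
        # Descending q ensures each item is used at most once.
        for q in range(total, s - 1, -1):
            cand = min_weight[q - s] + w
            if cand < min_weight[q]:
                min_weight[q] = cand
                true_profit[q] = true_profit[q - s] + p

    best = max(q for q in range(total + 1) if min_weight[q] <= capacity)
    return true_profit[best]


print(knapsack_value([60, 100, 120], [10, 20, 30], 50))  # 220
```

With `eps` set, the table has on the order of n²/eps entries instead of one entry per unit of profit, and the returned value is the true profit of a feasible subset that is at least (1 - eps) times the optimum under the usual analysis of this rounding.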


Articles published on Common Subsequence Problem

Dominant point-based sequential and parallel algorithms for the multiple sequential substring constrained-LCS problem

Dynamic-MLCS: Fast searching for dynamic multiple longest common subsequences in sequence stream data

Reconstruction algorithms for DNA-storage systems

Move schedules: fast persistence computations in coarse dynamic settings

Longest common substring in Longest Common Subsequence’s solution service: A novel hyper-heuristic

Linear-space S-table algorithms for the longest common subsequence problem

Provably Good Solutions to the Knapsack Problem via Neural Networks of Bounded Size

A two-step methodology for product platform design and assessment in high-variety manufacturing

An Algorithm for the Longest Common Subsequence and Substring Problem

Time-series anomaly detection using dynamic programming based longest common subsequence on sensor data

An efficient algorithm for the longest common palindromic subsequence problem

A coarse-grained multicomputer parallel algorithm for the sequential substring constrained longest common subsequence problem

Graph search and variable neighborhood search for finding constrained longest common subsequences in artificial and real gene sequences

The Algorithms for the Linear-Space S-Table on the Longest Common Subsequence Problem

Solving the Longest Common Subsequence Problem Concerning Non-Uniform Distributions of Letters in Input Strings

Provably Good Solutions to the Knapsack Problem via Neural Networks of Bounded Size

New Construction of Family of MLCS Algorithms.

A Branch Elimination-based Efficient Algorithm for Large-scale Multiple Longest Common Subsequence Problem

An A* search algorithm for the constrained longest common subsequence problem

Solving longest common subsequence problems via a transformation to the maximum clique problem
