Abstract
Dynamic programming is a classical algorithmic paradigm, which often allows the evaluation of a search space of exponential size in polynomial time. Recursive problem decomposition, tabulation of intermediate results for re-use, and Bellman’s Principle of Optimality are its well-understood ingredients. However, dynamic programming algorithms often lack abstraction and are difficult to implement, tedious to debug, and delicate to modify. The present article proposes a generic framework for specifying dynamic programming problems. This framework can handle all kinds of sequential inputs, as well as tree-structured data. Biosequence analysis, document processing, molecular structure analysis, comparison of objects assembled in a hierarchic fashion, and, more generally, all domains where strings and ordered, rooted trees serve as natural data representations come under consideration. The new approach introduces inverse coupled rewrite systems. They describe the solutions of combinatorial optimization problems as the inverse image of a term rewrite relation that reduces problem solutions to problem inputs. This approach leads to concise yet translucent specifications of dynamic programming algorithms. Their actual implementation may be challenging, but eventually, as we hope, it can be produced automatically. The present article demonstrates the scope of this new approach by describing a diverse set of dynamic programming problems which arise in the domain of computational biology, with examples in biosequence and molecular structure analysis.
Highlights
Mapping from concrete to abstract is always the easier way
As for the problem of structural matching, with Inverse Coupled Rewrite Systems (ICOREs) S2SGENERIC in Subsection 6.6, we have seen that fixing the first input to a target structure, disregarding base pair insertion, and internalization leads to ICORE COVARIANCEMODEL R, which exhibits the architecture of covariance models
The ICORE of the devil’s advocate would be illegal. (It is interesting to note the similarity of this argument to the discussion of the “yield parsing paradox” in [13], where Bellman’s Principle comes in to explain why we cannot solve all problems in classical algebraic dynamic programming (ADP) in O(n^3), in spite of the Chomsky Normal Form transformation, which only seems to apply to all problems in the classical ADP framework.)
Summary
In the field of biosequence analysis, combinatorial optimization problems on sequences and trees arise in never-ending variety. For determining similarity in genes and proteins, there is the “Needleman-Wunsch” alignment algorithm, referred to as “string edit distance” in the broader field of computer science [2,3]. It is used with a variety of scoring schemes that differ in their treatment of matches and mismatches, in their modeling of gaps, and by either minimizing distance or maximizing similarity. While there is much re-use of algorithmic ideas in combinatorial optimization problems on trees and sequences, this is not transparent in the way we represent concrete algorithms. Their formulation as dynamic programming algorithms requires us to integrate all problem aspects: construction of the search space, scoring, tabulation of intermediate results, and reporting one or more solutions. It would be advisable to experiment with different approaches, but the high implementation effort prevents this.
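To make the point about scoring schemes concrete, the following is a minimal Python sketch of a global alignment recurrence in the Needleman-Wunsch style. It is not the article's ICORE formalism. The function names (`align_score`, `edit_distance`, `similarity`) and the particular scores are assumptions chosen for illustration, and a linear gap cost is assumed. The sketch shows how one tabulated recurrence can serve both distance minimization and similarity maximization once the scoring and the objective are passed in as parameters, while search space construction and tabulation remain hard-wired in the loops.

```python
from typing import Callable


def align_score(x: str, y: str,
                match: Callable[[str, str], int],
                gap: int,
                choose: Callable[..., int]) -> int:
    """Score of an optimal global alignment of x and y.

    match  -- score for aligning two characters (hypothetical parameter)
    gap    -- score for one inserted or deleted character (linear gap model)
    choose -- objective function, e.g. min for distance, max for similarity
    """
    n, m = len(x), len(y)
    # table[i][j] holds the optimal score for the prefixes x[:i] and y[:j].
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        table[i][0] = table[i - 1][0] + gap
    for j in range(1, m + 1):
        table[0][j] = table[0][j - 1] + gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            table[i][j] = choose(
                table[i - 1][j - 1] + match(x[i - 1], y[j - 1]),  # (mis)match
                table[i - 1][j] + gap,                            # deletion
                table[i][j - 1] + gap,                            # insertion
            )
    return table[n][m]


def edit_distance(x: str, y: str) -> int:
    # Unit-cost scheme: mismatches and gaps cost 1, minimized.
    return align_score(x, y, lambda a, b: 0 if a == b else 1, 1, min)


def similarity(x: str, y: str) -> int:
    # Illustrative similarity scheme: +1 match, -1 mismatch, -1 gap, maximized.
    return align_score(x, y, lambda a, b: 1 if a == b else -1, -1, max)
```

Even in this small sketch, changing the gap model (for example to affine gaps) or reporting the alignment itself rather than its score would require rewriting the loops, which is the kind of implementation effort the paragraph above refers to.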