Compositional Properties of Alignments

Sarah J Berkemer, Peter F Stadler, Christian Höner zu Siederdissen

doi:10.1007/s11786-020-00496-8

Abstract

Alignments, i.e., position-wise comparisons of two or more strings or ordered lists, are of utmost practical importance in computational biology and a host of other fields, including historical linguistics and emerging areas of research in the Digital Humanities. The problem is well known to be computationally hard as soon as the number of input strings is not bounded. Due to its practical importance, a huge number of heuristics have been devised, which have proved very successful in a wide range of applications. Alignments nevertheless have received hardly any attention as formal, mathematical structures. Here, we focus on the compositional aspects of alignments, which underlie most algorithmic approaches to computing alignments. We also show that the concepts naturally generalize to finite partially ordered sets and partial maps between them that in some sense preserve the partial orders. As a consequence of this discussion, we observe that alignments of even more general structures, in particular graphs, are essentially characterized by the fact that the restriction of alignments to a row must coincide with the corresponding input graphs. Pairwise alignments of graphs are therefore determined completely by common induced subgraphs. In this setting, alignments of alignments are well-defined, and alignments can be decomposed recursively into subalignments. This provides a general framework within which different classes of alignment algorithms can be explored for objects very different from sequences and other totally ordered data structures.

Highlights

Alignments play an important role, in particular in bioinformatics, as a means of comparing two or more strings by explicitly identifying correspondences between letters as well as insertions and deletions [13].
Most commonly, a scoring model is defined for pairs of sequences and generalized to multiple alignments as sums over certain pairwise alignments that are obtained as projections.
In this contribution we have analyzed the compositional properties of sequence alignments and explored the generalization to much more general structures.

Summary

Introduction

Alignments play an important role, in particular in bioinformatics, as a means of comparing two or more strings by explicitly identifying correspondences between letters (usually called matches and mismatches) as well as insertions and deletions [13]. The pairwise scoring is usually specified either in terms of matches or in terms of edit operations (insertions, deletions, or substitutions). In this contribution, we will almost completely disregard the scoring of alignments and instead focus on the structure of (multiple) alignments as combinatorial objects. Following a brief discussion of the view of alignments as compositions of pairwise matching relations, we further generalize the formalism to include first ordered trees, directed and undirected graphs, and essentially arbitrary finite spaces that admit well-behaved subspace constructions. We shall conclude that alignments are alternatively specified in terms of common induced subgraphs (or the corresponding common induced subspaces in full generality).
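
To make the combinatorial view concrete, the following is a minimal Python sketch, not the authors' formalism. It treats an alignment as a list of columns, recovers the input strings by restricting to a row, projects onto a subset of rows, and composes pairwise matching relations. The toy alignment and the helper names `row`, `project`, `matches`, and `compose` are assumptions introduced purely for illustration.

```python
GAP = "-"

def row(alignment, i):
    """Restrict an alignment (a list of columns) to row i and drop gaps."""
    return "".join(col[i] for col in alignment if col[i] != GAP)

def project(alignment, rows):
    """Project onto a subset of rows, discarding columns that become all-gap."""
    projected = [tuple(col[i] for i in rows) for col in alignment]
    return [col for col in projected if any(c != GAP for c in col)]

def matches(alignment, i, j):
    """Matching relation between rows i and j as 0-based position pairs."""
    pos_i = pos_j = 0
    pairs = set()
    for col in alignment:
        if col[i] != GAP and col[j] != GAP:
            pairs.add((pos_i, pos_j))
        # Advance each position counter only when its row is not a gap.
        pos_i += col[i] != GAP
        pos_j += col[j] != GAP
    return pairs

def compose(r, s):
    """Relational composition: (a, c) whenever (a, b) in r and (b, c) in s."""
    return {(a, c) for (a, b) in r for (b2, c) in s if b == b2}

# Hypothetical toy alignment of the strings "ACGT", "AGT", and "CGT".
alignment = [
    ("A", "A", "-"),
    ("C", "-", "C"),
    ("G", "G", "G"),
    ("T", "T", "T"),
]

# Restricting to a row returns the corresponding input string.
inputs = [row(alignment, i) for i in range(3)]

# Composing the matchings 0~1 and 1~2 gives a subset of the direct matching 0~2.
via_middle = compose(matches(alignment, 0, 1), matches(alignment, 1, 2))
direct = matches(alignment, 0, 2)
```

In this toy case the composition through the middle row is a proper subset of the direct matching: the `C` of the first string is aligned with the `C` of the third, but the second string has a gap in that column, so the correspondence cannot be recovered by passing through it.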

A Very Brief Review of Sequence Alignments

Formal Definitions of Sequence Alignments

Alignments of Partially Ordered Sets

Composition of Alignments

Blockwise Decompositions

Recursive Construction

Alignments as Relations

Tree Alignments

Alignments of Graphs

Alignments for General Structures

Concluding Remarks


Journal: Mathematics in Computer Science	Publication Date: Dec 28, 2020
Citations: 4	License type: open-access


