Compression and approximate matching

L Allison

doi:10.1093/comjnl/42.1.1

Abstract

A population of sequences is called non-random if there is a statistical model and an associated compression algorithm that allows members of the population to be compressed, on average. Any available statistical model of the population should be incorporated into algorithms for aligning the sequences; doing so generally changes the rank order of possible alignments. The model should also be used in deciding whether a resulting approximate match between two sequences is significant. It is shown how to do this for two plausible interpretations involving pairs of sequences that might or might not be related. Efficient alignment algorithms are described for quite general statistical models of sequences. The new alignment algorithms are more sensitive to what might be termed 'features' of the sequences. A natural significance test is shown to be rarely fooled by apparent similarities between two sequences that are merely typical of all or most members of the population, even unrelated members.
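
The idea of a compression-based significance test can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes a simple order-0 population model (independent character probabilities), a single best alignment, and illustrative step probabilities `p_match`, `p_change` and `p_indel`; the function names are hypothetical, and the cost of encoding sequence lengths is ignored under both hypotheses. Two sequences are treated as plausibly related when encoding them jointly through an alignment takes fewer bits than encoding them separately.

```python
import math


def null_bits(seq, probs):
    """Bits to encode seq alone under an order-0 population model."""
    return sum(-math.log2(probs[c]) for c in seq)


def related_bits(a, b, probs, p_match=0.8, p_change=0.1, p_indel=0.05):
    """Bits to encode a and b jointly via their cheapest alignment.

    Each alignment step is a match, a change, an insertion or a deletion.
    The step probabilities are illustrative assumptions and must sum to 1.
    """
    if abs(p_match + p_change + 2 * p_indel - 1.0) > 1e-9:
        raise ValueError("step probabilities must sum to 1")
    char = {c: -math.log2(p) for c, p in probs.items()}
    # Probability that two independent draws differ; normalises change pairs.
    differ = 1.0 - sum(p * p for p in probs.values())
    match_cost = -math.log2(p_match)
    change_cost = -math.log2(p_change) + math.log2(differ)
    indel_cost = -math.log2(p_indel)

    n, m = len(a), len(b)
    d = [[math.inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if i > 0:  # delete a[i-1]
                d[i][j] = min(d[i][j], d[i - 1][j] + indel_cost + char[a[i - 1]])
            if j > 0:  # insert b[j-1]
                d[i][j] = min(d[i][j], d[i][j - 1] + indel_cost + char[b[j - 1]])
            if i > 0 and j > 0:
                x, y = a[i - 1], b[j - 1]
                if x == y:  # match: the shared character is stated once
                    step = match_cost + char[x]
                else:  # change: both characters are stated
                    step = change_cost + char[x] + char[y]
                d[i][j] = min(d[i][j], d[i - 1][j - 1] + step)
    return d[n][m]


def saving_bits(a, b, probs):
    """Bits saved by the 'related' encoding; positive favours relatedness."""
    return null_bits(a, probs) + null_bits(b, probs) - related_bits(a, b, probs)


# Hypothetical population in which A and T are common, C and G are rare.
probs = {"A": 0.4, "T": 0.4, "C": 0.1, "G": 0.1}
common = saving_bits("ATATATAT", "ATATATAT", probs)
rare = saving_bits("CGCGCGCG", "CGCGCGCG", probs)
unrelated = saving_bits("AAAAAAAA", "TTTTTTTT", probs)
```

Under these assumed probabilities, a match made of rare characters saves more bits than an equally long match made of common characters, so similarity that is merely typical of the population counts for less. The pair with no shared characters yields a negative saving, which favours the hypothesis that the sequences are unrelated.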

Journal: The Computer Journal
Publication Date: Jan 1, 1999
Citations: 39
