Abstract
An indel refers to a single insertion or deletion, while an edit refers to a single insertion, deletion or substitution. In this article, we investigate codes that correct either a single indel or a single edit and provide linear-time algorithms that encode binary messages into these codes of length n. Over the quaternary alphabet, we provide two linear-time encoders. One corrects a single edit with ⌈log n⌉+ O(loglog n) redundancy bits, while the other corrects a single indel with ⌈log n⌉+2 redundant bits. These two encoders are order-optimal. The former encoder is the first known order-optimal encoder that corrects a single edit, while the latter encoder (that corrects a single indel) reduces the redundancy of the best known encoder of Tenengolts (1984) by at least four bits. Over the DNA alphabet, we impose an additional constraint: the GC-balanced constraint and require that exactly half of the symbols of any DNA codeword to be either C or G. In particular, via a modification of Knuth's balancing technique, we provide a linear-time map that translates binary messages into GC-balanced codewords and the resulting codebook is able to correct a single indel or a single edit. These are the first known constructions of GC-balanced codes that correct a single indel or a single edit.
Published Version
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have