Efficient Algorithms for Computing the Inner Edit Distance of a Regular Language via Transducers

Lila Kari, Meng Yang, Stavros Konstantinidis, Steffen Kopecki

doi:10.3390/a11110165


Open Access

https://doi.org/10.3390/a11110165


Abstract

The concept of edit distance and its variants has applications in many areas such as computational linguistics, bioinformatics, and synchronization error detection in data communications. Here, we revisit the problem of computing the inner edit distance of a regular language given via a Nondeterministic Finite Automaton (NFA). This problem relates to the inherent maximal error-detecting capability of the language in question. We present two efficient algorithms for solving this problem, both of which execute in time O(r^2 n^2 d), where r is the cardinality of the alphabet involved, n is the number of transitions in the given NFA, and d is the computed edit distance. We have implemented one of the two algorithms and present here a set of performance tests. The correctness of the algorithms is based on the connection between word distances and error detection and the fact that nondeterministic transducers can be used to represent the errors (resp., edit operations) involved in error-detection (resp., in word distances).

Highlights

The concept of edit distance and its variants has applications in many areas such as computational linguistics [1], bioinformatics [2], and synchronization error detection in data communications [3].
The edit distance of a language L with at least two words, referred to as the inner edit distance of L, is the minimum edit distance between any two different words in L.
We present two efficient algorithms to compute the inner edit distance of a regular language given via a Nondeterministic Finite Automaton (NFA) with n transitions (see Theorems 1 and 3).
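To make the definition of inner edit distance concrete, the following is a minimal brute-force sketch for a finite language given as an explicit set of words. It is not one of the paper's transducer-based algorithms, which operate on an NFA and also handle infinite regular languages; the function names `edit_distance` and `inner_edit_distance` are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def edit_distance(u: str, v: str) -> int:
    """Levenshtein distance: the minimum number of single-symbol
    insertions, deletions, and substitutions that turn u into v."""
    prev = list(range(len(v) + 1))  # distances from the empty prefix of u
    for i, a in enumerate(u, start=1):
        curr = [i]  # turning u[:i] into the empty word takes i deletions
        for j, b in enumerate(v, start=1):
            curr.append(min(
                prev[j] + 1,             # delete a from u
                curr[j - 1] + 1,         # insert b into u
                prev[j - 1] + (a != b),  # substitute a by b (free if equal)
            ))
        prev = curr
    return prev[-1]


def inner_edit_distance(words) -> int:
    """Minimum edit distance over all pairs of different words.

    Illustration only: assumes a finite language passed as an iterable
    of strings, and compares every pair (quadratic in the number of words).
    """
    distinct = set(words)
    if len(distinct) < 2:
        raise ValueError("the language must contain at least two words")
    return min(edit_distance(u, v) for u, v in combinations(distinct, 2))
```

For example, `inner_edit_distance({"0000", "0011", "1100"})` evaluates to 2, because "0000" and "0011" differ by two substitutions and no pair of words is closer than that.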

Summary

Introduction

The concept of edit distance and its variants has applications in many areas such as computational linguistics [1], bioinformatics [2], and synchronization error detection in data communications [3]. [...] O(n^5) for DFAs, and O(n^8) for NFAs. In this paper, we present two efficient algorithms to compute the inner edit distance of a regular language given via an NFA with n transitions (see Theorems 1 and 3). Both algorithms, which are called DistErrDetect and DistInpAlter, have the same worst-case time complexity. We only consider the channel sid(k), for some k ∈ N, such that (u, v) ∈ sid(k) if and only if v can be obtained by applying at most k errors in u, where an error could be a deletion of a symbol in u, a substitution of a symbol in u with another symbol, or an insertion of a symbol in u; see further below for a more rigorous definition via edit-strings.
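The channel sid(k) can be illustrated by enumerating, for a given word u, every word v with (u, v) ∈ sid(k). The sketch below does this by applying single errors repeatedly. It then checks error detection for a finite language under the assumption, not stated in the text above, that a language is error-detecting for a channel when the channel cannot turn a word of the language into a different word of the language. The names `one_error`, `sid`, and `is_error_detecting` are hypothetical, and the enumeration is exponential in k, so this is only suitable for tiny examples.

```python
def one_error(u: str, alphabet: str) -> set:
    """All words obtained from u by exactly one insertion, deletion,
    or substitution of a symbol over the given alphabet."""
    out = set()
    for i in range(len(u) + 1):
        for a in alphabet:
            out.add(u[:i] + a + u[i:])              # insert a at position i
    for i in range(len(u)):
        out.add(u[:i] + u[i + 1:])                  # delete the symbol u[i]
        for a in alphabet:
            if a != u[i]:
                out.add(u[:i] + a + u[i + 1:])      # substitute u[i] by a
    return out


def sid(u: str, alphabet: str, k: int) -> set:
    """All words v reachable from u with at most k errors."""
    reached = {u}
    frontier = {u}
    for _ in range(k):
        # Apply one more error to every word found in the previous round.
        frontier = {v for w in frontier for v in one_error(w, alphabet)} - reached
        reached |= frontier
    return reached


def is_error_detecting(words, alphabet: str, k: int) -> bool:
    """Assumed definition: no word of the language is mapped by sid(k)
    to a different word of the language."""
    lang = set(words)
    return all(sid(w, alphabet, k) & lang == {w} for w in lang)
```

Under this assumed definition, a finite language is error-detecting for sid(k) exactly when every pair of different words is more than k edits apart, which is the connection between error detection and inner edit distance that the summary alludes to.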

NFAs and Transducers

Edit Strings and Edit Distance

Edit Distance via Error-Detection

Let Ba be the edit distance bound in Lemma 2

An Input-Altering Transducer for Edit-Distance

Let Ba be the bound in Lemma 2

Construct the transducer ia1 (see Figure 2)

Implementation and Testing

Conclusions


Journal: Algorithms
Publication Date: Oct 23, 2018
License type: CC BY 4.0
