Regular Languages meet Prefix Sorting

Jarno Alanko, Giovanna D'Agostino, Alberto Policriti, and Nicola Prezza

Proceedings of the 2020 ACM-SIAM Symposium on Discrete Algorithms (SODA), pp. 911-930
Chapter DOI: https://doi.org/10.1137/1.9781611975994.55
Published: 2020
eISBN: 978-1-61197-599-4 https://doi.org/10.1137/1.9781611975994
Book Series Name: Proceedings
Book Code: PRDA20
Book Pages: xxii + 3011

Abstract

Indexing strings via prefix (or suffix) sorting is, arguably, one of the most successful algorithmic techniques developed in recent decades. Can indexing be extended to languages? The main contribution of this paper is to initiate the study of the sub-class of regular languages accepted by an automaton whose states can be prefix-sorted. Starting from the recent notion of Wheeler graph [Gagie et al., TCS 2017], which naturally extends the concept of prefix sorting to labeled graphs, we investigate the properties of Wheeler languages, that is, regular languages admitting an accepting Wheeler finite automaton. We first characterize this family as the natural extension of regular languages endowed with the co-lexicographic ordering: the sorted prefixes of strings belonging to a Wheeler language are partitioned into a finite number of co-lexicographic intervals, each formed by elements from a single Myhill-Nerode equivalence class. We proceed by proving several results related to Wheeler automata:

(i) We show that every Wheeler NFA (WNFA) with n states admits an equivalent Wheeler DFA (WDFA) with at most 2n - 1 - |Σ| states (Σ being the alphabet) that can be computed in O(n^3) time.
(ii) We describe a quadratic algorithm to prefix-sort a proper superset of the WDFAs, an O(n log n)-time online algorithm to sort acyclic WDFAs, and an optimal linear-time offline algorithm to sort general WDFAs.
(iii) We provide a minimization theorem that characterizes the smallest WDFA recognizing the same language as any input WDFA. The corresponding constructive algorithm runs in optimal linear time in the acyclic case, and in O(n log n) time in the general case.
(iv) We show how to compute the smallest WDFA equivalent to any acyclic DFA in nearly-optimal time.

Our contributions imply new results of independent interest. Contributions (i)-(iii) provide a new class of NFAs for which the minimization problem can be approximated within a constant factor in polynomial time. Contribution (iv) provides a provably minimum-size solution for the well-studied problem of indexing deterministic acyclic graphs for linear-time pattern matching queries.
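To make the notion of prefix-sortable states more concrete, the sketch below checks whether a candidate total order on the states of an automaton is a Wheeler order. It follows the Wheeler graph axioms as commonly stated after Gagie et al.: states without incoming edges come first; for any two edges, a smaller label implies a strictly smaller target; and for equal labels, a smaller source implies a target that is not larger. This is only a brute-force illustration that compares all pairs of edges. It is not one of the sorting algorithms from the paper, and the function name `is_wheeler_order` and the edge-triple representation are assumptions made for this example.

```python
def is_wheeler_order(order, edges):
    """Check the Wheeler axioms for a candidate state order.

    order: list of states, smallest first (hypothetical representation).
    edges: list of (source, target, label) triples; labels must be comparable.
    """
    pos = {state: i for i, state in enumerate(order)}
    has_incoming = {target for _, target, _ in edges}

    # Axiom 0: states with in-degree zero precede all other states.
    seen_incoming = False
    for state in order:
        if state in has_incoming:
            seen_incoming = True
        elif seen_incoming:
            return False

    # Axioms 1 and 2, checked on every ordered pair of edges (quadratic time).
    for u1, v1, a1 in edges:
        for u2, v2, a2 in edges:
            # Smaller label => strictly smaller target.
            if a1 < a2 and not pos[v1] < pos[v2]:
                return False
            # Same label and smaller source => target not larger.
            if a1 == a2 and pos[u1] < pos[u2] and not pos[v1] <= pos[v2]:
                return False
    return True


# A small illustrative DFA (not taken from the paper): state 0 is the source.
example_edges = [(0, 1, "a"), (1, 1, "a"), (0, 2, "b"), (1, 3, "b")]

print(is_wheeler_order([0, 1, 2, 3], example_edges))  # True
print(is_wheeler_order([0, 2, 1, 3], example_edges))  # False
```

In the second call the order places state 2 (reached by label "b") before state 1 (reached by label "a"), which violates the requirement that targets of smaller labels come first.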
