Abstract

In the smallest grammar problem, we are given a word w and we want to compute a preferably small context-free grammar G for the singleton language {w} (where the size of a grammar is the sum of the sizes of its rules, and the size of a rule is measured by the length of its right side). It is known that, for unbounded alphabets, the decision variant of this problem is NP-hard and the optimisation variant does not allow a polynomial-time approximation scheme, unless P = NP. We settle the long-standing open problem of whether these hardness results also hold for the more realistic case of a constant-size alphabet. More precisely, it is shown that the smallest grammar problem remains NP-complete (and its optimisation version is APX-hard), even if the alphabet is fixed and has size at least 17. The corresponding reduction is robust in the sense that it also works for an alternative size measure of grammars that is commonly used in the literature (i.e., a size measure that also takes the number of rules into account), and it also allows us to conclude that even computing the number of rules required by a smallest grammar is a hard problem. On the other hand, if the number of nonterminals (or, equivalently, the number of rules) is bounded by a constant, then the smallest grammar problem can be solved in polynomial time, which is shown by encoding it as a problem on graphs with interval structure. However, treating the number of rules as a parameter (in terms of parameterised complexity) yields W[1]-hardness. Furthermore, we present an O(3^{|w|}) exact exponential-time algorithm, based on dynamic programming. These three main questions are also investigated for 1-level grammars, i.e., grammars for which only the start rule contains nonterminals on its right side, thus investigating the impact of the “hierarchical depth” of grammars on the complexity of the smallest grammar problem. In this regard, we obtain similar, but slightly stronger, results for 1-level grammars.

Highlights

  • Context-free grammars are among the most classical concepts in theoretical computer science

  • From a formal-languages point of view, describing a single word by a context-free grammar seems excessive, but there are at least two evident motivations: – Compression perspective: the grammar G is a compressed representation of the word w. – Inference perspective: the grammar G identifies the hierarchical structure of the word w

  • The inference perspective of computing grammars for single words has also been applied in two PhD theses: by de Marcken [3], in order to investigate whether analysing the structure of small grammars for large English texts could help in understanding the structure of the language itself, and by Galle [4], in order to infer hierarchical structures in DNA


Summary

Introduction

Context-free grammars are among the most classical concepts in theoretical computer science. We are concerned with grammars G that describe singleton languages {w} (or, by slightly abusing notation, grammars describing single words).
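To make the notion concrete, here is a minimal sketch (an illustrative example, not taken from the paper) of a grammar describing a single word, using the size measure from the abstract: the sum of the lengths of all right-hand sides. Uppercase letters stand for nonterminals and lowercase letters for terminals; the rule set below is a hypothetical example grammar for w = "abababab".

```python
def expand(grammar, symbol):
    """Derive the unique word generated from `symbol`.

    A symbol with no rule is treated as a terminal and returned as-is;
    otherwise its right-hand side is expanded recursively.
    """
    rhs = grammar.get(symbol)
    if rhs is None:  # terminal symbol
        return symbol
    return "".join(expand(grammar, s) for s in rhs)


def grammar_size(grammar):
    """Grammar size: the sum of the lengths of all right-hand sides."""
    return sum(len(rhs) for rhs in grammar.values())


# Example grammar for the single word w = "abababab":
#   S -> BB, B -> AA, A -> ab
g = {"S": "BB", "B": "AA", "A": "ab"}

w = expand(g, "S")
print(w)                # abababab
print(grammar_size(g))  # 6, i.e., smaller than |w| = 8
```

The grammar generates exactly one word, and its size (2 + 2 + 2 = 6) is smaller than the length of the word itself, which illustrates the compression perspective; the nesting of the rules illustrates the hierarchical structure exploited by the inference perspective.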

Grammars as Inference Tools and Compressors
Algorithmics on Compressed Strings
The Smallest Grammar Problem
Our Contribution
Outline of the Paper
Preliminaries
Basic Concepts of Graph Theory and Complexity Theory
Grammars
Examples
NP-Hardness of Computing Smallest Grammars for Fixed Alphabets
The 1-Level Case
The Multi-Level Case
Extensions of the Reductions
Smallest Grammars with a Bounded Number of Nonterminals
Related Questions
Exact Exponential-Time Algorithms
Small Alphabets
Approximation
Parameterised Complexity
Findings
A More Abstract View

