A common problem in string constraint solvers is computing the Parikh image, a linear arithmetic formula that describes all possible combinations of character counts in strings of a given language. Automata-based string solvers frequently need to compute the Parikh image of products (or intersections) of finite-state automata, in particular when solving string constraints that also include the integer data-type due to operations like string length and indexing. In this context, the computation of Parikh images often turns out to be both prohibitively slow and memory-intensive. This paper contributes a new understanding of how the reasoning about Parikh images can be cast as a constraint solving problem, and questions about Parikh images be answered without explicitly computing the product automaton or the exact Parikh image. The paper shows how this formulation can be efficiently implemented as a calculus, PC*, embedded in an automated theorem prover supporting Presburger logic. The resulting standalone tool Catra is evaluate on constraints produced by the Ostrich+ string solver when solving standard string constraint benchmarks involving integer operations. The experiments show that PC* strictly outperforms the standard approach by Verma et al. to extract Parikh images from finite-state automata, as well as the over-approximating method recently described by Janků and Turoňová by a wide margin, and for realistic timeouts (under 60 s) also the nuXmv model checker. When added as the Parikh image backend of Ostrich+ to the Ostrich string constraint solver’s portfolio, it boosts its results on the quantifier-free strings with linear integer algebra track of SMT-COMP 2023 (QF_SLIA) enough to solve the most Unsat instances in that track of all competitors.
Read full abstract