Many problems in static program analysis can be modeled as the context-free language (CFL) reachability problem on directed labeled graphs. In general, CFL reachability can be solved in time O(n^3), where n is the number of vertices in the graph, and some specific cases can be solved faster. In this work, we ask the following question: given a specific CFL, what is the exact exponent of the polynomial running time? In other words, for which cases do we have linear, quadratic, or cubic algorithms, and are there problems with intermediate running times? This question is inspired by recent efforts to classify classic problems by their exact polynomial complexity, an area known as fine-grained complexity. Although recent work has shown some conditional lower bounds (mostly for the class of combinatorial algorithms), a general picture of the fine-grained complexity landscape for CFL reachability is missing. Our main contribution is a set of lower bounds that pinpoint the exact running time of several classes of CFLs, or of specific CFLs, under widely believed conjectures (e.g., Boolean Matrix Multiplication, k-Clique, APSP, 3SUM). We focus in particular on the family of Dyck-k languages (the languages of strings of well-matched parentheses of k types), a fundamental class of CFL reachability problems. Remarkably, we are able to show an Ω(n^2.5) lower bound for Dyck-2 reachability, which to the best of our knowledge is the first super-quadratic lower bound that applies to all algorithms, and which shows that CFL reachability is strictly harder than Boolean Matrix Multiplication. We also present new lower bounds for the case of sparse input graphs, where the number of edges m is the input parameter, a common setting in the database literature. For this setting, we show a cubic lower bound for Andersen's Pointer Analysis, which significantly strengthens previously known results.
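To make the problem statement concrete, the following is a minimal Python sketch of Dyck-k reachability computed by a naive worklist fixpoint over the grammar S → ε | S S | (_i S )_i. It is an illustration only and is not taken from the paper: the function name `dyck_reachability`, the edge encoding `(u, (kind, i), v)`, and the example graph are all assumptions made for this sketch, and no attempt is made to match the running times discussed above.

```python
from collections import defaultdict, deque

def dyck_reachability(n, edges):
    """Return all pairs (u, v) joined by a path whose labels form a Dyck word.

    Hypothetical encoding: vertices are 0..n-1 and each edge is
    (u, (kind, i), v) with kind in {"open", "close"} and i the parenthesis type.
    """
    open_in = defaultdict(list)    # open_in[v]  = [(u, i)] for edges u -(_i-> v
    close_out = defaultdict(list)  # close_out[u] = [(i, v)] for edges u -)_i-> v
    for u, (kind, i), v in edges:
        if kind == "open":
            open_in[v].append((u, i))
        else:
            close_out[u].append((i, v))

    reach = set()
    succ = defaultdict(set)  # succ[u] = {v : (u, v) in reach}
    pred = defaultdict(set)  # pred[v] = {u : (u, v) in reach}
    work = deque()

    def add(u, v):
        if (u, v) not in reach:
            reach.add((u, v))
            succ[u].add(v)
            pred[v].add(u)
            work.append((u, v))

    # Rule S -> epsilon: every vertex reaches itself.
    for u in range(n):
        add(u, u)

    while work:
        u, v = work.popleft()
        # Rule S -> (_i S )_i: wrap the pair in a matching open/close edge pair.
        for a, i in open_in[u]:
            for j, b in close_out[v]:
                if i == j:
                    add(a, b)
        # Rule S -> S S: concatenate with known pairs on either side.
        # Copies are taken because add() may mutate the sets being iterated.
        for w in list(pred[u]):
            add(w, v)
        for w in list(succ[v]):
            add(u, w)
    return reach

# Example graph (an assumption for illustration): 0 -(1-> 1 -(2-> 2 -)2-> 3 -)1-> 4
example_edges = [
    (0, ("open", 1), 1),
    (1, ("open", 2), 2),
    (2, ("close", 2), 3),
    (3, ("close", 1), 4),
]
example_reach = dyck_reachability(5, example_edges)
```

In this example the path from 0 to 4 spells the well-matched word (1 (2 )2 )1, so the pair (0, 4) is reported, while (0, 3) is not because the outer parenthesis is still open at vertex 3.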