The string matching problem on a node-labeled graph G=(V,E) asks whether a given pattern string P equals the concatenation of node labels of some path in G. This is a basic primitive in various problems in bioinformatics, graph databases, and networks, and it was recently proven that solving it in time O(|E|^{1−ϵ} |P|) or O(|E| |P|^{1−ϵ}) is hard under OVH, the Orthogonal Vectors Hypothesis, and thus under SETH, the Strong Exponential Time Hypothesis (Equi et al., 2019 [11]). We consider its indexed version, where the graph is indexed to support string queries. We show that, under OVH, no index of the graph built in polynomial time O(|E|^α) can support querying P in time O(|P| + |E|^δ |P|^β), with either δ<1 or β<1. We present our techniques as a general framework, introducing the notion of linear independent-components (lic) reduction, from which we derive our result. This allows us to also translate the quadratic conditional lower bound of Backurs and Indyk (2015) [48] for the problem of matching a query string inside a text, under edit distance, into an analogous tight quadratic lower bound for its indexed version. This improves the recent result of Cohen-Addad, Feuilloley and Starikovskaya (2019) [49], with a slightly different boundary condition. We also apply our technique to obtain the first quadratic indexing lower bounds for Fréchet distance and rooted unlabeled subtree-isomorphism queries.