The resurgence of Datalog in the last two decades has led to a multitude of new Datalog systems. These systems explore novel ideas for improving Datalog's programmability and performance, making important contributions to the field. Unfortunately, the individual systems progress at a much slower pace than the overall field, because improvements in one system are rarely ported to other systems. The reason for this rift is that each system provides its own Datalog dialect with specific notation, language features, and invariants, enabling specific optimization and execution strategies. This paper presents the first compiler framework for Datalog that can be used to support any Datalog frontend language and to target any Datalog backend. The centerpiece of our framework is a novel typed multi-level Datalog IR that supports IR extensions and guarantees executability. Existing Datalog systems can provide a compiler frontend that translates their Datalog dialect to the extended IR. The IR is then progressively lowered toward core Datalog, allowing optimizations at each level. At last, compiler backends can target different Datalog solvers. We have implemented the compiler framework and integrated 4 Datalog frontends and 3 Datalog backends, using 16 IR extensions. We also formalize the IR's flexible type system, which is bidirectional, flow-sensitive, bipolar, and uses three-valued typing contexts. The type system simultaneously validates type compatibility and precisely tracks bindings of logic variables while permitting IR extensions.
Read full abstract