Abstract

This paper describes a compiler optimization to eliminates dynamic occurrences of expressions in the format a ← a ⊕ b ⊗ c . The operation ⊕ must admit an identity element z , such that a ⊕ z = a . Also, z must be the absorbing element of ⊗, such that b ⊗ z = z ⊗ c = z . Semirings where ⊕ is the additive operator and ⊗ is the multiplicative operator meet this contract. This pattern is common in high-performance benchmarks—its canonical representative being the multiply-add operation a ← a + b × c . However, several other expressions involving arithmetic and logic operations satisfy the required algebra. We show that the runtime elimination of such assignments can be implemented in a performance-safe way via online profiling. The elimination of dynamic redundancies involving identity and absorbing elements in 35 programs of the LLVM test suite that present semiring patterns brings an average speedup of 1.19x (total optimized time over total unoptimized time) on top of clang -O3. When projected onto the entire test suite (259 programs) the optimization leads to a speedup of 1.025x. Once added onto clang, semiring optimizations approximates it to TACO, a specialized tensor compiler.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call