Abstract

The nonserial polyadic dynamic programming algorithm is one of the most fundamental algorithms for solving discrete optimization problems. Although its loops resemble those of matrix multiplication, the available automatic optimization techniques have little effect on this imperfect loop nest because of its nonuniform data dependencies. In this paper, we develop algorithmic optimizations that improve the cache performance of the nonserial polyadic dynamic programming algorithm. Our algorithmic transformation takes advantage of the cache oblivious method by relaxing some dependencies in the standard iterative version. Based on the ideal cache model of the cache oblivious algorithm, the approximate bound on cache misses is given by $\Theta(\frac{n^{3}Z}{L\sqrt{Z}}+\frac{n^{2}}{L}+\frac{n}{L\sqrt{Z}})$. We also find that the algorithm optimized with the cache oblivious approach is more sensitive to conventional optimization techniques such as tiling. Experimental results on several platforms show that the optimized algorithms improve cache performance and achieve speedups of 2 to 10 times.
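To make the loop structure concrete, the following is a minimal Python sketch of the standard iterative version only, not the optimized algorithm developed in the paper. The recurrence used, m[i][j] = min(m[i][j], min over i < k < j of m[i][k] + m[k][j]), is a common form of nonserial polyadic dynamic programming and is assumed here for illustration. The function name `npdp` and the sample cost table are hypothetical.

```python
import math


def npdp(init):
    """Baseline iterative nonserial polyadic DP over the upper triangle.

    Assumed recurrence (illustrative, not taken from the abstract):
        m[i][j] = min(m[i][j], min_{i < k < j} (m[i][k] + m[k][j]))
    """
    n = len(init)
    m = [row[:] for row in init]  # work on a copy of the input table
    # Process cells in order of increasing span (j - i), so every
    # m[i][k] and m[k][j] with a shorter span is final before it is read.
    for span in range(2, n):
        for i in range(n - span):
            j = i + span
            best = m[i][j]
            # The innermost loop resembles matrix multiplication, but its
            # bounds depend on both i and j: the iteration space is
            # triangular and the dependencies are nonuniform.
            for k in range(i + 1, j):
                cand = m[i][k] + m[k][j]
                if cand < best:
                    best = cand
            m[i][j] = best
    return m


# Hypothetical input: the direct cost from i to j is (j - i)^2, so
# splitting into unit steps is always cheapest.
n = 4
init = [[float((j - i) ** 2) if j > i else math.inf for j in range(n)]
        for i in range(n)]
table = npdp(init)
print(table[0][3])  # 3.0
```

In this baseline, the reads of m[k][j] walk down a column while the reads of m[i][k] walk along a row, which is the kind of access pattern that cache-oriented transformations such as tiling or a cache oblivious recursion aim to improve.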
