Move-based iterative improvement partitioning (IIP) methods, such as the Fiduccia-Mattheyses (FM) algorithm [Fidducia and Mattheyses 1982] and Krishnamurthy's Look-Ahead (LA) algorithm [Krishnamurthy 1984], are widely used in VLSI CAD applications, largely due to their time efficiency and ease of implementation. This class of algorithms is of the "local/greedy improvement" type, and they generate relatively high-quality results for small and medium-size circuits. However, as VLSI circuits become larger, these algorithms suffer a rapid deterioration in solution quality. We propose new IIP methods CLIP and CDIP that select cells to move with a view to moving clusters that straddle the two subsets of a partition, into one of the subsets. The new algorithms significantly improve partition quality while preserving the advantage of time efficiency. Experimental results on 25 medium to large-size ACM/SIGDA benchmark circuits show up to 70% improvement over FM in mincut, and average mincut improvements of about 35% over all circuits and 47% over large circuits. They also outperform state-of-the-art non-IIP techniques, the quadratic-programming-based method Paraboli [Reiss et al. 1994] and the spectral partitioner MELO [Alpert and Yao 1995], by about 17% and 23%, respectively, with less CPU time. This demonstrates the potential of sophisticated IIP algorithms in dealing with the increasing complexity of emerging VLSI circuits. We also compare CLIP and CDIP to hMetis [Karypis et al. 1997], one of the best of the recent state-of-the-art partitioners that are based on the multilevel paradigm (others include ML c [Alpert et al. 1997] and LSR/MFFS [Cong et al. 1997]). The results show that one scheme of hMetis is 8% worse than CLIP/CDIP and the other two schemes are only 2--4% better. However, CLIP/CDIP have advantages over hMetis and other multilevel partitioners that outweigh these minimal mincut improvements. The first is much faster times-to-solution (for example, one of our best schemes CLIP-LA2 is 6.4 and 11.75 times faster than the two best hMetis schemes) and much better scalability with circuit size (e.g., for the largest circuit with about 162K nodes, CLIP-LA2 is 10.4 and and 21.5 times faster and obtains better solution qualities than the two best hMetis schemes). Second, CLIP/CDIP are "flat" partitioners, while multilevel techniques perform a sequence of node clustering/coarsening before partitioning the circuit. In complex placement applications such as timing-driven placement in the presence of multiple constraints, such circuit coarsening can hide crucial information needed for good-quality solutions, thus making the partitioning process oblivious to them. This, however, is not a problem with flat partitioners like CLIP/CDIP that can take all important parameters into account while partitioning. All these advantages make CLIP/CDIP suitable for use in complex physical design problems for large, deep-submicron VLSI circuits.
Read full abstract