This paper proposes an approach to synthesizing compressor trees in high-level synthesis. We target modern field-programmable gate arrays (FPGAs), which integrate carry chains and support fast ternary adders. Our approach makes two main improvements: 1) using the proposed modified bitmask analysis, we perform bit-level numerical optimizations that shrink the generated compressor trees and improve area-delay performance; 2) by estimating the arrival time of each operand of a multi-input addition, we combine generalized parallel counters (GPCs) and ternary adders within a compressor tree to further reduce area while maintaining similar delay. Experiments show that, compared with existing approaches, our approach reduces area significantly while maintaining similar delay performance.
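As an illustration of the two ingredients named above, the following minimal sketch shows how per-operand bitmasks can reduce the number of bits a compressor tree must handle, and what a single generalized parallel counter computes. It is not the modified bitmask analysis proposed in the paper; the function names `column_heights` and `gpc_6_3`, the mask encoding (bit set means "may be nonzero"), and the choice of a (6:3) counter are assumptions made only for this example.

```python
def column_heights(operand_masks, width):
    """Return, per bit column, how many operands may contribute a nonzero bit.

    Assumption: each mask has bit i set if bit i of that operand may be
    nonzero, and cleared if it is known to be zero.
    """
    return [sum((mask >> col) & 1 for mask in operand_masks)
            for col in range(width)]


def gpc_6_3(bits):
    """Model a (6:3) counter: sum up to six same-weight bits.

    Returns the 3-bit count, least significant bit first.
    """
    if len(bits) > 6:
        raise ValueError("a (6:3) counter accepts at most six input bits")
    total = sum(bits)
    return [(total >> i) & 1 for i in range(3)]


# Three 8-bit operands: one unconstrained, one with a known-zero upper
# nibble, one with a known-zero lower nibble (hypothetical example masks).
masks = [0xFF, 0x0F, 0xF0]
naive = column_heights([0xFF] * 3, 8)   # no bit-level knowledge
pruned = column_heights(masks, 8)       # known-zero bits removed

print(sum(naive), sum(pruned))          # total bits entering the tree
print(gpc_6_3([1, 1, 1, 0, 1, 1]))      # five ones -> binary 101, LSB first
```

In this toy case every column drops from three bits to two, so each column could be summed by an ordinary two-input adder rather than requiring a counter or ternary adder.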