A four-way very long instruction word (VLIW), 312-MHz geometry processor with peripheral component interconnect/accelerated graphic port bus bridge was implemented in a 0.21-/spl mu/m, 2.5-V, three-layer-metal CMOS process. We adopted (1) a software bypass mechanism, (2) single-instruction multiple-data stream instructions, (3) four sets of floating-point multiply add and accumulate execution units, (4) special condition code registers and a branch condition generator for a clipping operation, and (5) automatic clock delay tuning methodology. As a result of these features, we achieved a performance of 2.5 GFLOPS and 6.5 million polygons per second for a three-dimensional geometry processor, which is the highest published performance as a single geometry processor. The processor is applicable to computer-aided-design systems that require very high graphics performance.