We introduce a new carry look-ahead adder (NCLA) architecture that employs non-uniform-size carry look-ahead adder (CLA) modules, in contrast to the conventional CLA (CCLA) architecture, which uses uniform-size CLA modules. We adopted two strategies to implement the NCLA. Our approach improves the speed and energy efficiency of the NCLA architecture relative to the CCLA architecture without incurring significant area and power penalties. To demonstrate the advantages of the NCLA, we implemented various adders, ranging from the slower ripple carry adder to the Kogge–Stone adder, which is widely regarded as the fastest parallel-prefix adder, and compared their performance metrics. A 32-bit addition was used as an example, with the adders implemented using a semi-custom design method and a 28 nm CMOS standard cell library. Synthesis results show that the NCLA architecture offers substantial improvements in design metrics compared to its high-speed counterparts. Specifically, an NCLA achieved (i) a 14.7% reduction in delay and a 13.4% reduction in energy compared to an optimized CCLA, while occupying slightly more area; (ii) a 42.1% reduction in delay and a 58.3% reduction in energy compared to a conditional sum adder, with an 8% increase in area; (iii) a 14.7% reduction in delay and a 37.7% reduction in energy compared to an optimized carry select adder, while requiring 37% less area; and (iv) a 20.2% reduction in energy and a 55.4% reduction in area compared to the Kogge–Stone adder.
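To make the distinction between uniform-size and non-uniform-size CLA modules concrete, the following is a minimal behavioral sketch in Python of a 32-bit adder built from CLA blocks of configurable width. It models only the logic function (bit-level generate and propagate signals, and block-level group generate and propagate), not gate delay, area, or energy. The function name `cla_add` and both block-size lists are hypothetical illustrations; in particular, `NONUNIFORM_BLOCKS` is an assumed partition and not the one used in this work.

```python
# Behavioral model of a block-based carry look-ahead adder.
# Assumption: block sizes are listed from the least significant block upward.

UNIFORM_BLOCKS = [4] * 8                        # CCLA-style: eight 4-bit modules
NONUNIFORM_BLOCKS = [2, 3, 4, 5, 6, 6, 4, 2]    # hypothetical NCLA-style partition


def cla_add(a, b, block_sizes, cin=0):
    """Return (sum, carry_out) of two unsigned operands of width sum(block_sizes)."""
    width = sum(block_sizes)
    mask = (1 << width) - 1
    a &= mask
    b &= mask

    # Bit-level generate (g = a AND b) and propagate (p = a XOR b) signals.
    g = [(a >> i) & (b >> i) & 1 for i in range(width)]
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(width)]

    total = 0
    block_cin = cin
    lo = 0
    for size in block_sizes:
        carry = block_cin
        group_g, group_p = 0, 1
        for i in range(lo, lo + size):
            total |= (p[i] ^ carry) << i        # sum bit: s = p XOR c
            carry = g[i] | (p[i] & carry)       # c[i+1] = g[i] OR (p[i] AND c[i])
            group_g = g[i] | (p[i] & group_g)   # block-level group generate
            group_p &= p[i]                     # block-level group propagate
        # Carry into the next block, derived from the group signals alone.
        block_cin = group_g | (group_p & block_cin)
        lo += size
    return total, block_cin


if __name__ == "__main__":
    x, y = 0xDEADBEEF, 0x12345678
    print(cla_add(x, y, UNIFORM_BLOCKS) == cla_add(x, y, NONUNIFORM_BLOCKS))
```

Both partitions compute the same sum and carry-out for any operand pair; the partition changes only how the carry path is divided among modules, which is where timing and energy differences would arise in a synthesized design.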