This paper presents a radix-4 8 × 8 Booth multiplier with improved power, delay, and area. The main delay-reducing modification is a parallel structure for adding the encoded partial products. Additional enhancements include an optimized Booth encoder, an optimized B2C design, and a unique square-root carry-select adder that uses carry-lookahead adder logic to minimize the multiplier's delay. Compared with recently published similar designs, the proposed design reduces power consumption by 26.6%, area by 15%, and data arrival time by 25.6%. All proposed circuits were designed and synthesized in Synopsys 32 nm CMOS technology.
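
For orientation, the arithmetic behind a radix-4 8 × 8 Booth multiplier can be sketched as a small behavioral model. This is an illustrative sketch of the standard radix-4 Booth recoding, not the proposed circuit: the function names `booth_radix4_digits` and `booth_multiply` are hypothetical, signed two's-complement operands are assumed, and the pairwise summation only illustrates the general idea of adding partial products in parallel rather than the specific adder structure proposed here.

```python
def booth_radix4_digits(multiplier: int, width: int = 8) -> list[int]:
    """Recode a signed `width`-bit multiplier into radix-4 Booth digits.

    Each digit is in {-2, -1, 0, 1, 2}; digit j has weight 4**j.
    """
    if width % 2 != 0:
        raise ValueError("width must be even for radix-4 recoding")
    bits = multiplier & ((1 << width) - 1)  # two's-complement bit pattern
    padded = bits << 1                      # append the implicit bit y[-1] = 0
    digits = []
    for i in range(0, width, 2):
        triplet = (padded >> i) & 0b111     # bits y[i+1], y[i], y[i-1]
        b_hi = (triplet >> 2) & 1
        b_mid = (triplet >> 1) & 1
        b_lo = triplet & 1
        digits.append(-2 * b_hi + b_mid + b_lo)
    return digits


def booth_multiply(multiplicand: int, multiplier: int, width: int = 8) -> int:
    """Multiply two signed `width`-bit integers via radix-4 Booth recoding."""
    lo, hi = -(1 << (width - 1)), (1 << (width - 1)) - 1
    if not (lo <= multiplicand <= hi and lo <= multiplier <= hi):
        raise ValueError("operands must fit in signed `width` bits")
    digits = booth_radix4_digits(multiplier, width)
    # One partial product per Booth digit (four of them for width = 8).
    partial_products = [d * multiplicand * (4 ** j) for j, d in enumerate(digits)]
    # Assumption: add the partial products pairwise, tree-style, so that
    # independent additions could proceed in parallel in hardware.
    while len(partial_products) > 1:
        partial_products = [
            sum(partial_products[k:k + 2])
            for k in range(0, len(partial_products), 2)
        ]
    return partial_products[0]
```

For example, `booth_multiply(-7, 13)` recodes 13 into the digits `[1, -1, 1, 0]`, forms the partial products -7, 28, -112, and 0, and adds them in two levels to give -91.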