This paper describes a high-performance multiply-accumulate (MAC) unit for a DSP core. Because the most critical timing path of the DSP lies in the MAC, considerable effort has been devoted to speeding it up. The MAC unit performs fixed-point operations, with optional rounding, at a throughput of one operation per cycle. A modified sign-extension algorithm is presented that eliminates the array of sign bits in the partial products, reducing both computation time and area. A further speed increase is achieved by using a new high-speed three-input 40-bit arithmetic logic unit (ALU) to shorten the delay of the critical path.
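As a rough behavioral illustration of a fixed-point MAC with optional rounding feeding a three-input 40-bit adder, the following Python sketch may help. It is not the design described in the paper. The 16-bit operand width, the rounding position (bit 15), the wrap-around overflow behavior, the use of the third adder input for the rounding constant, and the names `to_signed`, `add3_carry_save`, and `mac` are all assumptions made for illustration. The three-operand addition uses a generic 3:2 carry-save stage followed by a single carry-propagate add, a common textbook technique that is substituted here for the paper's ALU.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of `value` as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def add3_carry_save(a, b, c, bits=40):
    """Add three operands modulo 2**bits using one 3:2 carry-save stage
    followed by a single carry-propagate addition."""
    mask = (1 << bits) - 1
    a, b, c = a & mask, b & mask, c & mask
    partial_sum = a ^ b ^ c                        # bitwise sum, no carries
    carry = ((a & b) | (a & c) | (b & c)) << 1     # majority bits, shifted left
    return to_signed(partial_sum + carry, bits)    # wraps on overflow (assumed)


def mac(acc, x, y, rounding=False, round_bit=15):
    """Return acc + x*y, optionally plus a rounding constant, in 40 bits.

    x and y are treated as 16-bit signed operands (assumed width);
    round_bit=15 is an assumed rounding position, not taken from the paper.
    """
    product = to_signed(x, 16) * to_signed(y, 16)
    round_const = (1 << round_bit) if rounding else 0
    return add3_carry_save(acc, product, round_const)
```

For example, `mac(10, 0xFFFE, 5)` treats `0xFFFE` as -2 and accumulates -10 onto 10. Folding the rounding constant into the same three-input addition avoids a second carry-propagate step in this model; whether the paper's ALU uses its third input this way is not stated in the abstract.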