In processor design, the multiplier is a critical component whose speed and efficiency directly affect overall processor performance. As technology advances rapidly, improving processor performance becomes increasingly important. The core of multiplier design lies in reducing the number of partial products and compressing them efficiently. This paper presents a multiplier optimized with the Booth algorithm and a Wallace tree structure, with registers inserted to form a two-stage pipeline that further improves efficiency. The radix-4 (base-4) Booth algorithm is adopted because it roughly halves the number of partial products while limiting the loss of optimization efficiency that greater circuit complexity would cause. The Wallace tree combines 3-2 and 4-2 compressors, which lowers resource consumption and shortens the critical path delay. The paper introduces these three optimization methods step by step and simulates and tests the multiplier after each optimization step. Simulation analysis confirms that the design works as intended, and the results offer insights for the field of multiplier optimization, with the ultimate aim of advancing processor performance.
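To make the two arithmetic ideas concrete, the following is a minimal behavioral sketch in Python, not the hardware design described in this paper. It assumes signed 8-bit operands and, for brevity, reduces the partial products with 3-2 compressors only (the 4-2 compressors and the pipeline registers are omitted). The function names `booth_radix4_digits`, `compress_3_2`, and `multiply` are illustrative and do not come from the paper.

```python
def booth_radix4_digits(multiplier: int, n_bits: int) -> list[int]:
    """Recode a signed n-bit multiplier into radix-4 Booth digits in {-2..2}."""
    assert n_bits % 2 == 0, "this sketch assumes an even operand width"
    y = multiplier & ((1 << n_bits) - 1)  # two's-complement bit pattern
    digits = []
    prev = 0  # implicit bit y[-1] = 0
    for i in range(0, n_bits, 2):
        b0 = (y >> i) & 1
        b1 = (y >> (i + 1)) & 1
        # Each overlapping 3-bit group (b1, b0, prev) yields one digit.
        digits.append(-2 * b1 + b0 + prev)
        prev = b1
    return digits


def compress_3_2(a: int, b: int, c: int, mask: int) -> tuple[int, int]:
    """3-2 compressor (carry-save adder): three rows in, sum and carry rows out."""
    s = (a ^ b ^ c) & mask
    carry = (((a & b) | (a & c) | (b & c)) << 1) & mask
    return s, carry


def multiply(x: int, y: int, n_bits: int = 8) -> int:
    """Signed multiply: Booth partial products, 3-2 reduction, one final add."""
    width = 2 * n_bits
    mask = (1 << width) - 1
    # One partial product per Booth digit: n_bits / 2 rows instead of n_bits.
    rows = [((d * x) << (2 * i)) & mask
            for i, d in enumerate(booth_radix4_digits(y, n_bits))]
    # Reduce rows in groups of three until at most two remain.
    while len(rows) > 2:
        reduced = []
        for j in range(0, len(rows) - 2, 3):
            reduced.extend(compress_3_2(rows[j], rows[j + 1], rows[j + 2], mask))
        reduced.extend(rows[len(rows) - len(rows) % 3:])  # leftover rows pass through
        rows = reduced
    total = sum(rows) & mask  # final carry-propagate addition
    # Interpret the 2n-bit result as a signed value.
    return total - (1 << width) if total >> (width - 1) else total
```

For 8-bit operands the recoding produces four partial products rather than eight, and each reduction level turns three rows into two without propagating carries, so a carry-propagate addition is needed only once at the end.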