The effect of the gain and embedding of amplifying cells (amp-cells) on the output power of power amplifiers (PAs) at high mm-wave frequencies is studied. In this frequency range, matching loss becomes comparable to the gain of the amp-cell in most silicon technologies. From the derived power equations of an embedded amp-cell, power contours are plotted in the gain plane, and an optimum embedding is designed to maximize the output power for a desired gain. To showcase the theory, a high-frequency, high-power amp-cell, called the matched cascode, is introduced and subsequently embedded to boost both power gain and output power. To increase the output power further, a differential slot power combiner (SPC) is introduced and its equivalent circuit is analyzed. Finally, using the embedded matched cascode cell and the SPC, a <inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"> <tex-math notation="LaTeX">$2 \times 8$ </tex-math></inline-formula> PA is implemented in 65-nm bulk CMOS. It consumes 732 mW from a 2.4-V supply, with a maximum power-added efficiency (PAE) of 1.03%. At 200 GHz, the PA achieves a <inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"> <tex-math notation="LaTeX">$P_{\text {sat}}$ </tex-math></inline-formula> and an output 1-dB compression point (OP1dB) of 9.4 and 6.3 dBm, respectively, and a maximum power gain of 19.5 dB.
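As a rough consistency check on the reported figures, the following minimal sketch converts the saturated output power from dBm to mW and computes the efficiency that would result if the input power were neglected. It uses the standard definition PAE = (P_out - P_in) / P_dc. The function names are illustrative, and the sketch assumes that the 732-mW dc consumption also applies at saturation, which the abstract does not state.

```python
def dbm_to_mw(p_dbm: float) -> float:
    """Convert a power level in dBm to milliwatts."""
    return 10.0 ** (p_dbm / 10.0)


def pae_percent(p_out_mw: float, p_in_mw: float, p_dc_mw: float) -> float:
    """Power-added efficiency in percent: (P_out - P_in) / P_dc."""
    return 100.0 * (p_out_mw - p_in_mw) / p_dc_mw


# Figures taken from the abstract.
P_SAT_DBM = 9.4   # saturated output power at 200 GHz
P_DC_MW = 732.0   # dc power consumption from the 2.4-V supply

p_sat_mw = dbm_to_mw(P_SAT_DBM)

# Assumption: P_in = 0, which gives an upper bound on the PAE at saturation
# (equivalent to the output power divided by the dc power).
efficiency_bound = pae_percent(p_sat_mw, 0.0, P_DC_MW)

print(f"P_sat = {p_sat_mw:.2f} mW")
print(f"Efficiency with P_in neglected = {efficiency_bound:.2f} %")
```

Under these assumptions the bound comes out slightly above the reported maximum PAE of 1.03%, as expected, since the PAE also subtracts the input drive power.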