A popular technique for designing multiple-input multiple-output (MIMO) communication systems relies on optimizing the positive semidefinite covariance matrix at the source. In this paper, we propose a unified MIMO optimization framework based on the Karush-Kuhn-Tucker (KKT) conditions. Within this framework, and with the aid of matrix optimization theory, <xref ref-type="theorem" rid="theorem1" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">Theorem 1</xref> presents a generic optimal transmit covariance matrix for MIMO systems with diverse objective functions, various power constraints, and different levels of channel state information (CSI). Specifically, <xref ref-type="theorem" rid="theorem1" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">Theorem 1</xref> reveals that, for a diverse family of MIMO systems, the optimal transmit covariance matrices associated with different objective functions under various power constraints can be derived in a unified water-filling-like form. To apply <xref ref-type="theorem" rid="theorem1" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">Theorem 1</xref> to the case of multiple general power constraints, we first transform the multiple power constraints into an equivalent single constraint by introducing weighting factors based on Pareto optimization theory. The optimal weighting factors can be found by the proposed modified subgradient method. For MIMO systems with imperfect CSI subject to statistical errors, we address the non-convexity of the robust optimization problem by means of alternating optimization. 
Finally, our numerical results verify the optimal solution structure in <xref ref-type="theorem" rid="theorem1" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">Theorem 1</xref> and the global optimality of the proposed modified subgradient method, and demonstrate the performance advantages of the proposed alternating optimization algorithm.
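
For readers unfamiliar with the term "water-filling", the following minimal sketch illustrates only the classical special case: maximizing the sum of log(1 + g_i p_i) over eigenmode powers p_i under a single sum-power constraint with perfect CSI. It is not the generic solution of Theorem 1, and the function name `water_filling` and its arguments are assumptions made for illustration. In that classical setting, the transmit covariance matrix is built from the right singular vectors of the channel, with the powers below on its diagonal.

```python
def water_filling(gains, total_power):
    """Classical water-filling power allocation (illustrative sketch).

    Maximizes sum(log(1 + g_i * p_i)) subject to sum(p_i) = total_power
    and p_i >= 0. `gains` are strictly positive eigenmode gains, e.g.
    squared channel singular values divided by the noise power
    (a hypothetical input convention, not taken from the paper).
    Returns (powers, water_level).
    """
    if not gains:
        raise ValueError("gains must be non-empty")
    if any(g <= 0 for g in gains):
        raise ValueError("gains must be strictly positive")

    # Sort eigenmodes from strongest to weakest.
    order = sorted(range(len(gains)), key=lambda i: gains[i], reverse=True)
    inv = [1.0 / gains[i] for i in order]

    # Shrink the active set until the water level exceeds the
    # inverse gain of the weakest active eigenmode.
    k = len(inv)
    while True:
        level = (total_power + sum(inv[:k])) / k
        if level > inv[k - 1] or k == 1:
            break
        k -= 1

    # Active eigenmodes receive (water level - inverse gain); the rest get zero.
    powers = [0.0] * len(gains)
    for rank in range(k):
        powers[order[rank]] = max(level - inv[rank], 0.0)
    return powers, level
```

For example, `water_filling([4.0, 1.0], 1.0)` activates both eigenmodes, whereas lowering the budget to `0.5` switches the weaker eigenmode off and gives all the power to the stronger one.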