In this paper, we investigate the beam domain statistical channel state information (CSI) estimation for the two-dimensional (2D) beam-based statistical channel model (BSCM) in massive multi-input multi-output (MIMO) systems. The problem is to estimate the beam domain channel power matrices (BDCPMs) based on multiple received pilot signals. A received signal model showing the relation between the statistical properties of the received pilot signals and the BDCPMs is derived. On the basis of the received signal model, we formulate an optimization problem with the Kullback-Leibler (KL) divergence. By solving the optimization problem, a novel method to estimate the statistical CSI without the estimates of instantaneous CSI is proposed. We further reduce the complexity of the proposed method by utilizing the circulant structures of particular matrices in the algorithm. We also showed the generality of the proposed method by introducing another application, <italic xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">i.e</i> ., estimation of the angle domain channel power matrix. Simulation results show that the proposed method has good convergence and can obtain sparse BDCPMs. Compared with the regularized multiple measurement vector focal underdetermined system solver (RM-FOCUSS) algorithm, the proposed algorithm obtains overall more accurate statistical CSI with much lower complexity and brings significant performance gains when used in channel estimation.