In this paper, a data-driven non-parametric approach is presented for forecasting the probability density evolution of stochastic dynamical systems. The method is based on stochastic Koopman operator and extended dynamic mode decomposition (EDMD). To approximate the finite-dimensional eigendecomposition of the stochastic Koopman operator, EDMD is applied to the training data set sampled from the stationary distribution of the underlying stochastic dynamical system. The family of the Koopman operators form a semigroup, which is generated by the infinitesimal generator of the stochastic dynamical system. A significant connection between the generator and Fokker-Planck operator is leveraged to construct an orthonormal basis of a weighted Hilbert space, enabling a spectral decomposition of the probability density function to be performed in this space. This approach is a data-driven method and used to predict the probability density evolution and real-time moment estimation. In the limit of the large number of snapshots and observables, the data-driven probability density approximation converges to the Galerkin projection of the semigroup solution of Fokker-Planck equation on a basis adapted to an invariant measure. This method shares similarities with diffusion forecasting but is capable of producing more accurate probability density forecasts than diffusion forecasting. A few numerical examples are presented to illustrate the performance of the proposed data-driven probability density forecast.