Space-time variational formulations and adaptive Wiener–Hermite polynomial chaos Galerkin discretizations of Kolmogorov equations in infinite dimensions, such as Fokker–Planck and Ornstein–Uhlenbeck equations for functions defined on an infinite-dimensional separable Hilbert space $$H$$, are developed. The well-posedness of these equations is proved in the Hilbert space $$\mathrm{L}^2(H,\mu)$$ of functions on the infinite-dimensional domain $$H$$ that are square-integrable with respect to a Gaussian measure $$\mu$$ whose covariance operator $$Q$$ is trace class on $$H$$. Specifically, for the infinite-dimensional Fokker–Planck equation, adaptive space-time Galerkin discretizations are introduced, based on a wavelet polynomial chaos Riesz basis obtained by tensorizing biorthogonal piecewise polynomial wavelet bases in time with a spatial Wiener–Hermite polynomial chaos arising from the Wiener–Itô decomposition of $$\mathrm{L}^2(H,\mu)$$. The resulting space-time adaptive Wiener–Hermite polynomial chaos Galerkin discretization algorithms for this infinite-dimensional PDE are proved to converge quasi-optimally: they produce sequences of finite-dimensional approximations that attain the best possible convergence rates afforded by best $$N$$-term approximations of the solution from tensor products of multiresolution (wavelet) time discretizations and the Wiener–Hermite polynomial chaos in $$\mathrm{L}^2(H,\mu)$$. As a consequence, the proposed adaptive Galerkin solution algorithms exhibit dimension-independent performance, optimal with respect to the algebraic best $$N$$-term rate afforded by the solution and, in the finite-dimensional case in particular, by the polynomial degree and regularity of the multiresolution (wavelet) time discretizations. All constants in our error and complexity bounds are shown to be independent of the number of “active” coordinates identified by the proposed adaptive Galerkin approximation algorithms.
The computational work and memory required by the proposed algorithms scale linearly with the support size of the coefficient vectors that arise in the approximations, with dimension-independent constants.
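The two building blocks named above, expansion in an orthonormal Wiener–Hermite (polynomial chaos) basis of a Gaussian $$\mathrm{L}^2$$ space and best $$N$$-term truncation of the resulting coefficient vector, can be illustrated in one dimension. The following Python sketch is purely illustrative and is not the paper's algorithm: it uses NumPy's probabilists' Hermite routines to compute chaos coefficients of a test function under a standard Gaussian measure, then retains the $$N$$ largest coefficients in modulus; all function names other than the NumPy API are hypothetical.

```python
import numpy as np
from math import factorial, sqrt
from numpy.polynomial.hermite_e import hermeval, hermegauss

def hermite_orthonormal(n, x):
    """Probabilists' Hermite He_n normalized so that
    E[h_m(X) h_n(X)] = delta_{mn} for X ~ N(0,1)."""
    coef = np.zeros(n + 1)
    coef[n] = 1.0 / sqrt(factorial(n))
    return hermeval(x, coef)

# Gauss-HermiteE quadrature integrates against exp(-x^2/2);
# dividing the weights by sqrt(2*pi) yields expectations under N(0,1).
nodes, weights = hermegauss(40)
weights = weights / np.sqrt(2.0 * np.pi)

# Chaos coefficients c_n = E[f(X) h_n(X)] for a test function.
# For f(x) = exp(x) the exact values are c_n = exp(1/2)/sqrt(n!).
f = np.exp
coeffs = np.array([np.sum(weights * f(nodes) * hermite_orthonormal(n, nodes))
                   for n in range(12)])

def best_N_term(c, N):
    """Best N-term approximation: keep the N largest entries in modulus."""
    keep = np.argsort(-np.abs(c))[:N]
    out = np.zeros_like(c)
    out[keep] = c[keep]
    return out

c4 = best_N_term(coeffs, 4)  # a 4-term chaos approximation of f
```

Because the coefficients of $$e^x$$ decay factorially, the best $$N$$-term truncation here simply keeps the lowest-order modes; in the adaptive algorithms of the paper, the analogous selection is performed over the tensorized wavelet-in-time and Hermite-chaos index set without ever fixing the active coordinates in advance.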