This paper deals with linear-quadratic optimal control problems constrained by a parametric or stochastic elliptic or parabolic partial differential equation (PDE). We address the (difficult) case in which the state equation depends on a countable number of parameters, i.e., on $\sigma_j$ with $j\in\mathbb{N}$, and in which the PDE operator may depend nonaffinely on the parameters. We consider tracking-type functionals and distributed as well as boundary controls. Building on recent results in [A. Cohen, R. DeVore, and Ch. Schwab, Found. Comput. Math., 10 (2010), pp. 615--646; Anal. Appl., 9 (2011), pp. 1--37], we show that the state and the control depend analytically on these parameters $\sigma_j$. We establish sparsity of generalized polynomial chaos (gpc) expansions of both state and control in terms of the stochastic coordinate sequence $\sigma = (\sigma_j)_{j\geq 1}$ of the random inputs, and we prove convergence rates of best $N$-term truncations of these expansions. Such truncations are key to subsequent computations since they do not assume that the stochastic input data has a finite expansion. In a follow-up paper [A. Kunoth and Ch. Schwab, Sparse adaptive tensor Galerkin approximations of stochastic PDE-constrained control problems, in preparation], we explain two methods by which such best $N$-term truncations can be computed in practice: by greedy-type algorithms as in [Ch. Schwab and C. J. Gittelson, Acta Numer., 20 (2011), pp. 291--467; C. J. Gittelson, Report 2011-12, Seminar for Applied Mathematics, ETH Zürich, Zürich, Switzerland, 2011], or by multilevel Monte Carlo methods as in [F. Y. Kuo, Ch. Schwab, and I. H. Sloan, SIAM J. Numer. Anal., 50 (2012), pp. 3351--3374]. The sparsity result allows, in conjunction with adaptive wavelet Galerkin schemes, for sparse, adaptive tensor discretizations of control problems constrained by linear elliptic and parabolic PDEs developed in [W. Dahmen and A. Kunoth, SIAM J.
Control Optim., 43 (2005), pp. 1640--1675; M. D. Gunzburger and A. Kunoth, SIAM J. Control Optim., 49 (2011), pp. 1150--1170; A. Kunoth, Numer. Algorithms, 39 (2005), pp. 199--220]; see [A. Kunoth and Ch. Schwab, Sparse adaptive tensor Galerkin approximations of stochastic PDE-constrained control problems, in preparation].
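
To indicate the type of statement meant by a convergence rate of best $N$-term truncations, the following is a generic sketch in notation assumed here for illustration only; the abstract does not fix it. Let $\mathcal{F}$ denote the set of finitely supported multi-indices, let $(L_\nu)_{\nu\in\mathcal{F}}$ be tensorized polynomials that are orthonormal in $L^2(U)$ with respect to a probability measure on the parameter domain $U$, and let $X$ be a Hilbert space containing the gpc coefficients $u_\nu$ of the state (or the control):

\[
  u(\sigma) = \sum_{\nu\in\mathcal{F}} u_\nu\, L_\nu(\sigma),
  \qquad
  u_{\Lambda_N}(\sigma) = \sum_{\nu\in\Lambda_N} u_\nu\, L_\nu(\sigma),
\]

where $\Lambda_N\subset\mathcal{F}$ collects $N$ indices with the largest norms $\|u_\nu\|_X$. If the sequence $(\|u_\nu\|_X)_{\nu\in\mathcal{F}}$ belongs to $\ell^p(\mathcal{F})$ for some $0<p<1$, which is one common way of quantifying sparsity, then Parseval's identity together with Stechkin's lemma gives

\[
  \| u - u_{\Lambda_N} \|_{L^2(U;X)}
  = \Bigl( \sum_{\nu\notin\Lambda_N} \|u_\nu\|_X^2 \Bigr)^{1/2}
  \le N^{-(1/p - 1/2)}\, \bigl\| (\|u_\nu\|_X)_{\nu\in\mathcal{F}} \bigr\|_{\ell^p(\mathcal{F})} .
\]

The rate $1/p - 1/2$ depends only on the summability exponent $p$ and not on the number of active parameters, which is consistent with the remark that no finite expansion of the input data is assumed. The precise norms, polynomial systems, and rates established in the paper may differ from this sketch.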