We study the problem of providing channel state information (CSI) at the transmitter in multi-user “massive” MIMO systems operating in frequency division duplexing (FDD). The wideband MIMO channel is a vector-valued random process correlated in time, space (antennas), and frequency (subcarriers). The base station (BS) broadcasts periodically β <sub xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">tr</sub> pilot symbols from its <italic xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">M</i> antenna ports to <italic xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">K</i> single-antenna users (UEs). Correspondingly, the <italic xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">K</i> UEs send feedback messages about their channel state using β <sub xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">fb</sub> symbols in the uplink (UL). Using results from remote rate-distortion theory, we show that, as snr → ∞, the optimal feedback strategy achieves a channel state estimation mean squared error (MSE) that behaves as Θ(1) if β <sub xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">tr</sub> < <italic xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">r</i> and as Θ(snr <sup xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">–α</sup> ) when β <sub xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">tr</sub> ⩾ <italic xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">r</i> , where α = min(β <sub xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">fb</sub> / <italic xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">r</i> , 1), where <italic xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">r</i> is the rank of the channel covariance matrix. The MSE-optimal rate-distortion strategy implies encoding of long sequences of channel states, which would yield completely stale CSI and therefore poor multiuser precoding performance. Hence, we consider three practical “one-shot” CSI strategies with minimum one-slot delay and analyze their large-SNR channel estimation MSE behavior. These are: (1) digital feedback via entropy-coded scalar quantization (ECSQ), (2) analog feedback (AF), and (3) local channel estimation at the UEs via compressed sensing and digital feedback. These schemes have different requirements in terms of knowledge of the channel statistics at the UE and at the BS. In particular, the latter strategy requires no statistical knowledge and is closely inspired by a CSI feedback scheme currently proposed in 3GPP standardization. It is shown that ECSQ achieves optimal MSE at the price of a slight increase in feedback rate which vanishes for large SNR. AF achieves the optimal MSE decay rate of Θ(snr <sup xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">–1</sup> ) whenever β <sub xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">tr</sub> , β <sub xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">fb</sub> ⩾ <italic xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">r</i> but is sub-optimal if β⩾ <italic xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">r</i> and β <sub xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">fb</sub> < <italic xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">r</i> . The <italic xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">3GPP-inspired</i> scheme is shown, via numerical simulations, to achieves performance similar to ECSQ and AF when the multipath channel is sufficiently sparse in the angle-delay domain, but suffers from a large performance gap if this requirement is not met.