Recovering geologically realistic physical property models by geophysical inversion is a long-standing challenge. Generative neural networks offer a promising path to meet this challenge because they can produce spatially complex models that exhibit the characteristics of a set of training models, even when those characteristics are difficult to quantify. In the context of geophysical inversion, these characteristics may include faults, layers, and sharp contacts between rock units. Here, we develop a framework for incorporating prior geologic knowledge into geophysical inversions using conditional variational autoencoders (CVAEs). We train a CVAE to reconstruct training density models while honoring relative gravity data. Once trained, the decoder network of the CVAE inverts gravity data. The inputs to the decoder are observed gravity data and a set of latent variables that are sampled from a standard normal distribution. The decoder maps from the observed data and latent variables to density models such that the resulting models are consistent with the training models and the input data. Consequently, the inversion fits the observed data and incorporates the information embedded in the training models. The decoder can produce many inverted models almost instantaneously, efficiently sampling an approximation to the posterior model distribution. We find that the latent variables correspond to independent, interpretable ways in which the model can vary while still honoring the observed data. We draw a connection to linear inverse theory, positing that the latent variables are analogous to the principal components of the local posterior model covariance.
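The analogy to linear inverse theory can be illustrated with a minimal sketch of the linear-Gaussian case, where a "decoder" taking observed data and standard-normal latent variables can be written in closed form. This is not the CVAE described above. The forward operator `G`, the problem sizes, the noise and prior standard deviations, and the function name `linear_decoder` are all assumptions chosen for illustration, and the prior is assumed to be zero-mean Gaussian.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy problem (not the paper's setup): 5 data, 20 model cells.
n_data, n_model = 5, 20
G = rng.normal(size=(n_data, n_model))  # stand-in linear forward operator
sigma_d, sigma_m = 0.1, 1.0             # assumed noise and prior standard deviations

# Posterior covariance for a zero-mean Gaussian prior and Gaussian noise.
C_post = np.linalg.inv(G.T @ G / sigma_d**2 + np.eye(n_model) / sigma_m**2)

# Principal components (eigenvectors) of the posterior covariance.
eigvals, eigvecs = np.linalg.eigh(C_post)


def linear_decoder(d_obs, z):
    """Map observed data and latent variables z to a model.

    Each latent variable scales one principal component of C_post,
    so z ~ N(0, I) yields samples from the Gaussian posterior.
    """
    m_mean = C_post @ G.T @ d_obs / sigma_d**2  # posterior mean
    return m_mean + eigvecs @ (np.sqrt(eigvals) * z)


# Synthetic observation from a random "true" model.
m_true = rng.normal(scale=sigma_m, size=n_model)
d_obs = G @ m_true + rng.normal(scale=sigma_d, size=n_data)

# Many posterior samples from repeated draws of the latent variables.
Z = rng.standard_normal(size=(2000, n_model))
samples = np.array([linear_decoder(d_obs, z) for z in Z])
```

In this sketch, changing a single entry of `z` moves the model along one eigenvector of `C_post`, which is the linear counterpart of a latent variable controlling one independent mode of model variation. The 15 directions in the null space of `G` are not constrained by the data and keep the prior variance `sigma_m**2`.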