The implementation of distributed network utility maximization (NUM) algorithms hinges heavily on information feedback through message passing among network elements. In practical systems, the feedback is often obtained using error-prone measurement mechanisms and suffers from random errors. In this paper, we investigate the impact of noisy feedback on distributed NUM. We first study the distributed NUM algorithms based on the Lagrangian dual method, and focus on the primal-dual (P-D) algorithm, which is a single time-scale algorithm in the sense that the primal and dual parameters are updated simultaneously. Assuming strong duality, we study both the case where the stochastic gradients are unbiased and the case where they are biased, and develop a general theory on the stochastic stability of the P-D algorithms in the presence of noisy feedback. When the gradient estimators are unbiased, we establish, via a combination of tools in martingale theory and convex analysis, that the iterates generated by distributed P-D algorithms converge with probability one to the optimal point, under standard technical conditions. In contrast, when the gradient estimators are biased, we show that the iterates converge to a contraction region around the optimal point, provided that the bias terms are asymptotically bounded by a scaled version of the true gradients. We also investigate the rate of convergence for the unbiased case, and find that, in general, the limit process of the interpolated process corresponding to the normalized iterate sequence is a stationary reflected linear diffusion process, not necessarily a Gaussian diffusion process. We apply the above general theory to investigate the stability of cross-layer rate control for joint congestion control and random access. Next, we study the impact of noisy feedback on distributed two time-scale NUM algorithms based on primal decomposition.
We establish, via the mean ODE method, the convergence of the stochastic two time-scale algorithm under mild conditions, for the case where the gradient estimators at both time scales are unbiased. Numerical examples illustrate that, compared with its single time-scale counterpart, the two time-scale algorithm has lower complexity but is less robust to noisy feedback.
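
To make the single time-scale P-D iteration with unbiased noisy feedback concrete, the following is a minimal sketch and not the implementation studied in this paper. Everything in it is an assumption made for illustration: the toy topology (`ROUTES`, two links and three flows), logarithmic utilities, the bounds `X_MIN`, `X_MAX` and `PRICE_MAX`, the Gaussian noise model, and the step-size schedule, which is chosen only so that the step sizes sum to infinity while their squares are summable. Source rates and link prices are updated simultaneously from noisy path prices and noisy link loads.

```python
import random

# Hypothetical toy network (not from the paper): 2 links, 3 flows.
# Flow 0 crosses both links; flows 1 and 2 each use a single link.
ROUTES = [[1, 1, 0],   # link 0 carries flows 0 and 1
          [1, 0, 1]]   # link 1 carries flows 0 and 2
CAPACITY = [1.0, 1.0]
X_MIN, X_MAX = 0.05, 2.0   # assumed box constraint on source rates
PRICE_MAX = 10.0           # assumed upper bound on link prices


def clip(value, low, high):
    """Project a scalar onto the interval [low, high]."""
    return max(low, min(high, value))


def noisy_primal_dual(num_iters=20000, noise_std=0.1, seed=0):
    """Single time-scale primal-dual updates with zero-mean feedback noise.

    Utilities are assumed to be U_s(x) = log(x), so U_s'(x) = 1 / x.
    """
    rng = random.Random(seed)
    num_links, num_flows = len(ROUTES), len(ROUTES[0])
    rates = [0.5] * num_flows
    prices = [1.0] * num_links

    for n in range(num_iters):
        # Diminishing step size: sum(step) diverges, sum(step**2) converges.
        step = 0.05 / (1.0 + n / 200.0) ** 0.7

        # Noisy feedback: each source sees its path price plus noise,
        # each link sees its aggregate load plus noise (both unbiased).
        path_price = [
            sum(ROUTES[l][s] * prices[l] for l in range(num_links))
            + rng.gauss(0.0, noise_std)
            for s in range(num_flows)
        ]
        link_load = [
            sum(ROUTES[l][s] * rates[s] for s in range(num_flows))
            + rng.gauss(0.0, noise_std)
            for l in range(num_links)
        ]

        # Simultaneous projected updates of primal and dual variables.
        new_rates = [
            clip(rates[s] + step * (1.0 / rates[s] - path_price[s]),
                 X_MIN, X_MAX)
            for s in range(num_flows)
        ]
        new_prices = [
            clip(prices[l] + step * (link_load[l] - CAPACITY[l]),
                 0.0, PRICE_MAX)
            for l in range(num_links)
        ]
        rates, prices = new_rates, new_prices

    return rates, prices


if __name__ == "__main__":
    final_rates, final_prices = noisy_primal_dual()
    print("rates :", final_rates)
    print("prices:", final_prices)
```

For this toy instance the noiseless optimum can be found by hand: rates (1/3, 2/3, 2/3) and link prices (1.5, 1.5). With unbiased noise and the diminishing step size above, the iterates are expected to settle close to these values, which is consistent with the almost-sure convergence claimed for the unbiased case. Adding a persistent offset to `path_price` or `link_load` is one simple way to experiment with the biased case.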