Consider the problem of finite-rate filtering of a discrete memoryless process $\{X_i\}_{i\ge 1}$ based on its noisy observation sequence $\{Z_i\}_{i\ge 1}$, which is the output of a discrete memoryless channel (DMC) whose input is $\{X_i\}_{i\ge 1}$. When the distribution $P_{X,Z}$ of the pairs $(X_i, Z_i)$ is known, the solution to this problem for a given distortion measure is well known to be given by classical rate-distortion theory upon the introduction of a modified distortion measure. We address the case where $P_{X,Z}$, rather than being completely specified, is only known to belong to some set $\Lambda$. For a fixed encoding rate $R$, we look at the worst case, over all $\theta\in\Lambda$, of the difference between the expected distortion of a given scheme, which is not allowed to depend on the active source $\theta\in\Lambda$, and the value of the distortion-rate function at $R$ corresponding to the noisy source $\theta$. We study the minimum value of this worst-case quantity attainable by any scheme operating at rate $R$, denoted by $D(\Lambda, R)$. Linking this problem and that of source coding under several distortion measures, we prove a coding theorem for the latter problem and apply it to characterize $D(\Lambda, R)$ for the case where all members of $\Lambda$ share the same noisy marginal. For the case of a general $\Lambda$, we obtain a single-letter characterization of $D(\Lambda, R)$ for the finite-alphabet case. This gives, in particular, a necessary and sufficient condition on the set $\Lambda$ for the existence of a coding scheme which is universally optimal for all members of $\Lambda$ and characterizes the approximation-estimation tradeoff for statistical modeling of noisy source coding problems.
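As a reading aid, the two quantities described in words above can be written out as follows. This is a sketch under assumed notation, not the paper's own definitions: $\rho$ is the given distortion measure, $\rho_\theta$ the modified distortion measure for source $\theta$, $D_\theta(R)$ the distortion-rate function of the noisy source $\theta$, and $\hat{X}^n$ the reconstruction of a rate-$R$ block scheme. How the blocklength limit is taken, and whether the extrema are attained, is not specified in the text and is left open here.

```latex
\rho_\theta(z,\hat{x})
  \;=\; \mathbb{E}_\theta\!\left[\rho(X,\hat{x}) \,\middle|\, Z = z\right]
  \;=\; \sum_{x} P_\theta(x \mid z)\,\rho(x,\hat{x}),
\qquad
D(\Lambda,R)
  \;=\; \min_{\text{rate-}R\ \text{schemes}}\;\max_{\theta\in\Lambda}\;
        \Bigl[\,\mathbb{E}_\theta\,\rho\bigl(X^n,\hat{X}^n\bigr) - D_\theta(R)\Bigr].
```

With this notation, each $\theta\in\Lambda$ induces its own modified distortion measure on the observed sequence, which is what connects the problem to source coding under several distortion measures.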
Finally, we obtain $D(\Lambda, R)$ in closed form for cases where $\Lambda$ consists of distributions on the (channel) input-output pair of a Bernoulli source corrupted by a binary-symmetric channel (BSC). In particular, for the case where $\Lambda$ consists of two sources, the all-zero source corrupted by a BSC with crossover probability $r$ and the Bernoulli($r$) source with a noise-free channel, we find that universality becomes increasingly hard with increasing rate.
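To make the two-source example concrete, the following minimal sketch computes the modified distortion measure that each member of $\Lambda$ induces on the noisy observation. It assumes Hamming distortion and an illustrative value r = 0.1, neither of which is stated in the text; the function and variable names are hypothetical and chosen only for this illustration.

```python
import numpy as np

def modified_distortion(p_xz, rho):
    """Return rho'(z, xhat) = sum_x P(x | z) * rho(x, xhat).

    p_xz: joint pmf of (X, Z), indexed [x, z].
    rho:  distortion matrix, indexed [x, xhat].
    """
    p_z = p_xz.sum(axis=0)  # marginal of the noisy observation Z
    # Posterior P(x | z); columns with P(z) = 0 are left as zeros.
    posterior = np.divide(p_xz, p_z, out=np.zeros_like(p_xz), where=p_z > 0)
    return posterior.T @ rho  # indexed [z, xhat]

r = 0.1  # illustrative value (assumption)
hamming = np.array([[0.0, 1.0],
                    [1.0, 0.0]])  # assumed distortion measure

# All-zero source through a BSC(r): X = 0 always, Z ~ Bernoulli(r).
p_zero_bsc = np.array([[1 - r, r],
                       [0.0,   0.0]])
# Bernoulli(r) source through a noise-free channel: Z = X.
p_bern_clean = np.array([[1 - r, 0.0],
                         [0.0,   r]])

rho_zero_bsc = modified_distortion(p_zero_bsc, hamming)
rho_bern_clean = modified_distortion(p_bern_clean, hamming)

print("P_Z (all-zero + BSC):   ", p_zero_bsc.sum(axis=0))
print("P_Z (Bernoulli, clean): ", p_bern_clean.sum(axis=0))
print("modified distortion (all-zero + BSC):\n", rho_zero_bsc)
print("modified distortion (Bernoulli, clean):\n", rho_bern_clean)
```

Under these assumptions the two sources induce the same Bernoulli(r) marginal on the observation, yet their modified distortion measures differ: the first penalizes any reconstruction symbol other than 0 regardless of the observation, while the second is ordinary Hamming distortion between the observation and the reconstruction. A single scheme therefore has to serve two distortion measures on the same observed process.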