Abstract

This paper proposes an exact, no-assumptions approach to dealing with incomplete sets of multivariate categorical data. An incomplete data set is regarded as a finite collection of complete data sets, and a joint distribution is obtained from each of them, at a descriptive level. The tools to simultaneously treat all the possible joint distributions compatible with an incomplete set of data are given. In particular, a linear description of the set of distributions is formulated, and it is shown that the computation of bounds on the expectation of real-valued functions under such distributions is both possible and efficient, by means of linear programming. Specific algorithms are also developed whose complexity grows linearly in the number of observations. An analysis is then carried out to estimate population probabilities from incomplete multinomial samples. The descriptive tool extends in a straightforward way to the inferential problem by exploiting Walley's imprecise Dirichlet model.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call