Nowadays, the confidentiality of data and information is of great importance for many companies and organizations. For this reason, they may prefer not to release exact data, but instead to grant researchers access to approximate data. For example, rather than providing the exact measurements of their clients, they may only provide researchers with grouped data, that is, the number of clients falling in each of a set of non-overlapping measurement intervals. The challenge is to estimate the mean and variance structure of the hidden ungrouped data based on the observed grouped data. To tackle this problem, this work considers the exact observed data likelihood and applies the Expectation-Maximization (EM) and Monte Carlo EM (MCEM) algorithms for cases where the hidden data follow a univariate, bivariate, or multivariate normal distribution. Simulation studies are conducted to evaluate the performance of the proposed EM and MCEM algorithms. The well-known Galton data set is considered as an application example.
Read full abstract