The tripartite industry classification, which divides all economic activities into three sectors, reflects the dynamic process of economic development and the historical trend of change in the structure of resource allocation. The proportion of each industry has become an important indicator of the level of national economic development. These proportions are compositional data, a kind of complex multidimensional data used in many fields. All components of compositional data are non-negative and carry only relative information. In practice, compositional data may contain missing values, and general statistical analysis methods cannot be applied directly to such data. The complexity of missing values in compositional data also makes traditional imputation methods unsuitable. How to carry out effective statistical inference for compositional data with missing values has therefore recently attracted the attention of many scholars. In this paper, we focus on the imputation problem for compositional data containing missing values and propose an Adaptive Least Absolute Shrinkage and Selection Operator (ALASSO) imputation method, which obtains a complete dataset through variable selection and parameter estimation. The new method is evaluated through simulation and empirical analysis and compared with mean imputation, k-nearest neighbor imputation, and iterative regression imputation. The results show that the ALASSO imputation method has the highest accuracy across different missing rates, dimensions, and correlation coefficients.
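The statement that the components are non-negative and carry only relative information can be illustrated with a minimal sketch. The sector figures below are hypothetical, and `closure` and `alr` are illustrative helper names, not the authors' code. The additive log-ratio transform is one standard way to map compositions to unconstrained coordinates and is not necessarily the transform used in this paper.

```python
import math


def closure(parts, total=1.0):
    """Rescale non-negative parts so that they sum to `total`."""
    if any(p < 0 for p in parts):
        raise ValueError("compositional parts must be non-negative")
    s = sum(parts)
    if s <= 0:
        raise ValueError("at least one part must be positive")
    return [total * p / s for p in parts]


def alr(composition):
    """Additive log-ratio transform with the last part as reference.

    Requires strictly positive parts; zeros or missing parts must be
    handled (for example, imputed) before this step.
    """
    ref = composition[-1]
    return [math.log(p / ref) for p in composition[:-1]]


# Hypothetical output of the three sectors, in two different units.
output_a = [120.0, 450.0, 630.0]
output_b = [12.0, 45.0, 63.0]

shares_a = closure(output_a)
shares_b = closure(output_b)

# Both vectors describe the same composition: only the ratios matter.
print(shares_a)
print(shares_b)
print(alr(shares_a))
```

Because rescaling all parts by a common factor leaves the shares unchanged, a missing share cannot be filled in independently of the others; any imputed value has to stay consistent with the constraint that the parts sum to a constant.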