ABSTRACT Missing values are common in real-world data and, if not handled carefully from the outset of a study, can substantially bias survey estimates. Various methods exist for imputing missing values under different sampling procedures. Ranked set sampling (RSS) is widely recognized as more efficient than simple random sampling, yet little research has addressed RSS in the presence of missing data. This article introduces new imputation methods for estimating the population mean under RSS when data are missing. The proposed estimators combine ratio, exponential, and logarithmic estimators. Expressions for their bias and mean squared error are derived up to the first order of approximation. The estimators are evaluated through simulation studies using data generated in R and through an application to stunting and its determinants among children in Uttar Pradesh, India’s most populous state. In both settings, comparisons of percentage relative efficiency and mean squared error show that the proposed estimators outperform the existing imputation methods considered. Recommendations regarding future research are also provided for survey professionals.
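The two comparison criteria named in the abstract, mean squared error and percentage relative efficiency (PRE), can be illustrated with a minimal simulation sketch. This is not the authors' method: it uses no missing data and no imputation, and it compares only the plain sample mean under RSS and under simple random sampling. The normal population, the set size of 3, the 10 cycles, perfect ranking on the study variable itself, and all function names are assumptions made for illustration. The abstract mentions R; Python is used here only as a neutral sketch language.

```python
import random
import statistics


def srs_mean(population, n, rng):
    """Mean of a simple random sample of size n, drawn without replacement."""
    return statistics.fmean(rng.sample(population, n))


def rss_mean(population, set_size, cycles, rng):
    """Mean of a ranked set sample (assumes perfect ranking on the study variable)."""
    measured = []
    for _ in range(cycles):
        for rank in range(set_size):
            # Draw a fresh set, rank it, and measure only the unit holding this rank.
            ranked_set = sorted(rng.sample(population, set_size))
            measured.append(ranked_set[rank])
    return statistics.fmean(measured)


def mse(estimates, true_value):
    """Empirical mean squared error of a list of estimates."""
    return statistics.fmean((e - true_value) ** 2 for e in estimates)


def percent_relative_efficiency(mse_reference, mse_candidate):
    """PRE of a candidate estimator relative to a reference; above 100 favors the candidate."""
    return 100.0 * mse_reference / mse_candidate


def run_simulation(reps=2000, set_size=3, cycles=10, seed=1):
    """Return the PRE of the RSS mean relative to the SRS mean at equal sample size."""
    rng = random.Random(seed)
    # Hypothetical finite population; the distribution is an illustrative assumption.
    population = [rng.gauss(50.0, 10.0) for _ in range(5000)]
    true_mean = statistics.fmean(population)
    n = set_size * cycles  # both designs measure the same number of units
    srs = [srs_mean(population, n, rng) for _ in range(reps)]
    rss = [rss_mean(population, set_size, cycles, rng) for _ in range(reps)]
    return percent_relative_efficiency(mse(srs, true_mean), mse(rss, true_mean))
```

Calling `run_simulation()` returns a single PRE value. A value above 100 indicates that, in this simplified setting, the RSS mean has a smaller empirical mean squared error than the SRS mean based on the same number of measured units.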