Abstract

We consider a regression model in which a block of observations is missing: one group of observations has all the explanatory variables (covariates) observed, and another group has only a block of the variables observed. We propose an estimator of the regression coefficients that combines two estimators, one based on the observations with no missing variables and the other based on all observations after deleting the block of variables with missing values. The proposed combined estimator is compared with the uncombined estimators. If the experimenter suspects that the variables with missing values may be deleted, a preliminary test is performed to resolve the uncertainty. If the preliminary test accepts the null hypothesis that the regression coefficients of the variables with missing values are equal to zero, then only the data with no missing values are used for estimating the regression coefficients. Otherwise the combined estimator is used. This gives a preliminary test estimator. The properties of the preliminary test estimator and comparisons of the estimators are studied in a Monte Carlo study.

Highlights

  • We consider a regression model with a block of observations missing

  • We propose an estimator that is a combination of two regression coefficient estimators, one based on the observations with no missing variables, and the other on all n observations after deleting the block of variables with missing values

  • If the preliminary test accepts the null hypothesis that the regression coefficients of the variables with missing values are equal to zero, only the data with no missing values are used for estimating the regression coefficients

Summary

Introduction

We consider a regression model with a block of observations missing. The model is written for observations indexed by i = 1, 2, ...

Chien-Pai Han

As an example, the regression of grade point average (GPA) in graduate study on undergraduate GPA, Graduate Record Examination (GRE) scores and TOEFL score can be considered. In this case the TOEFL scores of all US students are missing. We propose an estimator that is a combination of two regression coefficient estimators, one based on the observations with no missing variables, and the other based on all n observations after deleting the block of variables with missing values. This estimating procedure differs from the usual procedure of imputation.
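The ingredients named above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the paper's procedure: the toy data, the variable names `x1` and `x2`, and the helpers `solve`, `ols`, and `t_statistic` are all hypothetical, and a single-coefficient t statistic stands in for the preliminary test. The rule for weighting the two estimators into the combined estimator is given in the full text and is not reproduced here.

```python
# Illustrative sketch only: data, names, and helpers are hypothetical.

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def normal_matrix(X):
    """Return X'X for a design matrix given as a list of rows."""
    p = len(X[0])
    return [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]

def ols(X, y):
    """Least-squares coefficients from the normal equations X'X b = X'y."""
    p = len(X[0])
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    return solve(normal_matrix(X), xty)

def t_statistic(X, y, k):
    """t statistic for the null hypothesis that coefficient k is zero."""
    n, p = len(X), len(X[0])
    b = ols(X, y)
    rss = sum((yi - sum(bj * xj for bj, xj in zip(b, r))) ** 2
              for r, yi in zip(X, y))
    sigma2 = rss / (n - p)
    # k-th diagonal entry of (X'X)^{-1}: solve (X'X) v = e_k and read v[k].
    e_k = [1.0 if j == k else 0.0 for j in range(p)]
    v = solve(normal_matrix(X), e_k)
    return b[k] / (sigma2 * v[k]) ** 0.5

# Toy rows (x1, x2, y); x2 is the variable with a missing block.
complete = [(1.0, 2.0, 5.1), (2.0, 1.0, 5.9), (3.0, 4.0, 11.2),
            (4.0, 3.0, 11.8), (5.0, 5.0, 16.1), (6.0, 2.0, 14.9)]
incomplete = [(7.0, None, 17.0), (8.0, None, 21.1), (9.0, None, 22.9)]

# Estimator 1: complete observations only, all variables kept.
X_complete = [[1.0, x1, x2] for x1, x2, _ in complete]
y_complete = [y for _, _, y in complete]
b_complete = ols(X_complete, y_complete)

# Estimator 2: all observations, after deleting the variable x2.
all_rows = complete + incomplete
X_reduced = [[1.0, x1] for x1, _, _ in all_rows]
y_all = [y for _, _, y in all_rows]
b_reduced = ols(X_reduced, y_all)

# Statistic for testing whether the coefficient of x2 is zero.
t_x2 = t_statistic(X_complete, y_complete, k=2)
print(b_complete, b_reduced, t_x2)
```

Comparing `abs(t_x2)` with a critical value from the t distribution with `n - p` degrees of freedom would then decide which estimator the preliminary test selects. For data of realistic size, a linear algebra library would be a better choice than the hand-written solver used here to keep the sketch dependency-free.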

Combined Estimator of Regression Coefficients
Variable Selection after Testing the Regression Coefficients
Comparison of Estimators
Numerical Example
Findings
Conclusion