Abstract

Sample selection models are employed when an outcome of interest is observed only for a restricted, non-randomly selected sample of the population. We consider the case in which the response is binary and continuous covariates have a nonlinear relationship to the outcome. We introduce two statistical methods for estimating two binary regression models with semiparametric predictors in the presence of non-random sample selection: a multiple-stage procedure and a newly developed simultaneous equation estimation scheme. Both approaches are based on the penalized likelihood estimation framework. The problems of identification and inference are also discussed. The empirical properties of the proposed approaches are studied through a simulation study. The methods are then illustrated using data from the American National Election Study, where the aim is to quantify public support for school integration. If non-random sample selection is neglected, the predicted probability of giving, for instance, a supportive response may be biased, an issue that can be tackled using the proposed tools.
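The bias mentioned in the last sentence can be illustrated with a small simulation. The sketch below is not the estimation method proposed in the paper; it only shows how the naive share of supportive responses among responders drifts away from the population share when the unobserved factors driving the decision to respond are correlated with those driving the response itself. The latent normal errors, the correlation `rho`, the selection intercept 0.3, and the function name `simulate` are all illustrative assumptions.

```python
import math
import random


def simulate(rho, n=100_000, seed=0):
    """Return (population share, responder share) of supportive answers.

    rho is the assumed correlation between the unobserved factors
    behind the decision to respond and those behind the answer itself.
    """
    rng = random.Random(seed)
    total_support = 0
    responders = 0
    responder_support = 0
    for _ in range(n):
        e1 = rng.gauss(0.0, 1.0)
        e2 = rng.gauss(0.0, 1.0)
        # Build two standard normal errors with correlation rho.
        u_select = e1
        u_outcome = rho * e1 + math.sqrt(1.0 - rho ** 2) * e2
        responds = 0.3 + u_select > 0   # selection equation (illustrative intercept)
        supports = u_outcome > 0        # outcome equation, true share is 0.5
        total_support += supports
        if responds:
            responders += 1
            responder_support += supports
    return total_support / n, responder_support / responders


if __name__ == "__main__":
    for rho in (0.0, 0.7):
        population, observed = simulate(rho)
        print(f"rho={rho}: population share={population:.3f}, "
              f"share among responders={observed:.3f}")
```

With `rho = 0` the two shares should agree up to simulation noise, while a positive `rho` pushes the share among responders above the population share. That gap is what a sample selection model is meant to correct.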

Highlights

  • Sample selection techniques are employed when observations are not from a random sample of the population

  • Since the parameters of the selection equation are not in principle affected by bias, we focus on the estimation results for the outcome equation

  • Although there are some differences in the significance of the parametric terms of the sample selection models, the magnitude and sign of the coefficients are similar


Summary

Introduction

Sample selection techniques are employed when observations are not from a random sample of the population. For example, if respondents who opposed government involvement in school integration chose not to answer the question because they felt their opinion might be perceived as socially unacceptable, the sample of individuals who provided an opinion may differ in systematic ways from the sample of non-respondents. To clarify this (often misunderstood) concept, let us characterize each individual by some observed and unobserved features or confounders. If the responding and nonresponding subsamples have similar characteristics, the issue of non-random sample selection does not arise, since the average (observed and unobserved) features of the responding sample are similar to those of the population. If the decision to answer is no longer random, because of differing characteristics between the responding and nonresponding individuals, biased analyses are

