Abstract

Survey research is increasingly conducted online and, as a consequence, samples only, or disproportionately, from households with internet access. Although the percentage of non-internet households has declined, it persists at about one in ten households. This raises the question of coverage error and bias, and of whether the possible biases in internet-only samples can be reduced. We processed nearly 5000 variables from over a dozen major public and private datasets to assess the extent of the bias. Finding substantive differences, we then applied a series of data-reduction techniques to arrive at 38 variables that independently skew across internet and non-internet populations. We then developed and fielded a survey of these metrics to assess which dozen or fewer could be used to construct an efficient propensity model to reduce bias in internet-only samples. Our analyses confirmed that many variables noted in prior research are important predictors of non-internet use, and also identified others. Our final propensity model of 10 variables was highly effective, reducing bias significantly; for many of the variables tested, bias was reduced fourfold. Contributions: Prior research has not investigated the digital divide across a wide array of public datasets, nor done enough to consider the bias inherent in samples of internet users only. Our novel random-forest approach and subsequent survey of key candidate variables controlled for correlations among the variables in identifying the most important variables across multiple datasets. Our coding of 542 significant predictors of internet use contributes to the sociological understanding of internet access.
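
As an illustration of the general idea behind propensity weighting for an internet-only sample, the following is a minimal sketch and not the authors' 10-variable model. It assumes a single hypothetical covariate (`age_group`), made-up toy data, and a reference sample that covers both internet and non-internet households. The propensity of having internet access is estimated within each covariate cell, and each internet-only respondent is then weighted by the inverse of that propensity.

```python
from collections import defaultdict

def cell_propensities(reference):
    """Estimate P(internet | cell) from a reference sample of (cell, has_internet) pairs."""
    totals = defaultdict(int)
    online = defaultdict(int)
    for cell, has_internet in reference:
        totals[cell] += 1
        online[cell] += int(has_internet)
    return {cell: online[cell] / totals[cell] for cell in totals}

def weighted_mean(sample, propensities):
    """Inverse-propensity-weighted mean over an internet-only sample of (cell, outcome) pairs.

    Assumes every cell in the sample has a propensity greater than zero.
    """
    num = 0.0
    den = 0.0
    for cell, outcome in sample:
        w = 1.0 / propensities[cell]
        num += w * outcome
        den += w
    return num / den

# Hypothetical reference sample: half of "older" households lack internet access.
reference = (
    [("older", True)] * 5 + [("older", False)] * 5 + [("younger", True)] * 10
)

# Hypothetical internet-only sample with a binary outcome that differs by age group.
internet_sample = [("older", 1.0)] * 5 + [("younger", 0.0)] * 10

propensities = cell_propensities(reference)
unadjusted = sum(y for _, y in internet_sample) / len(internet_sample)
adjusted = weighted_mean(internet_sample, propensities)

print(propensities)
print(round(unadjusted, 3), round(adjusted, 3))
```

In this toy setup the older group is under-represented online, so the unadjusted mean understates the outcome, and the weights restore the group's share. A model with several predictors would typically replace the per-cell estimate with a fitted classifier, which is beyond this sketch.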
