Abstract

The reliance on small samples and underpowered studies may undermine the replicability of scientific findings. Large sample sizes may be necessary to achieve adequate statistical power. Crowdsourcing sites such as Amazon's Mechanical Turk (MTurk) have been regarded as an economical means of achieving larger samples. Because MTurk participants may engage in behaviors that adversely affect data quality, much recent research has focused on assessing the quality of data obtained from MTurk samples. However, participants from traditional campus- and community-based samples may also engage in behaviors that adversely affect the quality of the data they provide. We compared an MTurk, a campus, and a community sample to measure how frequently participants report engaging in problematic respondent behaviors. We report evidence suggesting that participants from all three samples engage in problematic respondent behaviors at comparable rates. Because statistical power is influenced by factors beyond sample size, including data integrity, methodological controls must be refined to better identify and reduce participant engagement in problematic respondent behaviors.
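
To make the sample-size point concrete, the sketch below shows an a priori power analysis for an independent-samples t-test using Python's statsmodels. The target effect size (d = 0.30, roughly the smallest effect reported in the highlights below), alpha, and power values are illustrative assumptions, not values taken from the paper.

```python
# Minimal a priori power analysis for a two-group design.
# d = 0.30, alpha = .05, and power = .80 are illustrative assumptions.
from statsmodels.stats.power import TTestIndPower

n_per_group = TTestIndPower().solve_power(
    effect_size=0.30,          # Cohen's d to detect
    alpha=0.05,                # two-tailed significance level
    power=0.80,                # desired statistical power
    alternative='two-sided',
)
print(round(n_per_group))      # ~176 participants per group
```

Detecting a small-to-medium effect at conventional power already demands several hundred participants in total, which is the practical pressure that pushes researchers toward crowdsourced samples.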

Highlights

  • Concerns have been raised in recent years about the replicability of published scientific studies and the accuracy of reported effect sizes, which are often distorted as a function of underpowered research designs [1,2,3,4]

  • We examined whether Mechanical Turk (MTurk) participants engaged in potentially problematic respondent behaviors more frequently than participants from more traditional laboratory-based samples, and whether behavior among participants from more traditional samples was uniform across different laboratory-based sample types

  • The first orthogonal contrast revealed that MTurk participants were more likely than campus and community participants to complete a study while multitasking (t(512) = -5.90, p = 6.76 × 10⁻⁹, d = .52), to leave the page of a study and return at a later point in time (t(512) = -4.72, p = 3.01 × 10⁻⁶, d = .42), to look for studies by researchers they already know (t(512) = -9.57, p = 4.53 × 10⁻²⁰, d = .85), and to contact a researcher if they find a glitch in their survey (t(512) = -3.35, p = .001, d = .30)
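
The contrast above pits MTurk against the average of the campus and community groups. As a hedged illustration (our construction, not the authors' analysis code), a planned contrast of this kind can be computed from the group means and a pooled within-group error term; the group arrays and the (2, -1, -1) weights below are hypothetical.

```python
import numpy as np
from scipy import stats

def contrast_t(groups, weights):
    """Planned contrast across k independent groups with a pooled error term.

    groups  -- list of 1-D NumPy arrays of scores, one per sample
    weights -- contrast weights summing to zero, e.g. (2, -1, -1)
               to compare MTurk against the mean of campus and community
    Returns (t, df, p) for a two-tailed test.
    """
    k = len(groups)
    ns = np.array([len(g) for g in groups])
    means = np.array([g.mean() for g in groups])
    # MSE: pooled within-group variance, as in a one-way ANOVA
    ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
    df = int(ns.sum() - k)
    mse = ss_within / df
    w = np.asarray(weights, dtype=float)
    psi = (w * means).sum()                  # contrast estimate
    se = np.sqrt(mse * (w ** 2 / ns).sum())  # standard error of the contrast
    t = psi / se
    p = 2 * stats.t.sf(abs(t), df)           # two-tailed p-value
    return t, df, p

# Hypothetical usage (mturk, campus, community are arrays of scores):
# t, df, p = contrast_t([mturk, campus, community], weights=(2, -1, -1))
```

With three groups the error degrees of freedom are N - 3, consistent with the df of 512 reported above.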


Introduction

Concerns have been raised in recent years about the replicability of published scientific studies and the accuracy of reported effect sizes, which are often distorted as a function of underpowered research designs [1,2,3,4]. Crowdsourcing platforms such as MTurk offer one economical route to the larger samples that adequate statistical power can require. Data collected on MTurk have been shown to be generally comparable to data collected in the laboratory and in the community for many psychological tasks, including cognitive, social, and judgment and decision-making tasks [10,11,12,13]. This has generally been taken as evidence that data from MTurk are of high quality, reflecting an assumption that laboratory-based data collection is a gold standard in scientific research.

