Abstract

Non-experts have long made important contributions to machine learning (ML) by contributing training data, and recent work has shown that non-experts can also help with feature engineering by suggesting novel predictive features. However, non-experts have only contributed features to prediction tasks already posed by experienced ML practitioners. Here we study how non-experts can design prediction tasks themselves, what types of tasks non-experts will design, and whether predictive models can be automatically trained on data sourced for their tasks. We use a crowdsourcing platform where non-experts design predictive tasks that are then categorized and ranked by the crowd. Crowdsourced data are collected for top-ranked tasks, and predictive models are then trained and evaluated automatically on those data. We show that individuals without ML experience can collectively construct useful datasets and that predictive models can be learned on these datasets, but challenges remain. The prediction tasks designed by non-experts covered a broad range of domains, from politics and current events to health behavior, demographics, and more. Because proper instructions are crucial for non-experts, we also conducted a randomized trial to understand how different instructions influence the types of prediction tasks being proposed. In general, a better understanding of how non-experts can contribute to ML can further leverage advances in automated machine learning (AutoML) and has important implications as ML continues to drive workplace automation.
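As a minimal sketch of how proposed tasks might be ranked by the crowd before data collection (the task names and the Borda-count aggregation here are illustrative assumptions, not the platform's actual method):

```python
# Illustrative sketch only: aggregate crowd rankings of proposed prediction
# tasks with a simple Borda count; task IDs and rankings are hypothetical.
from collections import defaultdict

def borda_rank(rankings):
    """rankings: list of orderings of task IDs (best first).
    Returns task IDs sorted by total Borda score, highest first."""
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, task_id in enumerate(ranking):
            scores[task_id] += n - position  # earlier positions earn more points
    return sorted(scores, key=scores.get, reverse=True)

# Example: three crowd workers each rank four proposed tasks.
crowd_rankings = [
    ["exercise-habits", "voting-intent", "coffee-preference", "sleep-hours"],
    ["voting-intent", "exercise-habits", "sleep-hours", "coffee-preference"],
    ["exercise-habits", "sleep-hours", "voting-intent", "coffee-preference"],
]
print(borda_rank(crowd_rankings)[:2])  # collect data only for the top-ranked tasks
```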

Highlights

  • Recent years have seen improved technologies geared towards workplace automation, and there is both promise and peril in how AI, robotics, and other technologies can alter the job security and prospects of the future workforce (David, 2015; Frank et al., 2019)

  • We study the topics and properties of the proposed prediction tasks (RQ2), show that performant predictive models can be trained on some of the proposed tasks (RQ5), and use a randomized trial to explore how the design of the task-proposal problem shapes the tasks non-experts propose (RQ3)

  • Analyzing the proposed prediction tasks, we found that non-experts tended to favor Boolean questions over numeric questions, that input questions tended to be positively correlated with target questions, and that many prediction tasks were related to demographic or personal attributes


Introduction

Recent years have seen improved technologies geared towards workplace automation, and there is both promise and peril in how AI, robotics, and other technologies can alter the job security and prospects of the future workforce (David, 2015; Frank et al., 2019). Machine learning requires experts to understand technical concepts from probability and statistics, linear algebra, and optimization; to perform predictive model construction, training, and diagnostics; and even to participate in data collection, cleaning, and validation (Domingos, 2012; Alpaydin, 2020). Such a depth of prerequisite knowledge may limit the roles of non-experts, yet fields such as AutoML (Hutter, Kotthoff & Vanschoren, 2019) promise to further enable non-expert participation in ML by removing many of the challenges that non-experts may be unable to address without training or experience (Feurer et al., 2015; Vanschoren et al., 2014). As Yang et al. (2018) remark, despite extensive research in these areas, little work has investigated how non-experts can take creative or editorial control to design their own applications of ML.
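To make concrete the kind of burden that AutoML aims to lift from non-experts, a minimal sketch of automated model selection is given below; the file name, column names, and candidate models are illustrative assumptions, the code assumes input-question responses are already numerically encoded, and this is not the system evaluated in the paper:

```python
# Illustrative sketch only: automatically train and compare baseline models
# on a crowdsourced dataset. File and column names are hypothetical.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

df = pd.read_csv("crowdsourced_task.csv")  # responses to input + target questions
X = df.drop(columns=["target"])            # input-question responses (features)
y = df["target"]                           # target-question responses (label)

candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
}

# Evaluate each candidate with 5-fold cross-validation and keep the best one,
# the sort of model selection an AutoML framework performs automatically.
scores = {name: cross_val_score(model, X, y, cv=5).mean()
          for name, model in candidates.items()}
best = max(scores, key=scores.get)
print(f"best model: {best} (mean CV accuracy {scores[best]:.3f})")
```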
