ABSTRACT Citizen science is a powerful analysis tool, capable of processing large amounts of data in a very short time. To bridge the gap between classification data products from web-based citizen science platforms and statistically robust signal significance scores, we present the Search Algorithm for Transits in the Citizen science Hunt for Exoplanets in Light curves (satchel) pipeline. This open-source, customizable pipeline was constructed to identify one-dimensional features marked by volunteers and to assign significance estimates to them. We describe the functional capabilities of the satchel pipeline through application to features in photometric time-series data from the Kepler Space Telescope, classified by volunteers as part of the Planet Hunters citizen science project hosted on the Zooniverse platform. We evaluate the satchel pipeline’s overall performance based on recovery of known signals (both simulated signals and signals corresponding to official Kepler Objects of Interest, or KOIs) and relative contamination by spurious features. We find that, for a range of pipeline hyperparameters and with a reasonable score cutoff, satchel is able to recover volunteer identifications of over 98 per cent of signals from simulations corresponding to exoplanets >2 R⊕ in radius and about 85 per cent of signals corresponding to the same size range of KOIs. satchel is transparently adaptable to other citizen science classification data sets and available on GitHub.