Abstract

Pulsar candidate sifting is an essential part of pulsar analysis pipelines for discovering new pulsars. To solve the problem of data mining of a large number of pulsar data using a Five-hundred-meter Aperture Spherical radio Telescope (FAST), a parallel pulsar candidate sifting algorithm based on semi-supervised clustering is proposed, which adopts a hybrid clustering scheme based on density hierarchy and the partition method, combined with a Spark-based parallel model and a sliding window-based partition strategy. Experiments on the two datasets, HTRU (The High Time-Resolution Universe Survey) 2 and AOD-FAST (Actual Observation Data from FAST), show that the algorithm can excellently identify the pulsars with high performance: On HTRU2, the Precision and Recall rates are 0.946 and 0.905, and those on AOD-FAST are 0.787 and 0.994, respectively; the running time on both datasets is also significantly reduced compared with its serial execution mode. It can be concluded that the proposed algorithm provides a feasible idea for astronomical data mining of FAST observation.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call