Abstract As a new generation of large-sky spectroscopic surveys comes online, the enormous data volume poses unprecedented challenges in classifying spectra. Modern unsupervised techniques have the power to group spectra based on their dominant features, circumventing the complete reliance on training data suffered by supervised methods. We outline the use of dimensionality reduction to generate a 2D map of the structure of an intermediate-resolution spectroscopic dataset. This technique efficiently separates white dwarfs of different spectral classes in the Dark Energy Spectroscopic Instrument’s Early Data Release (DESI EDR), identifying spectral features that had been missed even by visual classification. By focusing the method on particular spectral regions, we identify white dwarfs with helium features at 90% recall, and cataclysmic variables at 100% recall, illustrating rapid selection of low-contamination samples from spectroscopic surveys. We also demonstrate the use of dimensionality reduction in a supervised manner, outlining a procedure to classify any white dwarf spectrum in comparison with those in the DESI EDR. With upcoming surveys promising tens of millions of spectra, our work highlights the potential for semi-supervised techniques as an efficient means of classification and dataset visualisation.