The Large Sky Area Multi-Object Fiber Spectroscopic Telescope (LAMOST) has observed nearly 20 million celestial objects, including a class of spectra labeled “Unknown.” Besides low signal-to-noise ratios (S/Ns), these spectra often show anomalous features that current templates do not match well. In this paper, a total of 637,889 “Unknown” spectra from LAMOST DR5 are selected, and an unsupervised analytical framework for “Unknown” spectra, named SA-Frame (Spectra Analysis-Frame), is provided to explore their origins from different perspectives. SA-Frame is composed of three parts: NAPC-Spec clustering, characterization, and origin analysis. First, NAPC-Spec (Nonparametric density clustering algorithm for spectra) characterizes the different features in the “Unknown” spectra by adjusting the influence space and the divergence distance to minimize the effects of noise and high dimensionality, resulting in 13 types. Second, characteristic extraction and representation of the clustering results are carried out based on spectral lines and continuum, and these 13 types are characterized as regular spectra with low S/Ns, splicing problems, suspected galactic emission signals, contamination from city light, and an un-gregarious type, respectively. Third, a preliminary analysis of their origins is made from the characteristics of the observational targets, contamination from the sky, and the working status of the instruments. These results should be valuable for improving the overall data quality of large-scale spectral surveys.
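The abstract does not define NAPC-Spec's influence space or divergence distance, so the following is only a generic illustration of nonparametric density clustering on spectra, not the authors' algorithm. It is a minimal sketch that assumes a symmetric Kullback-Leibler divergence between flux-normalized spectra as the distance, a k-nearest-neighbor estimate of local density, and a density-peak style assignment. All function names (`symmetric_kl`, `pairwise_divergence`, `knn_density`, `density_peak_labels`, `make_toy_spectra`) and the toy data are hypothetical.

```python
import numpy as np


def symmetric_kl(p, q, eps=1e-12):
    """Symmetric KL divergence between two positive flux vectors (assumed distance)."""
    p = np.clip(p / p.sum(), eps, None)
    q = np.clip(q / q.sum(), eps, None)
    return float(np.sum((p - q) * np.log(p / q)))


def pairwise_divergence(spectra):
    """Full symmetric matrix of divergences between all pairs of spectra."""
    n = len(spectra)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            dist[i, j] = dist[j, i] = symmetric_kl(spectra[i], spectra[j])
    return dist


def knn_density(dist, k):
    """Local density: inverse mean distance to the k nearest neighbours (self excluded)."""
    nearest = np.sort(dist, axis=1)[:, 1:k + 1]
    return 1.0 / (nearest.mean(axis=1) + 1e-12)


def density_peak_labels(dist, k, n_clusters):
    """Density-peak style clustering: pick centres, then follow denser neighbours."""
    rho = knn_density(dist, k)
    order = np.argsort(-rho)              # indices from densest to sparsest
    n = len(rho)
    delta = np.zeros(n)                   # distance to nearest denser point
    parent = np.full(n, -1)               # index of that nearest denser point
    delta[order[0]] = dist[order[0]].max()
    for rank in range(1, n):
        i = order[rank]
        denser = order[:rank]
        j = denser[np.argmin(dist[i, denser])]
        delta[i] = dist[i, j]
        parent[i] = j
    gamma = rho * delta                   # high density AND far from denser points
    gamma[order[0]] = np.inf              # the global density peak is always a centre
    centres = np.argsort(-gamma)[:n_clusters]
    labels = np.full(n, -1)
    labels[centres] = np.arange(n_clusters)
    for i in order:                       # parents are always labelled before children
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels


def make_toy_spectra(n_per_class=20, n_pix=100, seed=0):
    """Synthetic spectra: flat continuum plus one emission line at pixel 20 or 70."""
    rng = np.random.default_rng(seed)
    x = np.arange(n_pix)
    spectra = []
    for centre in (20, 70):
        line = 5.0 * np.exp(-0.5 * ((x - centre) / 2.0) ** 2)
        for _ in range(n_per_class):
            spectra.append(1.0 + line + rng.normal(0.0, 0.02, n_pix))
    return np.array(spectra)


spectra = make_toy_spectra()
labels = density_peak_labels(pairwise_divergence(spectra), k=5, n_clusters=2)
```

On this toy set the two emission-line groups end up with different labels. Real “Unknown” spectra are far noisier and higher-dimensional, which is the difficulty the abstract says NAPC-Spec addresses by adjusting its influence space and divergence distance.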