Machine learning based stellar classification with highly sparse photometry data.

Seán Enis Cody, Sebastian Scher, Iain McDonald, Albert Zijlstra, Emma Alexander, Nick Cox

doi:10.12688/openreseurope.17023.2

Abstract

Identifying stars belonging to different classes is vital for building statistical samples of the different phases and pathways of stellar evolution. In the era of surveys covering billions of stars, an automated method of identifying these classes becomes necessary. Many classes of stars are identified based on their emitted spectra. In this paper, we combine the multi-class, multi-label Machine Learning (ML) method XGBoost with the PySSED spectral-energy-distribution fitting algorithm to classify stars into nine classes based on their photometric data. The classifier is trained on subsets of the SIMBAD database. Particular challenges are the very high sparsity (large fraction of missing values) of the underlying data and the high class imbalance. We discuss the different variables available: photometric measurements on the one hand, and indirect predictors such as Galactic position on the other. We show the difference in performance when certain variables are excluded, and discuss the contexts in which each variable should be used. Finally, we show that increasing the number of samples of a particular type of star significantly increases the performance of the model for that type, while having little to no impact on other types. The accuracy of the main classifier is ∼0.7, with a macro F1 score of 0.61. While the current accuracy is not high enough for the classifier to be used reliably in stellar classification, this work is an initial proof of feasibility for using ML to classify stars based on photometry.
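
The abstract reports both an accuracy and a macro F1 score, and under high class imbalance the two can diverge considerably. The following is a minimal sketch of that effect, not the authors' evaluation code: it uses a single-label simplification, hypothetical class names (`common`, `rare_a`, `rare_b`), and toy labels that are not taken from the paper. The helper functions `accuracy` and `macro_f1` are illustrative and written with the standard library only.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores, so every class counts equally."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    scores = []
    for label in labels:
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy labels (assumption, not data from the paper):
# one common class and two rare classes.
y_true = ["common"] * 8 + ["rare_a"] + ["rare_b"]
# A degenerate classifier that always predicts the majority class.
y_pred = ["common"] * 10

acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred)
print(f"accuracy = {acc:.2f}, macro F1 = {f1:.2f}")
```

In this toy case the always-majority predictor reaches a high accuracy while its macro F1 is low, because the two rare classes each contribute an F1 of zero to the average. This is why a macro-averaged score is a useful companion to accuracy when classes are strongly imbalanced.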

Journal: Open Research Europe
Publication Date: Aug 28, 2024
License type: CC BY 4.0
