Abstract
Analysis of exoplanet data is the top priority of astrophysicists today. With the increasing volume of incoming information, there is a need for an efficient and reliable algorithm. The data is taken from the Exoplanet Data Explorer, which was cross-checked and filtered against NASA’s known categorization. The planets were then sorted into 5 categories: Dwarfs, Terrestrial, Icy, Jovian and Giant planets. This paper compares the expectation-maximization (EM) clustering algorithm, as an unsupervised methodology, with logistic regression, as a supervised machine learning methodology. Logistic regression outperformed EM, indicating that EM cannot be used to sort through the incoming data. Further analysis is necessary.
Highlights
Exoplanets are those celestial bodies that circle stars other than our Sun
The supervised machine learning algorithm chosen as a reference is multiclass logistic regression
For the two-class case, logistic regression defines the posterior probability of a class C as a sigmoid acting on a linear function of the feature vector φ
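The two-class definition above can be written out explicitly. What follows is the standard textbook form, not a formula recovered from this text; the weight vector $w$ and the class labels $C_1$, $C_2$ are notation assumed here for illustration.

```latex
p(C_1 \mid \phi) = \sigma\!\left(w^{\mathsf{T}} \phi\right),
\qquad
\sigma(a) = \frac{1}{1 + e^{-a}},
\qquad
p(C_2 \mid \phi) = 1 - p(C_1 \mid \phi)
```

In the multiclass setting, the sigmoid is commonly replaced by a softmax over one linear function per class.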
Summary
Exoplanets are those celestial bodies that circle stars other than our Sun. They do not shine, are small compared with their stellar neighborhood, and are far away from our observational point. The explorer allows for a fast and easy graphical and numerical representation of the data. It provides an overview of the algorithms that can be applied to the given data, as well as outputs, testing information and a statistical breakdown of the results. These were reduced to 7 in preprocessing. Of these, 581 samples had the most data available, namely ≤3 zeroes per row. This includes the planets in our solar system. It was necessary to sort through known and available data to construct this feature.
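The row filter mentioned above (keeping samples with ≤3 zeroes per row) could be sketched as follows. This is a minimal illustration, assuming that a zero marks a missing measurement in a numeric array; the function name `filter_sparse_rows` and the toy array are hypothetical and are not taken from the paper.

```python
import numpy as np


def filter_sparse_rows(features: np.ndarray, max_zeros: int = 3) -> np.ndarray:
    """Keep rows that contain at most `max_zeros` zero entries.

    Assumption: a zero encodes a missing value in the data table.
    """
    zeros_per_row = (features == 0).sum(axis=1)
    return features[zeros_per_row <= max_zeros]


# Hypothetical toy data; the values carry no physical meaning.
toy = np.array([
    [1.0, 0.0, 2.5, 0.0, 3.1, 0.0, 4.2],  # 3 zeros: kept
    [0.0, 0.0, 0.0, 0.0, 1.2, 2.3, 3.4],  # 4 zeros: dropped
    [5.0, 1.1, 2.2, 3.3, 4.4, 5.5, 6.6],  # 0 zeros: kept
])
kept = filter_sparse_rows(toy)
print(kept.shape)
```

The threshold is exposed as a parameter so that the effect of a stricter or looser cut-off on the sample count can be checked quickly.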