Abstract

Analysis of exoplanet data is a top priority for astrophysicists today. As the volume of incoming data grows, an efficient and reliable algorithm is needed to sort it. The data are taken from the Exoplanet Data Explorer and were cross-checked and filtered against NASA's known categorization. The planets were then sorted into 5 categories: Dwarfs, Terrestrial, Icy, Jovian and Giant planets. This paper compares expectation-maximization (EM) clustering, an unsupervised machine learning method, with logistic regression, a supervised one. Logistic regression outperformed EM, indicating that EM cannot be used to sort through the incoming data. Further analysis is necessary.

Highlights

  • Exoplanets are those celestial bodies that circle stars other than our Sun

  • The supervised machine learning algorithm chosen as a reference is multiclass logistic regression

  • For the two-class case, logistic regression defines the posterior probability of a class C as a sigmoid acting on a linear function of the feature vector φ.
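As a minimal sketch of the two-class posterior described in the last highlight, the standard textbook form is p(C1 | φ) = σ(wᵀφ) with σ(a) = 1 / (1 + exp(−a)). The paper's own notation is not shown here and may differ. The weight vector `w`, the function names and the numeric values below are hypothetical and chosen only for illustration.

```python
import math


def sigmoid(a: float) -> float:
    """Logistic sigmoid, split by sign so exp() never overflows."""
    if a >= 0:
        return 1.0 / (1.0 + math.exp(-a))
    z = math.exp(a)
    return z / (1.0 + z)


def posterior_c1(w, phi):
    """Return p(C1 | phi) = sigmoid(w . phi) for a two-class logistic model."""
    if len(w) != len(phi):
        raise ValueError("w and phi must have the same length")
    a = sum(wi * xi for wi, xi in zip(w, phi))  # linear function of the features
    return sigmoid(a)


# Hypothetical weights and features; the first entry of phi acts as a bias term.
w = [0.5, -1.0, 2.0]
phi = [1.0, 0.5, 0.25]

p1 = posterior_c1(w, phi)  # probability of class C1
p2 = 1.0 - p1              # probability of the other class
```

In the two-class case the second probability is simply the complement of the first. The multiclass version used as the reference classifier would replace the sigmoid with a softmax over one linear function per class.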


Summary

INTRODUCTION

Exoplanets are celestial bodies that orbit stars other than our Sun. They do not shine, are small compared with their stellar neighborhood, and are far from our point of observation. The explorer allows for a fast and easy graphical and numerical representation of the data. It provides an overview of the algorithms that can be applied to the given data, as well as outputs, testing information and a statistical breakdown of the results. These were decreased to 7 in preprocessing. Of that, 581 samples had the most data available, namely ≤3 zeroes per row. This includes the planets in our solar system. It was necessary to sort through known and available data to construct this feature.
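The row filter mentioned above (keeping samples with ≤3 zeroes per row) can be sketched as follows. This assumes that a zero marks a missing measurement, which the summary implies but does not state. The function names, the made-up rows and the use of seven values per row are illustrative assumptions, not the paper's actual data or code.

```python
def count_zeros(row):
    """Count the entries in a row that are exactly zero."""
    return sum(1 for value in row if value == 0)


def keep_well_populated(rows, max_zeros=3):
    """Keep only the rows with at most `max_zeros` zero entries."""
    return [row for row in rows if count_zeros(row) <= max_zeros]


# Made-up samples, seven values each (assumption: 0.0 means "not measured").
rows = [
    [1.2, 0.0, 3.4, 0.0, 5.1, 0.7, 2.2],  # 2 zeros: kept
    [0.0, 0.0, 0.0, 0.0, 1.0, 0.3, 0.0],  # 5 zeros: dropped
    [0.9, 0.0, 0.0, 0.0, 4.0, 1.1, 0.5],  # 3 zeros: kept
]

kept = keep_well_populated(rows)
```

A threshold like this trades sample count against completeness: a lower `max_zeros` leaves fewer but better-populated rows for the classifier and the clustering step.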

Dwarfs
Terrestrial
Icy or uncategorized
Jovian
Giants
CLASSIFICATION USING LOGISTIC REGRESSION
CLUSTERING USING EXPECTATION-MAXIMIZATION ALGORITHM
Findings
COMPARISON AND CONCLUSION
