Abstract

In astrostatistics, clustering and classification of astronomical objects play a very important role. In cluster analysis, the objective is to group items so that items in the same cluster are more closely related to one another than to items assigned to different clusters. The total number of clusters in a data set may be known in some cases and unknown in others. Many clustering methods are available, and they can be broadly categorized as supervised or unsupervised learning techniques. Supervised learning involves some model assumptions, whereas unsupervised learning involves no such assumptions. Within both categories, various clustering and classification methods have been developed depending on the nature of the data sets. However, it is generally difficult to compare the performance of the different techniques. Here we compare the applicability of several clustering techniques to a galaxy data set. To assess the robustness of the unsupervised methods used in our work, a few supervised learning techniques are applied as post-classification checks. Finally, the comparability of the clusters obtained by the different techniques is studied with respect to an ad hoc technique, and the clusters are further justified in terms of the astrophysical properties of the galaxies. Our main focus is on unsupervised machine learning algorithms, which are used to perform dimensionality reduction, cluster analysis, and visualization, and to identify the unsupervised technique best suited to a galaxy data set. We find that K-means performs best for the galaxy data set under consideration.
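The abstract does not describe how the authors implemented K-means, so the following is only a minimal sketch of the standard algorithm (Lloyd's iteration), intended to illustrate the stated objective of grouping items so that members of a cluster are closer to each other than to members of other clusters. The data are synthetic two-dimensional points, not the galaxy data set, and the names `kmeans`, `sq_dist`, `blob_a`, and `blob_b` are hypothetical.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Assumption: initialise centroids from k randomly chosen data points.
    centroids = rng.sample(points, k)
    labels = []
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                mean = tuple(sum(c) / len(members) for c in zip(*members))
                new_centroids.append(mean)
            else:
                # Keep the old centroid if a cluster happens to be empty.
                new_centroids.append(centroids[j])
        if new_centroids == centroids:
            break  # centroids stopped moving, so assignments are stable
        centroids = new_centroids
    return centroids, labels


# Synthetic stand-in data: two well-separated Gaussian blobs (illustrative only).
data_rng = random.Random(42)
blob_a = [(data_rng.gauss(0.0, 1.0), data_rng.gauss(0.0, 1.0)) for _ in range(50)]
blob_b = [(data_rng.gauss(20.0, 1.0), data_rng.gauss(20.0, 1.0)) for _ in range(50)]
points = blob_a + blob_b

centroids, labels = kmeans(points, k=2)
print(centroids)
```

In practice the number of clusters `k` must be supplied or estimated, which is the "known in some cases and unknown in others" situation the abstract mentions; a real analysis would more likely rely on an established library implementation than on this sketch.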
