Abstract

Clustering is a machine learning technique that groups data points according to similarities in their characteristics. It is used when the specific target or expected output is not known to the data analyst. There are two basic types of clustering: hard clustering and soft clustering. The K-means algorithm performs hard clustering: each data point is assigned to exactly one cluster, based on its distance from the centroid of that cluster. Representative-based clustering partitions a given data set of n data points in an N-dimensional space into clusters, each summarized by a representative point such as a centroid. Hierarchical clustering produces a series of nested partitions: individual data points can be merged step by step into a single cluster or, in reverse, a single large cluster can be iteratively divided into smaller clusters. These two directions give the two types of hierarchical clustering, agglomerative and divisive. Outlier detection techniques fall into three types, depending on the availability of a training data set: supervised, semi-supervised, and unsupervised outlier detection.
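The hard assignment performed by K-means can be illustrated with a minimal sketch. It is an assumption-laden illustration rather than a reference implementation: the function names, the toy points, and the two starting centroids are all hypothetical, Euclidean distance is assumed, and the starting centroids are supplied by hand rather than chosen by any initialization scheme.

```python
import math


def assign_to_nearest(points, centroids):
    """Hard assignment: each point receives the index of exactly one centroid."""
    labels = []
    for p in points:
        # Euclidean distance from this point to every centroid (assumed metric)
        distances = [math.dist(p, c) for c in centroids]
        labels.append(distances.index(min(distances)))
    return labels


def update_centroids(points, labels, centroids):
    """Move each centroid to the mean of the points currently assigned to it."""
    new_centroids = []
    for k, old in enumerate(centroids):
        members = [p for p, lab in zip(points, labels) if lab == k]
        if not members:
            # An empty cluster keeps its previous centroid
            new_centroids.append(old)
            continue
        new_centroids.append(
            tuple(sum(m[d] for m in members) / len(members) for d in range(len(old)))
        )
    return new_centroids


def k_means(points, centroids, max_iter=100):
    """Alternate assignment and update steps until the labels stop changing."""
    labels = assign_to_nearest(points, centroids)
    for _ in range(max_iter):
        centroids = update_centroids(points, labels, centroids)
        new_labels = assign_to_nearest(points, centroids)
        if new_labels == labels:
            break
        labels = new_labels
    return labels, centroids


# Hypothetical toy data: two visually separate groups in a 2-dimensional space
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.0), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
labels, centroids = k_means(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
```

Each entry of `labels` is a single cluster index, which is what makes the clustering hard; a soft clustering method would instead give every point a degree of membership in each cluster.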

