Abstract

Clustering is a type of unsupervised learning, in which an algorithm learns patterns from unlabelled data. Such algorithms help to find hidden dependencies in the unlabelled data itself. This paper presents a clustering method based on Breadth-First Search (BFS) on a graph. The method is built on the basic theory of clustering, according to which the distance between the two farthest points within a cluster should be less than the distance between the two closest points from different clusters. Following this rule, we define two hyperparameters of the method. The first is the maximum distance at which two points can be connected and assumed to belong to one cluster. The second is the maximum distance within a cluster, i.e. the cluster’s diameter, which is the largest distance between any two points in the cluster. With these hyperparameters and a chosen distance measure, we treat every point as a vertex of a graph; two points whose distance is within the connection threshold are neighbours, and the connection between them counts as an edge of the graph. A group of connected vertices, or a single isolated vertex, forms a graph. The diameter hyperparameter is meant to keep the data in a cluster homogeneous. Based on these assumptions, we define every such graph as a cluster. Later in the paper, we visualize the clustering of three-dimensional data points. For visualization, we use one of the most popular clustering datasets, the Iris dataset. The paper contains several examples of clustering this dataset with different hyperparameters. We use KMeans [3] as an example of a clustering method. The method based on BFS is a flexible clustering method that relies on meta-information about the distances between data points.
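The idea described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function name `bfs_clusters` and the parameter names `max_link` and `max_diameter` are hypothetical, Euclidean distance is assumed as the distance measure, and the way the diameter is enforced (a point joins a cluster only if it lies within `max_diameter` of every current member) is one possible reading of the abstract.

```python
from collections import deque
from math import dist  # Euclidean distance, Python 3.8+


def bfs_clusters(points, max_link, max_diameter):
    """Group points with BFS over an implicit neighbourhood graph.

    max_link     -- assumed name: largest distance at which two points
                    are treated as neighbours (joined by an edge).
    max_diameter -- assumed name: largest allowed distance between any
                    two points of the same cluster.
    Returns (labels, clusters): a cluster id per point, and a list of
    index lists, one per cluster.
    """
    labels = [None] * len(points)
    clusters = []
    for start in range(len(points)):
        if labels[start] is not None:
            continue  # already assigned to an earlier cluster
        cluster_id = len(clusters)
        members = [start]
        labels[start] = cluster_id
        queue = deque([start])
        while queue:
            current = queue.popleft()
            for candidate in range(len(points)):
                if labels[candidate] is not None:
                    continue
                # Edge test: the candidate must be a neighbour of `current`.
                if dist(points[current], points[candidate]) > max_link:
                    continue
                # Diameter test (assumption): the candidate must stay within
                # max_diameter of every point already in the cluster.
                if any(dist(points[candidate], points[m]) > max_diameter
                       for m in members):
                    continue
                labels[candidate] = cluster_id
                members.append(candidate)
                queue.append(candidate)
        clusters.append(members)
    return labels, clusters


if __name__ == "__main__":
    demo = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (10, 0, 0), (11, 0, 0)]
    print(bfs_clusters(demo, max_link=1.5, max_diameter=5.0))
```

In this sketch a point rejected by the diameter test stays unlabelled and later starts or joins another cluster, so the result depends on the order of the points. The pairwise distance checks make it quadratic or worse in the number of points, which is acceptable for a small dataset such as Iris but would need a spatial index for larger data.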
