Abstract

In several applications, data are recorded as intervals, such as the monthly temperatures at a meteorological station or the daily pollution levels at different locations. This paper proposes partitioning clustering algorithms for interval-valued data based on adaptive Euclidean and City-Block distances. Because some boundary variables may be more relevant to the clustering process than others, the proposed methods assign joint relevance weights to the lower and upper boundaries of the interval-valued variables. Consequently, they can recognize clusters of different shapes and sizes in subspaces of the variables, and even in specific boundaries of the interval-valued data. In addition, robust dissimilarity functions are introduced to reduce the influence of outliers. The adaptive distances change at each iteration of the algorithms and can differ from one cluster to another. The methods optimize an objective function by alternating three steps, which compute the representatives of each group, the cluster partition, and the relevance weights of the interval-valued variables. Experiments on synthetic and real data sets corroborate the robustness and usefulness of the proposed adaptive clustering methods.
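
To make the three alternating steps concrete, the following is a minimal sketch rather than the paper's algorithm. It assumes a squared Euclidean dissimilarity, cluster means as representatives, and per-cluster weights constrained to have a product of one, a constraint that is common in the adaptive-distance literature but is not stated in this abstract. The function name `adaptive_interval_kmeans` and all parameter names are hypothetical. The City-Block and robust variants are not covered.

```python
import numpy as np

def adaptive_interval_kmeans(lower, upper, n_clusters, n_iter=50, seed=0):
    """Illustrative sketch: hard clustering of intervals with per-cluster weights.

    lower, upper: arrays of shape (n, p) holding the interval boundaries.
    Assumption: every lower and upper boundary receives its own weight, and the
    weights of each cluster multiply to one.
    """
    # Stack the boundaries so that each of the 2p boundaries is weighted separately.
    X = np.hstack([lower, upper])                      # shape (n, 2p)
    n, d = X.shape
    rng = np.random.default_rng(seed)
    protos = X[rng.choice(n, size=n_clusters, replace=False)].copy()
    weights = np.ones((n_clusters, d))
    labels = np.full(n, -1)
    eps = 1e-10                                        # guards against zero dispersion

    for _ in range(n_iter):
        # Partition step: assign each object to the cluster with the smallest
        # weighted squared distance to its representative.
        diff = X[:, None, :] - protos[None, :, :]      # shape (n, K, 2p)
        dist = (weights[None, :, :] * diff ** 2).sum(axis=2)
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break                                      # partition is stable
        labels = new_labels

        for k in range(n_clusters):
            members = X[labels == k]
            if len(members) == 0:
                continue                               # keep the previous state
            # Representation step: boundary-wise mean of the cluster members.
            protos[k] = members.mean(axis=0)
            # Weighting step: geometric mean of the dispersions divided by each
            # dispersion, which makes the weights of cluster k multiply to one.
            disp = ((members - protos[k]) ** 2).sum(axis=0) + eps
            weights[k] = np.exp(np.log(disp).mean()) / disp

    return labels, protos, weights

# Usage on synthetic intervals built from two groups of midpoints.
rng = np.random.default_rng(1)
centers = np.vstack([rng.normal(0, 1, (20, 2)), rng.normal(10, 1, (20, 2))])
half_width = rng.uniform(0.1, 1.0, (40, 2))
lower, upper = centers - half_width, centers + half_width
labels, protos, weights = adaptive_interval_kmeans(lower, upper, n_clusters=2)
```

Under these assumptions, a boundary with small within-cluster dispersion receives a large weight in that cluster, which is one way the relevance of individual lower or upper boundaries can differ from one cluster to another.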

