Abstract

One task of data science is decomposing large matrices in order to understand their structure. A special case is the decomposition of relations, i.e., logical matrices. In this paper, we present a method based on the similarity of rows and columns: it uses correlation clustering to cluster the rows and columns of the matrix, and rearranging them accordingly facilitates the visualization of the relation. We compare our method with Gunther Schmidt’s problems and solutions. With parameters selected from a small set, our method reproduces the original solutions; with other parameters, it provides solutions with even lower entropy.

Highlights

  • Correlation clustering is unique among clustering procedures in that it is based on similarity

  • We use the upper approximation, a concept from rough set theory, to calculate the similarity of objects, so we first describe the main points of rough set theory

  • We present a method for decomposing heterogeneous relations that uses correlation clustering and a similarity coefficient based on rough set theory

Summary

Introduction

There is a variant of matrix decomposition in which we do not factor the matrix into a product but instead break it down into independent, often homogeneous blocks by rearranging the rows and columns [3]. Because both the rows and the columns of the matrix are grouped, because these two groupings are not independent of each other, and because the matrix is cut into blocks, this task has been given several names, e.g., biclustering and block clustering. A natural generalization of this decomposition task arises when the matrix contains real rather than binary values; the decomposition then gives a clear visualization of the clusters, since at the solution the sample variance of the blocks of the decomposition is minimal.
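
To make the rearrangement idea concrete, here is a minimal sketch in Python with NumPy. It assumes the row and column cluster labels are already given; it does not implement correlation clustering or the rough-set similarity coefficient, and the names `rearrange`, `block_variances`, `relation`, `row_labels`, and `col_labels` are hypothetical, chosen only for this illustration.

```python
import numpy as np


def rearrange(matrix, row_labels, col_labels):
    """Reorder rows and columns so members of the same cluster are adjacent."""
    # A stable sort keeps the original relative order inside each cluster.
    row_order = np.argsort(row_labels, kind="stable")
    col_order = np.argsort(col_labels, kind="stable")
    return matrix[np.ix_(row_order, col_order)], row_order, col_order


def block_variances(matrix, row_labels, col_labels):
    """Variance (ddof=0) of each (row cluster, column cluster) block."""
    row_labels = np.asarray(row_labels)
    col_labels = np.asarray(col_labels)
    variances = {}
    for r in np.unique(row_labels):
        for c in np.unique(col_labels):
            block = matrix[np.ix_(row_labels == r, col_labels == c)]
            variances[(int(r), int(c))] = float(block.var())
    return variances


# A small logical matrix whose block structure is hidden by the ordering.
relation = np.array([
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
])
# Assumed labels: rows 0 and 2 belong together, as do columns 0 and 2.
row_labels = [0, 1, 0, 1]
col_labels = [0, 1, 0, 1]

rearranged, row_order, col_order = rearrange(relation, row_labels, col_labels)
variances = block_variances(relation, row_labels, col_labels)
print(rearranged)
print(variances)
```

With these labels, the rearranged matrix is block diagonal and every block is homogeneous, so each block variance is zero. For a real-valued matrix the same variance computation gives one way to compare alternative groupings.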

Theoretical Background
Rough Set Theory
Similarity
Correlation Clustering
Decomposing Heterogeneous Relations
Decomposition of Homogeneous Relations
Conclusions
