Abstract

Correlation clustering (CC) is a clustering method that takes a signed graph as input and does not require the number of clusters to be specified a priori. It has been widely used in real applications such as social networks and text mining. However, existing exact and approximate algorithms often give unsatisfactory results, especially on large-scale signed graphs. This paper tackles this problem and proposes a novel CC algorithm, termed star-based learning correlation clustering (SL-CC). The proposed SL-CC has two phases. The first is a scale reduction for signed graphs. We propose a special motif, called a star structure, for reducing the scale of signed graphs. We assign the same cluster label to all vertices within a star structure and then merge these vertices into a single new vertex, so that a large-scale graph shrinks to a much smaller one. The second is a learning scheme for local search on the reduced graph. It discovers some important stars as cluster seeds according to the graph structure and then decides whether each of the other stars should be merged with a seed. We also construct a new integer linear programming (ILP) model based on cycle inequalities to perform the local search and produce the final clustering. Experiments and comparisons of the proposed SL-CC with several existing CC methods, on synthetic and real data sets whose signed graphs vary in scale and structure, demonstrate the efficiency and usefulness of the SL-CC algorithm.
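To make the two basic ingredients concrete, the following is a minimal Python sketch of the usual CC disagreement cost and of merging groups of vertices into super-vertices. It is an illustration under stated assumptions, not the SL-CC algorithm: the abstract does not define the star structure, so the `groups` argument is a hypothetical stand-in for already-detected stars, and summing signed weights of parallel edges during contraction is one common convention that may differ from the paper's. The function names `disagreements` and `contract` are likewise illustrative.

```python
from collections import defaultdict


def disagreements(edges, labels):
    """Weighted CC cost: positive edges that are cut plus negative edges kept inside a cluster.

    edges  : dict mapping (u, v) to a signed weight (> 0 attractive, < 0 repulsive)
    labels : dict mapping each vertex to its cluster label
    """
    cost = 0
    for (u, v), weight in edges.items():
        same = labels[u] == labels[v]
        if (weight > 0 and not same) or (weight < 0 and same):
            cost += abs(weight)
    return cost


def contract(edges, groups):
    """Merge each group of vertices into one super-vertex (indexed by group position).

    Assumption: parallel edges between two groups are combined by summing
    their signed weights; edges inside a group are dropped.
    """
    rep = {v: i for i, group in enumerate(groups) for v in group}
    merged = defaultdict(int)
    for (u, v), weight in edges.items():
        a, b = rep[u], rep[v]
        if a != b:
            merged[(min(a, b), max(a, b))] += weight
    return dict(merged)


# Toy signed graph: a positive triangle {0, 1, 2} and a positive pair {3, 4},
# joined by two negative edges.
edges = {(0, 1): 1, (0, 2): 1, (1, 2): 1, (2, 3): -1, (3, 4): 1, (1, 4): -1}
labels = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1}

cost = disagreements(edges, labels)          # no disagreements for this labeling
reduced = contract(edges, [[0, 1, 2], [3, 4]])  # two super-vertices, one negative edge
```

On the reduced graph, any later search only has to decide whether super-vertices 0 and 1 share a cluster, which is the kind of size reduction the first phase aims for.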
