Traditional clustering algorithms aim to find a single clustering of the data. However, complex data rarely admit one accurate interpretation, and several different yet meaningful clusterings may exist. To address this situation, this paper presents a novel alternative clustering algorithm that takes existing reference clusterings as side information and incorporates this information into the multivariate Information Bottleneck (IB) method. The side information steers the learning algorithm toward an alternative clustering that differs from the existing reference clusterings, while the multivariate IB method ensures the quality of the new clustering. Our method can incorporate multiple existing reference clusterings into the alternative clustering process, and it can be used to analyze both co-occurrence and non-co-occurrence data. Moreover, our method is able to discover non-linear alternative clusterings. Experimental results on synthetic and real-world datasets demonstrate that the proposed algorithm outperforms existing state-of-the-art alternative clustering algorithms.