Efficient Algebraic Multigrid Methods for Multilevel Overlapping Coclustering of User-Item Relationships

Haifeng Xu, Rasha F Kashef, Geoffrey Sanders, Hans De Sterck

doi:10.1287/ijoc.2021.1137

Abstract

Various digital data sets that encode user-item relationships contain a multilevel overlapping cluster structure. The user-item relation can be encoded in a weighted bipartite graph, and uncovering these overlapping coclusters of users and items at multiple levels in the bipartite graph can play an important role in analyzing user-item data in many applications. For example, in online marketing tasks such as placing online ads or deploying smart marketing strategies, identifying co-occurring clusters of users and items can lead to accurately targeted advertisements and better marketing outcomes. In this paper, we propose fast algorithms inspired by algebraic multigrid methods for finding multilevel overlapping cocluster structures of feature matrices that encode user-item relations. Starting from the weighted bipartite graph structure of the feature matrix, the algorithms use agglomeration procedures to recursively coarsen the bipartite graphs that represent the relations between the coclusters on increasingly coarser levels. New fast coarsening routines are described that circumvent the bottleneck of all-to-all similarity computations by exploiting measures of direct connection strength between row and column variables in the feature matrix. Providing accurate coclusters at multiple levels in a manner that can scale to large data sets is a challenging task. To address it, we propose heuristic algorithms that approximately and recursively minimize normalized cuts to obtain coclusters in the aggregated bipartite graphs on multiple levels of resolution.
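To make the normalized-cut objective concrete, the following is a minimal sketch, assuming the standard two-way normalized cut on the bipartite graph whose edge weights are the nonnegative entries of the feature matrix. The function name `bipartite_ncut` and its arguments are hypothetical illustrations; the paper's heuristics minimize such cuts only approximately and on the aggregated graphs, and their exact objective may differ.

```python
import numpy as np

def bipartite_ncut(A, rows, cols):
    """Two-way normalized cut of the cocluster (rows, cols) versus the rest.

    A is an (users x items) matrix of nonnegative weights; rows and cols are
    the user and item indices placed in the cocluster. This is the textbook
    definition, used here as an assumption for illustration only.
    """
    A = np.asarray(A, dtype=float)
    m, n = A.shape
    r = np.zeros(m, dtype=bool)
    r[list(rows)] = True
    c = np.zeros(n, dtype=bool)
    c[list(cols)] = True

    # Weight of edges leaving the cocluster: inside users to outside items,
    # plus outside users to inside items.
    cut = A[r][:, ~c].sum() + A[~r][:, c].sum()

    # Volume = sum of weighted degrees of the vertices on each side.
    vol_in = A[r].sum() + A[:, c].sum()
    vol_out = A[~r].sum() + A[:, ~c].sum()
    if vol_in == 0 or vol_out == 0:
        return float("inf")  # degenerate split: one side has no edges
    return cut / vol_in + cut / vol_out
```

Lower values indicate a cocluster that is strongly connected internally and weakly connected to the rest of the graph; a cocluster that matches a perfectly separated block of the matrix has a cut of zero.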
Whereas the main novelty and focus of the paper lies in algorithmic aspects of reducing computational complexity to obtain scalable methods specifically for large rectangular user-item matrices, the algorithmic variants also define several new models for determining multilevel coclusters that we justify intuitively by relating them to principles that underlie collaborative filtering methods for user-item relationships. Experimental results show that the proposed algorithms successfully uncover the multilevel overlapping cluster structure for artificial and real data sets.

Summary of Contribution: This paper develops new and efficient computational methods for finding the multilevel overlapping cocluster structure of feature matrices that encode user-item relationships. We base our approach on the use of pairwise similarity measures between features, seeking clusters of points that are similar to each other and dissimilar from the points outside the cluster. We approximately solve the problem of finding optimal overlapping coclusters on multiple levels by employing a framework that is based on efficient multilevel methods that have been used previously to solve sparse linear systems and to cluster graphs. Our main contribution is that we extend these methods efficiently to find coclusters in the bipartite graphs that encode common and important user-item relationships or social network relations. The novel methods that we propose are inherently scalable to large problem sizes and are naturally able to uncover overlapping coclusters at multiple levels, whereas existing methods generally only find coclusters at the fine level. We illustrate the algorithm and its performance on some standard test problems from the literature and on a proof-of-concept real-world data set that relates LinkedIn users to their skills and expertise.
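The coarsening idea can be illustrated with a small sketch. It is not the paper's routine: the strength measure (edge weight scaled by the square root of the two vertex degrees), the greedy matching rule, and the names `coarsen_once`, `row_agg`, and `col_agg` are all assumptions chosen for illustration. The sketch only shows how aggregates can be formed from direct row-column connection strengths, without any all-to-all similarity computation, and how a coarse bipartite graph between aggregates is then assembled as in algebraic multigrid.

```python
import numpy as np

def coarsen_once(A):
    """One hypothetical coarsening step for a nonnegative feature matrix A.

    Returns (A_c, row_agg, col_agg): the coarse matrix between aggregates and
    the aggregate index assigned to each row and each column.
    """
    A = np.asarray(A, dtype=float)
    m, n = A.shape
    d_row, d_col = A.sum(axis=1), A.sum(axis=0)

    # Assumed strength measure: A_ij / sqrt(deg_i * deg_j), zero where undefined.
    scale = np.sqrt(np.outer(d_row, d_col))
    S = np.divide(A, scale, out=np.zeros_like(A), where=scale > 0)

    row_agg = np.full(m, -1)
    col_agg = np.full(n, -1)
    k = 0

    # Greedy pass: visit existing edges from strongest to weakest and seed an
    # aggregate whenever both endpoints are still unassigned.
    ii, jj = np.nonzero(A)
    order = np.argsort(-S[ii, jj], kind="stable")
    for i, j in zip(ii[order], jj[order]):
        if row_agg[i] < 0 and col_agg[j] < 0:
            row_agg[i] = col_agg[j] = k
            k += 1

    # Leftover rows join the aggregate of their strongest column; every
    # neighbor of a leftover row was already assigned in the greedy pass.
    for i in np.flatnonzero(row_agg < 0):
        if d_row[i] > 0:
            row_agg[i] = col_agg[np.argmax(S[i])]
        else:  # isolated row: keep it as its own aggregate
            row_agg[i] = k
            k += 1
    # Leftover columns are treated symmetrically.
    for j in np.flatnonzero(col_agg < 0):
        if d_col[j] > 0:
            col_agg[j] = row_agg[np.argmax(S[:, j])]
        else:
            col_agg[j] = k
            k += 1

    # Piecewise-constant membership matrices and the coarse matrix P_r^T A P_c.
    P_r = np.zeros((m, k))
    P_r[np.arange(m), row_agg] = 1.0
    P_c = np.zeros((n, k))
    P_c[np.arange(n), col_agg] = 1.0
    return P_r.T @ A @ P_c, row_agg, col_agg
```

Calling `coarsen_once` again on the returned coarse matrix would give the next, coarser level, which is one simple way to obtain a multilevel hierarchy. This sketch assigns each vertex to exactly one aggregate, so it does not capture the overlapping coclusters the paper targets, and it uses dense arrays for brevity where a scalable implementation would use sparse storage.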

Journal: INFORMS Journal on Computing
Publication Date: Jan 31, 2022
Citations: 1

Similar Papers

Characterization of peptide-protein relationships in protein ambiguity groups via bipartite graphs.
Karin Schork ... Jon M Jacobs
PloS one | VOL. 17
21 Oct 2022

Multi-resolution techniques for visual exploration of large time-series data
23 May 2007

Lifting Biorthogonal B-spline Wavelets
Martin Bertram
01 Jan 2004

Clustering analysis of proteins from microbial genomes at multiple levels of resolution.
Leonid Zaslavsky ... Tatiana Tatusova
BMC Bioinformatics | VOL. Suppl 17 8
01 Aug 2016
