Pattern Discovery in High Dimensional Binary Data

Peng Jiang, Michael T Heath

doi:10.1109/icdmw.2013.154

Abstract

High-dimensional binary datasets arise in many application areas and pose significant challenges for data analysis. Pattern discovery is a key technique for analyzing such datasets. This paper presents algorithms for binary matrix factorization (BMF), which compresses a large dataset into a much smaller set of dominant patterns for subsequent applications. BMF is the problem of finding two low-rank binary matrices such that the difference between their product and a given binary matrix is minimal. One factor contains the dominant patterns, and the other shows how the original patterns are represented by the dominant ones. Determining the exact optimal solution is NP-hard. We show that BMF is closely related to k-means clustering and propose a clustering approach for BMF. We prove that our approach has an approximation ratio of 2. We further propose a randomized clustering algorithm that chooses k cluster centroids at random according to probabilities preassigned to each point. The randomized clustering algorithm works well for large k. We experimentally demonstrate the favorable theoretical properties of BMF in applications to pattern extraction and association rule mining.
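To make the link between BMF and k-means concrete, the following is a minimal sketch of a clustering-style factorization. It is not the authors' algorithm. It assumes a constrained variant in which every data row is represented by exactly one pattern, uses Hamming distance as the error measure, and updates each pattern by a column-wise majority vote. The function name `bmf_by_clustering`, the helper `hamming`, the parameters, and the default of uniform sampling for the initial centroids are all hypothetical; the abstract does not state which preassigned probabilities the paper uses, so `probs` is left as an optional input.

```python
import numpy as np

def hamming(X, H):
    """Hamming distance from every row of X to every row of H, shape (n, k)."""
    return (X[:, None, :] != H[None, :, :]).sum(axis=2)

def bmf_by_clustering(X, k, probs=None, n_iter=20, seed=0):
    """Approximate a 0/1 matrix X (n x m) by W @ H with binary W (n x k), H (k x m).

    Assumption: each row of W holds exactly one 1, so every data row is
    represented by a single pattern (a row of H).
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=np.uint8)
    n = X.shape[0]
    # Sample k distinct data rows as initial patterns.
    # probs=None means uniform sampling (an assumption, not the paper's choice).
    idx = rng.choice(n, size=k, replace=False, p=probs)
    H = X[idx].copy()
    for _ in range(n_iter):
        labels = hamming(X, H).argmin(axis=1)
        new_H = H.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members):
                # Column-wise majority vote minimizes Hamming error in a cluster.
                new_H[j] = (2 * members.sum(axis=0) >= len(members)).astype(np.uint8)
        if np.array_equal(new_H, H):
            break
        H = new_H
    # Recompute assignments against the final patterns and one-hot encode them.
    labels = hamming(X, H).argmin(axis=1)
    W = np.zeros((n, k), dtype=np.uint8)
    W[np.arange(n), labels] = 1
    return W, H

# Usage: five binary rows compressed into two dominant patterns.
X = np.array([[1, 1, 0, 0],
              [1, 1, 0, 0],
              [0, 0, 1, 1],
              [0, 0, 1, 1],
              [1, 1, 0, 1]])
W, H = bmf_by_clustering(X, k=2)
error = int((X != W @ H).sum())  # number of mismatched entries
```

Here `H` plays the role of the dominant patterns and `W` records which pattern represents each original row. The reconstruction error is the count of entries where `W @ H` differs from `X`.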
