Abstract

Nonnegative Matrix Factorization (NMF) is a widely used method in machine learning and data analysis for feature extraction and dimensionality reduction of nonnegative data. Recently, it has become increasingly popular for processing massive data, and various distributed NMF algorithms have been developed. In this paper, we propose a computational strategy for implementing the Hierarchical Alternating Least Squares (HALS) algorithm using the MapReduce programming paradigm. This approach yields a scalable HALS NMF that can be implemented on parallel and distributed computer architectures. The scalability and efficiency of the proposed algorithm are confirmed in numerical experiments performed on large-scale synthetic and recommendation system datasets.
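
For illustration only, the sketch below shows one way the standard HALS updates for A ≈ WH can be organized in a map/reduce style. It is not the computational strategy proposed in the paper, whose details are not given in this abstract. It assumes A is partitioned into row blocks, and every function name (`map_block`, `reduce_pairs`, `update_H`, `update_W_block`, `hals_nmf_blocks`) is hypothetical. The map step computes the partial products W_b^T A_b and W_b^T W_b for each block, the reduce step sums them, and the resulting small matrices drive the row-wise update of H. The columns of each W block are then updated locally.

```python
import numpy as np
from functools import reduce

EPS = 1e-12  # small floor that keeps factors nonnegative and avoids division by zero


def map_block(A_b, W_b):
    """Map step (assumed row partitioning): partial products for one row block."""
    return W_b.T @ A_b, W_b.T @ W_b


def reduce_pairs(x, y):
    """Reduce step: sum the partial products coming from two blocks."""
    return x[0] + y[0], x[1] + y[1]


def update_H(H, WtA, WtW):
    """Standard HALS update of the rows of H, one row at a time."""
    for k in range(H.shape[0]):
        denom = max(WtW[k, k], EPS)
        H[k, :] = np.maximum(H[k, :] + (WtA[k, :] - WtW[k, :] @ H) / denom, EPS)
    return H


def update_W_block(A_b, W_b, H):
    """Standard HALS update of the columns of W, restricted to one row block."""
    AHt = A_b @ H.T
    HHt = H @ H.T
    for k in range(W_b.shape[1]):
        denom = max(HHt[k, k], EPS)
        W_b[:, k] = np.maximum(W_b[:, k] + (AHt[:, k] - W_b @ HHt[:, k]) / denom, EPS)
    return W_b


def hals_nmf_blocks(A, rank, n_blocks=4, n_iter=50, seed=0):
    """Run HALS on A split into row blocks; returns the factors W and H."""
    rng = np.random.default_rng(seed)
    A_blocks = np.array_split(A, n_blocks, axis=0)
    W_blocks = [rng.random((A_b.shape[0], rank)) for A_b in A_blocks]
    H = rng.random((rank, A.shape[1]))
    for _ in range(n_iter):
        # Map over blocks, then reduce the partial sums to W^T A and W^T W.
        WtA, WtW = reduce(reduce_pairs, map(map_block, A_blocks, W_blocks))
        H = update_H(H, WtA, WtW)
        # Each block of W depends only on its own rows of A and on H.
        W_blocks = [update_W_block(A_b, W_b, H) for A_b, W_b in zip(A_blocks, W_blocks)]
    return np.vstack(W_blocks), H
```

In this sketch the built-in `map` and `functools.reduce` stand in for a distributed runtime. Only the r × n and r × r partial products cross block boundaries, which is the property that makes this kind of update amenable to distributed execution.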
