Abstract

User-based collaborative filtering is one of the most widely used methods in recommender systems. However, the method is time-consuming: it requires a full scan of the entire rating data to find the neighbors of each active user, i.e., the users who have similar rating patterns, and the computation itself is algorithmically complex. Furthermore, the amount of rating data in recommender systems grows rapidly as the numbers of users, items, and rating activities increase. A big data framework with parallel processing, such as Hadoop, is therefore needed for recommender systems. There are already many studies on MapReduce-based parallel processing for collaborative filtering. However, most of them have considered neither the sequential-access restriction on executing MapReduce jobs nor the minimization of full scans over the entire data on the Hadoop Distributed File System (HDFS), which accesses data on disk sequentially. In this paper, we introduce an efficient MapReduce-based parallel processing framework for collaborative filtering that requires only a one-time parallelized full scan while adhering to the sequential access patterns of Hadoop data nodes. The proposed framework contains a novel MapReduce scheme with a partial-computation method for calculating predictions and finding recommended items for an active user within that single parallelized scan. Finally, we use the MovieLens dataset to show the validity of the proposed method, mainly in terms of the efficiency of the parallelized approach.
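For context, the computation underlying these predictions is the standard user-based formulation from the collaborative filtering literature (shown here generically for illustration; the paper's exact variant may differ). Writing $r_{u,i}$ for user $u$'s rating of item $i$, $\bar{r}_u$ for $u$'s mean rating, and $I_u$ for the set of items rated by $u$, the similarity between an active user $a$ and a candidate neighbor $u$, and the predicted rating of $a$ for an unseen item $i$, are

\[
\mathrm{sim}(a,u) = \frac{\sum_{i \in I_a \cap I_u} (r_{a,i} - \bar{r}_a)(r_{u,i} - \bar{r}_u)}
{\sqrt{\sum_{i \in I_a \cap I_u} (r_{a,i} - \bar{r}_a)^2}\,\sqrt{\sum_{i \in I_a \cap I_u} (r_{u,i} - \bar{r}_u)^2}},
\qquad
P_{a,i} = \bar{r}_a + \frac{\sum_{u \in N(a)} \mathrm{sim}(a,u)\,(r_{u,i} - \bar{r}_u)}{\sum_{u \in N(a)} \lvert \mathrm{sim}(a,u) \rvert},
\]

where $N(a)$ is the set of $a$'s neighbors. Evaluating $\mathrm{sim}(a,u)$ over all candidate neighbors is exactly what forces a scan of the entire rating data, which is the cost the one-time parallelized scan is meant to bound.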

Highlights

  • Collaborative filtering is a method for recommender systems, which are software systems that recommend data items a user is likely to prefer by predicting the user's preference for items the user has not yet seen [1,2]

  • We introduce an efficient MapReduce-based parallel processing framework for collaborative filtering that requires only a one-time parallelized full scan while adhering to the sequential access patterns of Hadoop data nodes

  • As the goal of this research was to transform the user-based collaborative filtering method into a MapReduce-based parallel processing method, the accuracy of the proposed method should be identical to that of the original collaborative filtering method


Summary

Introduction

Collaborative filtering is a method for recommender systems, which are software systems that recommend data items a user is likely to prefer by predicting the user's preference for items the user has not yet seen [1,2]. Recommender systems must maintain and manage large amounts of data, and must process and analyze those data in parallel. The Hadoop framework [5] was introduced to process and analyze such large data. Hadoop stores and manages large sets of data through two components: the HDFS (Hadoop Distributed File System) [6] and the MapReduce [7] framework. MapReduce is a parallel processing programming framework consisting of JobTracker and TaskTracker components, where the JobTracker schedules a job's map and reduce tasks and the TaskTrackers execute those tasks on the cluster's data nodes.
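As an illustration of these Mapper and Reducer roles (a minimal sketch only, not the paper's actual job: the "userId,itemId,rating" input format and all class names below are assumptions made for this example), the following Hadoop MapReduce program computes each user's mean rating, one of the basic statistics a user-based prediction needs:

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.DoubleWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class UserMeanRating {

      // Mapper: reads each "userId,itemId,rating" line of its input split
      // sequentially and emits (userId, rating).
      public static class RatingMapper
          extends Mapper<LongWritable, Text, Text, DoubleWritable> {
        @Override
        protected void map(LongWritable key, Text value, Context ctx)
            throws IOException, InterruptedException {
          String[] f = value.toString().split(",");  // assumed input layout
          ctx.write(new Text(f[0]), new DoubleWritable(Double.parseDouble(f[2])));
        }
      }

      // Reducer: receives all ratings of one user and outputs the mean.
      public static class MeanReducer
          extends Reducer<Text, DoubleWritable, Text, DoubleWritable> {
        @Override
        protected void reduce(Text user, Iterable<DoubleWritable> vals, Context ctx)
            throws IOException, InterruptedException {
          double sum = 0;
          long n = 0;
          for (DoubleWritable v : vals) { sum += v.get(); n++; }
          ctx.write(user, new DoubleWritable(sum / n));
        }
      }

      public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "user mean rating");
        job.setJarByClass(UserMeanRating.class);
        job.setMapperClass(RatingMapper.class);
        job.setReducerClass(MeanReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(DoubleWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));    // ratings file on HDFS
        FileOutputFormat.setOutputPath(job, new Path(args[1]));  // output directory
        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }

Note that each Mapper reads its HDFS split exactly once and strictly sequentially; this is the access pattern the sequential-access restriction discussed above refers to, and it is why any statistics needed for prediction must be gathered within such single passes.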


