Big data gathering and mining pipelines for CRM using open-source

Kang Li, Neeraj Pradhan, Vinay Deolalikar

doi:10.1109/bigdata.2015.7364128

Abstract

Customer Relationship Management (CRM) is currently the fastest-growing sector of enterprise software, with the market estimated to reach $36.5B worldwide by 2017. CRM technologies increasingly use data mining primitives across multiple applications. At the same time, the growth of big data has led to the evolution of an open source big data software stack (primarily powered by Apache software) that rivals traditional enterprise database (RDBMS) stacks. New technologies such as Kafka, Storm, and HBase have significantly enriched this open source stack, alongside more established technologies such as Hadoop MapReduce and Mahout. Enterprises today must choose which stack will power their big data applications. However, there are no published studies in the literature on enterprise big data pipelines built using open source components supporting CRM. Specific questions that enterprises have include: How is the data processed and analyzed in such pipelines? What are the building blocks of such pipelines? How long does each step of this processing take? In this work, we answer these questions for a large-scale industrial CRM pipeline (serving over 100M customers) that incorporates data mining and serves several applications. Our pipeline has, broadly, two parts. The first is a data gathering part that uses Kafka, Storm, and HBase. The second is a data mining part that uses Mahout and Hadoop MapReduce. We also provide timings for common tasks in the second part, such as data preprocessing for machine learning, clustering, reservoir sampling, and frequent itemset extraction.
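
Of the data mining tasks named in the abstract, reservoir sampling is the easiest to show in a few lines. The sketch below is a single-process Python illustration of the classic technique (Algorithm R), which draws a uniform random sample of k items from a stream whose length is not known in advance. It is not the authors' implementation: the abstract says only that the mining part runs on Mahout and Hadoop MapReduce, and the function name `reservoir_sample`, its parameters, and the `customer-…` record IDs are assumptions made for illustration.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(stream: Iterable[T], k: int,
                     seed: Optional[int] = None) -> List[T]:
    """Uniformly sample up to k items from a stream of unknown length.

    Hypothetical helper illustrating Algorithm R; it makes one pass over
    the stream and keeps at most k items in memory.
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    rng = random.Random(seed)  # local RNG so a seed gives repeatable samples
    reservoir: List[T] = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Item i (0-based) replaces a random slot with probability k/(i+1).
            j = rng.randint(0, i)  # randint is inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir


# Illustrative usage with made-up customer record IDs.
records = (f"customer-{n}" for n in range(100_000))
sample = reservoir_sample(records, k=5, seed=42)
```

Because the input is consumed as a generator, memory use stays proportional to k rather than to the number of records, which is the property that makes the technique attractive for large customer datasets. A distributed version would have to merge per-partition reservoirs, which this sketch does not attempt.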
