Abstract

We study the lambda architecture for hybrid big-data processing, which consolidates low-latency real-time frameworks with high-throughput Hadoop batch frameworks over a massively distributed setup. In particular, the real-time and batch-processing engines act as collaborating autonomous agents in a multi-agent system. We propose a Multi-Agent Lambda Architecture (MALA) for e-commerce data analytics. We address the high latency of Hadoop MapReduce jobs by simultaneously processing, at the speed layer, the requests that require a quick turnaround time. In parallel, the batch layer provides comprehensive coverage of the data by intelligently blending stream and historical data through a weighted voting method. The cold-start problem of streaming services is addressed through an initial offset from historical batch data. The challenges of high-velocity data ingestion are resolved with distributed message queues. A proposed multi-agent decision-maker component is placed in the MALA stack as the gateway of the data pipeline. We demonstrate the efficiency of our batch model by implementing an array of features for an e-commerce site. The novelty and key significance of the model lie in a scheme for multi-agent interaction between batch and real-time agents that produces deeper insights at low latency and at significantly lower cost. Hence, the proposed system is well suited to big-data applications that involve high-velocity streaming ingestion and a massive data pool.
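The blending of stream and historical data by weighted voting, together with the batch-derived starting point for cold-start cases, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `blend_scores`, the `stream_weight` parameter, and the per-item score dictionaries are all assumptions made for illustration.

```python
from typing import Dict


def blend_scores(
    batch: Dict[str, float],
    stream: Dict[str, float],
    stream_weight: float = 0.3,  # hypothetical weight, not taken from the paper
) -> Dict[str, float]:
    """Combine per-item scores from a batch view and a stream view.

    Items seen by both layers get a weighted vote. Items the stream has not
    seen yet (a cold start) fall back on the historical batch score.
    """
    if not 0.0 <= stream_weight <= 1.0:
        raise ValueError("stream_weight must lie in [0, 1]")

    blended: Dict[str, float] = {}
    for item in batch.keys() | stream.keys():
        if item not in stream:
            # Cold start: no real-time signal yet, use the batch value.
            blended[item] = batch[item]
        elif item not in batch:
            # Item is newer than the last batch run, use the stream value.
            blended[item] = stream[item]
        else:
            blended[item] = (
                (1.0 - stream_weight) * batch[item]
                + stream_weight * stream[item]
            )
    return blended
```

A fixed weight is the simplest choice; a real system might instead vary the weight with the age of the last batch run.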

Highlights

  • Big-data Lambda Architecture (LA) attempts to balance high-throughput MapReduce frameworks with low-latency real-time processing

  • We introduce novel concepts of Multi-Agent Lambda Architecture (MALA), bringing unprecedented optimization through blending low-latency stream processing with comprehensive batch processing, which fits the analytical challenges of the e-commerce domain

  • MALA offers key advantages over standalone batch and stream frameworks, including smart collaboration: interactive real-time and batch agents can collaborate, retrain, consolidate, and exchange learning from their individual domains to provide deeper analytical insights


Summary

Introduction

Big-data Lambda Architecture (LA) attempts to balance high-throughput MapReduce frameworks with low-latency real-time processing. Apache Hadoop is the de facto standard batch-processing system, used to provide high-throughput, comprehensive, and more accurate views of historical data. However, it suffers from several drawbacks, such as high latency and larger storage and cluster requirements. We introduce the novel concept of a Multi-Agent Lambda Architecture (MALA), which optimizes processing by blending low-latency stream processing with comprehensive batch processing and fits the analytical challenges of the e-commerce domain. The model analyzes sensor stream data for anomalies at the speed layer, where prompt action is required. A maintenance application, by contrast, analyzes the data in detail to predict failures, which does not require prompt action.
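The split between work that needs prompt action and work that does not can be sketched as a simple routing rule at the pipeline gateway. The sketch below is an illustrative assumption, not the decision-maker described in the paper: the `Request` type, the `max_latency_s` field, and the one-second threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    name: str
    max_latency_s: float  # hypothetical latency budget supplied by the caller


def route(request: Request, speed_threshold_s: float = 1.0) -> str:
    """Send latency-sensitive requests to the speed layer, the rest to batch."""
    if request.max_latency_s <= speed_threshold_s:
        return "speed"
    return "batch"


# Anomaly alerts need a prompt answer; failure prediction can wait for batch.
alert = Request("sensor_anomaly_alert", max_latency_s=0.5)
forecast = Request("failure_prediction", max_latency_s=3600.0)
layers = (route(alert), route(forecast))
```

Here `layers` evaluates to `("speed", "batch")`. A latency budget is only one possible routing signal; data volume or query type could serve the same purpose.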
