Abstract

A major challenge in performing incremental computations on parallel, distributed stream processing systems is implementing a mechanism for passing state values across successive runs. One approach is to coarsen the processing granularity from record-at-a-time to micro-batches. A contrasting approach is to retain record-at-a-time semantics and achieve scalability through distributed state management. Both approaches, however, require a high degree of fault tolerance. In this paper, we study the problem of managing process state for non-terminating data stream workloads in low-latency computing, using the micro-batch stream processing approach. We examine methods that could yield optimal levels of state retention with a high degree of fault tolerance for typical processing workloads, and we propose a three-pronged approach to meet these requirements.
