Abstract

In big data era, fresh data grows rapidly every day. More than 30,000 gigabytes of data are created every second and the rate is accelerating. Many organizations rely heavily on real time streaming, while big data stream computing helps them spot opportunities and risks from real time big data. Storm, one of the most common online stream computing platforms, has been used for big data stream computing, with response time ranging from milliseconds to sub-seconds. The performance of Storm plays a crucial role in different application scenarios, however, few studies were conducted to evaluate the performance of Storm. In this paper, we investigate the performance of Storm under different application scenarios. Our experimental results show that throughput and latency of Storm are greatly affected by the number of instances of each vertex in task topology, and the number of available resources in data center. The fault-tolerant mechanism of Storm works well in most big data stream computing environments. As a result, it is suggested that a dynamic topology, an elastic scheduling framework, and a memory based fault-tolerant mechanism are necessary for providing high throughput and low latency services on Storm platform.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.