Abstract

Due to the extreme growth in information and data, automatic solutions are needed to process the increasing volume of structured and unstructured data. Big data processing systems such as Hadoop, Spark, and Flink now support decision making and process automation in reasonable time. However, parallel systems suffer performance degradation caused by anomalous tasks: when a machine takes an unusually long time to finish a task, overall system throughput drops. To mitigate this effect, speculative execution frameworks re-launch late tasks on alternative nodes. This paper proposes a three-stage framework for speculative execution in big data processing systems. The first stage detects anomalous tasks, the second stage provides a selection policy for backup nodes, and the third stage assesses the efficiency of speculative execution. The proposed framework provides a blueprint for evolving real-world computing systems and offers a comprehensive perspective that can be applied to characterize and design robust, reliable speculative execution mechanisms for big data systems.
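The two core steps the abstract describes, detecting straggler tasks and choosing backup nodes for them, can be illustrated with a minimal sketch. This is not the paper's algorithm; the median-based threshold, the function names, and the idle-node preference are all illustrative assumptions.

```python
import statistics

def find_stragglers(durations, threshold=1.5):
    """Flag tasks whose elapsed runtime exceeds `threshold` times the
    median runtime of all tasks (an assumed, simple straggler criterion)."""
    median = statistics.median(durations.values())
    return [task for task, d in durations.items() if d > threshold * median]

def schedule_backups(stragglers, nodes, busy_nodes):
    """Assign each straggler task a backup node, preferring idle nodes
    (a hypothetical selection policy for the second stage)."""
    idle = [n for n in nodes if n not in busy_nodes]
    return {task: idle[i] for i, task in enumerate(stragglers) if i < len(idle)}

# Task t3 runs far longer than its peers, so it is flagged and
# speculatively re-launched on the first idle node.
durations = {"t1": 10, "t2": 11, "t3": 40, "t4": 9}
stragglers = find_stragglers(durations)
backups = schedule_backups(stragglers, ["n1", "n2", "n3"], busy_nodes={"n1"})
```

In a real system, whichever copy of the task (original or backup) finishes first would be kept and the other killed; the third stage would then compare throughput with and without these speculative copies.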
