Abstract

In multicore systems, much of the checkpointing time overhead can be hidden from the execution critical path by dedicating a checkpointing thread, run concurrently with the regular execution threads, to compressing checkpoint files. Restore time, on the other hand, lies on the critical path and cannot be hidden, so accelerating execution restore upon failures is most important. This work pursues a restore-express (REX) strategy for multi-level checkpointing (MLC), applicable to any incremental checkpointing (IC) scheme. Oblivious to application codes, REX employs adaptive IC (AIC) for local (L1) checkpointing and follows our runtime control for second-level (L2) checkpointing, aiming at express restore from failures while holding down the overall execution time. It takes advantage of two insights for overhead reduction: (1) the pages modified in one incremental checkpoint file are likely to appear again in a subsequent checkpoint file, and (2) many data patterns (on average, some 40 percent of them) stay unchanged from one L2 checkpoint file to the next. These insights enable REX to (1) coalesce IC files (by keeping only the last copy of every dirty page among the files) and (2) boost file compression across multiple L2 checkpoints. Time and storage overhead results of REX during normal job execution are gathered for 16 benchmarks from the SPEC, PARSEC, and NPB suites. The evaluation outcomes for execution restore time confirm that REX is fast, quickening restore by a factor of 4.5 compared with its IC counterpart (which does not utilize the two insights), while incurring the same execution time overhead.
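The coalescing idea, keeping only the last copy of every dirty page across a run of incremental checkpoint files, can be illustrated with a minimal Python sketch. It is not REX's implementation: the representation of a checkpoint as a dictionary from page number to page contents, and the names `PageMap`, `coalesce`, and `restore`, are assumptions made purely for illustration.

```python
from typing import Dict, List

# Assumption for illustration: a checkpoint is a mapping from
# page number to the bytes of that page at checkpoint time.
PageMap = Dict[int, bytes]


def coalesce(increments: List[PageMap]) -> PageMap:
    """Merge incremental checkpoints, given oldest first.

    A page that is dirty in several increments is kept only once,
    with the contents from the most recent increment containing it.
    """
    merged: PageMap = {}
    for inc in increments:
        merged.update(inc)  # later increments overwrite earlier copies
    return merged


def restore(base: PageMap, increments: List[PageMap]) -> PageMap:
    """Rebuild the memory image from a full checkpoint plus increments."""
    memory = dict(base)
    memory.update(coalesce(increments))
    return memory


if __name__ == "__main__":
    incs = [{1: b"a", 2: b"b"}, {2: b"c"}, {1: b"d", 3: b"e"}]
    merged = coalesce(incs)
    total_pages = sum(len(i) for i in incs)
    print(f"pages in increments: {total_pages}, after coalescing: {len(merged)}")
```

In this toy case the three increments hold five page copies in total, while the coalesced file holds three, so a restore reads fewer pages. The more often the same pages are dirtied again in later increments, the larger the saving.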
