The pressing demand to deploy software updates without stopping running programs has fostered much research on live update systems in the past decades. Prior solutions, however, either make strong assumptions on the nature of the update or require extensive and error-prone manual effort, factors which discourage the adoption of live update. This paper presents <i>Mutable Checkpoint-Restart</i> ( <i>MCR</i> ), a new live update solution for generic (multiprocess and multithreaded) server programs written in C. Compared to prior solutions, MCR can support arbitrary software updates and automate most of the common live update operations. The key idea is to allow the running version to safely reach a quiescent state and then allow the new version to restart as similarly to a fresh program initialization as possible, relying on existing code paths to automatically restore the old program threads and reinitialize a relevant portion of the program data structures. To transfer the remaining data structures, MCR relies on a combination of precise and conservative garbage collection techniques to trace all the global pointers and apply the required state transformations on the fly. Experimental results on popular server programs ( <i>Apache httpd</i> , <i>nginx</i> , <i>OpenSSH</i> and <i>vsftpd</i> ) confirm that our techniques can effectively automate problems previously deemed difficult at the cost of negligible performance overhead (2 percent on average) and moderate memory overhead (3.9 <inline-formula><tex-math notation="LaTeX">$\times$ </tex-math></inline-formula> on average, without optimizations).
Read full abstract