Image-based visual hull rendering is a method for generating depth maps of a desired viewpoint from a set of silhouette images captured by calibrated cameras. It does not compute a view-independent data representation, such as a voxel grid or a mesh, which makes it particularly efficient for dynamic scenes. When users are captured, the scene is usually dynamic, but it does not change abruptly because human motion is smooth on a subsecond time scale. Exploiting this temporal coherence to avoid redundant calculations is challenging because of the lack of an explicit data representation. This paper analyzes the image-based visual hull algorithm to identify intermediate information that remains valid over time and is therefore worth making explicit. We then derive methods that exploit this information to improve rendering performance. Our methods reduce the execution time by up to 25 percent, and by up to 50 percent when the user's motions are very slow.
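To make the underlying idea concrete, the following is a minimal sketch, not the paper's implementation, of estimating the visual hull depth along a single viewing ray from calibrated silhouette images. It uses a simple sampling loop as a stand-in for the 2D interval-intersection scheme that image-based visual hull rendering actually performs; the function name, parameters, and the camera/silhouette data layout are illustrative assumptions.

```python
import numpy as np

def ibvh_depth(ray_origin, ray_dir, cameras, silhouettes, t_max=10.0, steps=256):
    """Estimate the visual hull entry depth along one viewing ray (sketch).

    A sample on the ray lies inside the visual hull only if its projection
    falls inside the silhouette of every reference camera. The first such
    depth is returned, or None if the ray misses the hull.
    `cameras` holds 3x4 projection matrices, `silhouettes` binary masks.
    """
    for t in np.linspace(0.0, t_max, steps):
        p = np.append(ray_origin + t * ray_dir, 1.0)   # homogeneous 3D sample
        inside_all = True
        for P, mask in zip(cameras, silhouettes):
            x = P @ p                                   # project into reference view
            if x[2] <= 0:                               # behind the camera
                inside_all = False
                break
            u, v = int(x[0] / x[2]), int(x[1] / x[2])
            h, w = mask.shape
            # Outside the image or outside the silhouette: not in the hull.
            if not (0 <= v < h and 0 <= u < w) or not mask[v, u]:
                inside_all = False
                break
        if inside_all:
            return t                                    # first hull intersection
    return None
```

Repeating this per pixel of the desired view yields the depth map; because no voxel grid or mesh is built, any temporal reuse must operate on such per-ray intermediate results, which is the coherence the paper sets out to exploit.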