A depth image provides partial geometric information about a 3D scene, namely the shapes of physical objects as observed from a particular viewpoint. This information is important when synthesizing images of different virtual camera viewpoints via depth-image-based rendering (DIBR). It has been shown that depth images can be efficiently coded using contour-adaptive codecs that preserve edge sharpness, resulting in visually pleasing DIBR-synthesized images. However, contours are typically losslessly coded as side information, which is expensive when object shapes are complex. In this paper, we pursue a new paradigm in depth image coding for color-plus-depth representation of a 3D scene: in a pre-processing step, we proactively simplify object shapes in a depth and color image pair to reduce depth coding cost, at the cost of a slight increase in synthesized view distortion. Specifically, we first mathematically derive a distortion upper-bound proxy for 3DSwIM, a quality metric tailored to DIBR-synthesized images. This proxy reduces the inter-dependency among pixel rows in a block to ease optimization. We then approximate object contours via a dynamic programming algorithm that optimally trades off the cost of coding contours with arithmetic edge coding against our proposed view synthesis distortion proxy. We modify the depth and color images according to the approximated object contours in an inter-view consistent manner. The modified depth and color images are then coded, respectively, using a contour-adaptive image codec based on the graph Fourier transform for edge preservation and High Efficiency Video Coding (HEVC) intra. Experimental results show that, by maintaining sharp but simplified object contours during contour-adaptive coding, our proposal reduces the depth image coding rate, at the same visual quality of DIBR-synthesized virtual views, by up to 22% when quality is measured by 3DSwIM and by 42% when measured by peak signal-to-noise ratio, compared with alternative coding strategies such as HEVC intra.
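To make the rate-distortion trade-off concrete, the contour approximation step can be sketched as a dynamic program over candidate contour vertices that minimizes a Lagrangian cost, rate plus lambda times distortion. This is only an illustrative sketch, not the paper's algorithm: `segment_error` (sum of squared point-to-chord distances) stands in for the derived view-synthesis distortion proxy, and a fixed `bits_per_vertex` stands in for the actual arithmetic-edge-coding rate; both names and the function `approximate_contour` are assumptions for illustration.

```python
# Hedged sketch: rate-distortion-optimal polyline approximation of an
# object contour via dynamic programming. segment_error is a stand-in
# for the paper's view-synthesis distortion proxy, and bits_per_vertex
# is a stand-in for the arithmetic-edge-coding rate (both assumptions).

def segment_error(pts, i, j):
    """Sum of squared distances from pts[i..j] to the chord pts[i]-pts[j]."""
    (x0, y0), (x1, y1) = pts[i], pts[j]
    dx, dy = x1 - x0, y1 - y0
    norm2 = dx * dx + dy * dy or 1.0  # avoid division by zero
    err = 0.0
    for x, y in pts[i + 1:j]:
        # squared perpendicular distance of (x, y) to the chord
        cross = dx * (y - y0) - dy * (x - x0)
        err += cross * cross / norm2
    return err

def approximate_contour(pts, lam=1.0, bits_per_vertex=8.0):
    """Pick a vertex subset minimizing rate + lam * distortion (O(n^2) DP)."""
    n = len(pts)
    cost = [float("inf")] * n
    back = [0] * n
    cost[0] = bits_per_vertex  # the first vertex is always kept
    for j in range(1, n):
        for i in range(j):
            c = cost[i] + bits_per_vertex + lam * segment_error(pts, i, j)
            if c < cost[j]:
                cost[j], back[j] = c, i
    # backtrack the chosen vertices from the last point to the first
    keep, j = [], n - 1
    while True:
        keep.append(j)
        if j == 0:
            break
        j = back[j]
    return [pts[k] for k in reversed(keep)]

# Collinear points collapse to their endpoints (zero distortion cost),
# while a large lam keeps geometrically important corners.
print(approximate_contour([(0, 0), (1, 0), (2, 0), (3, 0)]))
print(approximate_contour([(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)], lam=100.0))
```

Sweeping `lam` traces the trade-off the abstract describes: small values favor few contour vertices (low rate, simplified shapes), while large values preserve the original contour (high rate, low synthesized-view distortion).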