Traditional depth estimation methods typically exploit the effect of either the variations in internal parameters such as aperture and focus (as in depth from defocus), or variations in extrinsic parameters such as position and orientation of the camera (as in stereo). When operating off-the-shelf (OTS) cameras in a general setting, these parameters influence the depth of field (DOF) and field of view (FOV). While DOF mandates one to deal with defocus blur, a larger FOV necessitates camera motion during image acquisition. As a result, for unfettered operation of an OTS camera, it becomes inevitable to account for pixel motion as well as optical defocus blur in the captured images. We propose a depth estimation framework using calibrated images captured under general camera motion and lens parameter variations. Our formulation seeks to generalize the constrained areas of stereo and shape from defocus (SFD)/focus (SFF) by handling, in tandem, various effects such as focus variation, zoom, parallax and stereo occlusions, all under one roof. One of the associated challenges in such an unrestrained scenario is the problem of removing user-defined foreground occluders in the reference depth map and image (termed inpainting of depth and image). Inpainting is achieved by exploiting the cue from motion parallax to discover (in other images) the correspondence/color information missing in the reference image. Moreover, considering the fact that the observations could be differently blurred, it is important to ensure that the degree of defocus in the missing regions (in the reference image) is coherent with the local neighbours (defocus inpainting).