RGBD cameras that capture both color and depth information have a wide range of applications in robotics, autonomous driving, and consumer electronics. Traditional RGBD imaging usually requires multiple cameras or additional active illumination, which inevitably leads to bulky or complex imaging systems, at odds with the growing demand for compact, integrated optical devices. Optical metasurfaces have emerged as a powerful substitute for traditional diffractive optical elements owing to their superior dispersion manipulation and extremely compact size. However, realizing a single-shot monocular metasurface camera remains a great challenge because of strong, non-negligible wavelength-dependent aberrations. In this work, we demonstrate a compact, single-shot monocular metasurface camera for RGBD imaging. By employing an end-to-end joint optimization framework that learns the metasurface physical structure together with a deep-neural-network reconstruction algorithm, the multidimensional RGBD light field information of a scene can be reconstructed from a single shot over a depth range of 0.5 m. Compared with traditional lens-based RGBD imaging, the proposed metasurface-based imaging achieves an improvement of about 2 dB in chromatic imaging quality and a fourfold increase in depth-estimation accuracy. The proposed scheme could facilitate the further development of intelligent computational meta-optics in diverse fields ranging from machine vision to biomedical imaging.
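
To make the end-to-end joint optimization concrete, the following is a minimal sketch of the general idea: a differentiable optical model with learnable metasurface parameters is chained to a reconstruction network, and both are updated from a single loss. It assumes a simplified far-field PSF model, a toy CNN, and hypothetical names and shapes (e.g., MetasurfaceLayer, ReconstructionNet, 128x128 scenes); the paper's actual metasurface parameterization, propagation model, depth-dependent image formation, and network architecture are not reproduced here.

```python
# Illustrative sketch only: simplified, depth-independent image formation and a toy CNN.
import torch
import torch.nn as nn

class MetasurfaceLayer(nn.Module):
    """Learnable phase profile; one PSF per wavelength via a toy far-field model."""
    def __init__(self, n_pix=128, wavelengths=(0.45e-6, 0.55e-6, 0.65e-6)):
        super().__init__()
        self.phase = nn.Parameter(0.01 * torch.randn(n_pix, n_pix))  # learnable phase map (rad)
        self.register_buffer("wl", torch.tensor(wavelengths))
        self.ref_wl = 0.55e-6

    def psf(self):
        psfs = []
        for wl in self.wl:
            # Wavelength-scaled phase: a crude stand-in for meta-atom dispersion
            field = torch.exp(1j * self.phase * (self.ref_wl / wl))
            spectrum = torch.fft.fftshift(torch.fft.fft2(field))
            p = spectrum.abs() ** 2
            psfs.append(p / p.sum())
        return torch.stack(psfs)  # (3, H, W): one PSF per color channel

def render(scene_rgb, psfs):
    """Convolve each color channel with its PSF (depth dependence omitted for brevity)."""
    H, W = scene_rgb.shape[-2:]
    channels = []
    for c in range(3):
        S = torch.fft.fft2(scene_rgb[:, c])
        K = torch.fft.fft2(torch.fft.ifftshift(psfs[c]), s=(H, W))
        channels.append(torch.fft.ifft2(S * K).real)
    return torch.stack(channels, dim=1)  # simulated sensor image (B, 3, H, W)

class ReconstructionNet(nn.Module):
    """Tiny CNN standing in for the reconstruction network; outputs RGB + depth."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 4, 3, padding=1),  # 3 color channels + 1 depth channel
        )

    def forward(self, x):
        return self.net(x)

# Joint optimization: gradients flow through both the network and the optical model.
meta, recon = MetasurfaceLayer(), ReconstructionNet()
opt = torch.optim.Adam(list(meta.parameters()) + list(recon.parameters()), lr=1e-3)

scene = torch.rand(1, 3, 128, 128)      # placeholder RGB scene
depth_gt = torch.rand(1, 1, 128, 128)   # placeholder ground-truth depth

for step in range(100):
    sensor = render(scene, meta.psf())  # differentiable image formation
    out = recon(sensor)
    rgb_hat, depth_hat = out[:, :3], out[:, 3:]
    loss = nn.functional.mse_loss(rgb_hat, scene) + nn.functional.mse_loss(depth_hat, depth_gt)
    opt.zero_grad(); loss.backward(); opt.step()
```

The key design point illustrated here is that the single loss jointly shapes the learned phase profile and the reconstruction weights, so the optics are optimized for the downstream RGBD recovery task rather than for conventional focusing.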