Multi-sensor data has significant advantages over single-sensor data because each sensor captures different properties of the scene. We find that multi-sensor collaboration saves bits and yields stable compression performance across quantization parameters (QPs). In this paper, we propose a multi-sensor collaboration network for video compression based on wavelet decomposition, called MSCN. We apply MSCN to 3D video coding with color and depth sensors. The images acquired by a color sensor represent the color and texture of the scene, while the images obtained by a depth sensor represent the 3D geometric shape of the scene objects. The two kinds of sensor data are complementary, and color images help to reconstruct their corresponding depth images. First, we uniformly downsample the input depth video. Then, we compress the color video and the downsampled depth video using the 3D-HEVC codec. Finally, we reconstruct the full-resolution depth video from the decoded color and depth videos by color-guided depth super-resolution (SR). Experimental results on the 3D-HEVC test datasets show that, for sampling factors 1, 2 and 4, MSCN achieves average BD-rate changes of {−9.3%, −65.6%, −66.3%} in the Random Access (RA) configuration and {−6.2%, −67.7%, −69.2%} in the All Intra (AI) configuration. These results confirm that multi-sensor collaboration substantially saves bits in video compression.
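To make the sampling and reconstruction steps concrete, the following is a minimal sketch, not MSCN itself. It assumes a single grayscale guide frame with values in [0, 1], and it substitutes classical joint bilateral upsampling for the learned, wavelet-based color-guided SR. The 3D-HEVC coding stage is omitted entirely. The function names and the parameters `radius`, `sigma_s` and `sigma_r` are hypothetical, chosen only for illustration.

```python
import numpy as np


def uniform_downsample(depth, factor):
    """Uniform sampling: keep every `factor`-th depth sample along both axes."""
    return depth[::factor, ::factor]


def joint_bilateral_upsample(depth_lr, guide, factor,
                             radius=1, sigma_s=1.0, sigma_r=0.1):
    """Color-guided upsampling of a low-resolution depth map.

    Classical stand-in for a learned SR network (an assumption of this sketch).
    depth_lr: (h, w) low-resolution depth.
    guide:    (H, W) full-resolution grayscale guide in [0, 1].
    """
    H, W = guide.shape
    h, w = depth_lr.shape
    out = np.zeros((H, W), dtype=np.float64)
    for y in range(H):
        for x in range(W):
            # Position of the output pixel in low-resolution coordinates.
            cy, cx = y / factor, x / factor
            y0 = min(int(round(cy)), h - 1)
            x0 = min(int(round(cx)), w - 1)
            num, den = 0.0, 0.0
            for qy in range(max(0, y0 - radius), min(h, y0 + radius + 1)):
                for qx in range(max(0, x0 - radius), min(w, x0 + radius + 1)):
                    # Guide pixel co-located with the low-resolution sample.
                    gy = min(qy * factor, H - 1)
                    gx = min(qx * factor, W - 1)
                    # Spatial weight: closeness in the low-resolution grid.
                    ws = np.exp(-((qy - cy) ** 2 + (qx - cx) ** 2)
                                / (2 * sigma_s ** 2))
                    # Range weight: similarity in the guide image, so depth
                    # is not averaged across color edges.
                    wr = np.exp(-((guide[y, x] - guide[gy, gx]) ** 2)
                                / (2 * sigma_r ** 2))
                    num += ws * wr * depth_lr[qy, qx]
                    den += ws * wr
            out[y, x] = num / den
    return out


if __name__ == "__main__":
    # Toy scene: a vertical edge shared by the guide and the depth map.
    guide = np.zeros((8, 8))
    guide[:, 4:] = 1.0
    depth = np.where(guide > 0.5, 5.0, 1.0)

    depth_lr = uniform_downsample(depth, 2)   # 4x fewer samples to encode
    depth_sr = joint_bilateral_upsample(depth_lr, guide, 2)
    print(depth_lr.shape, np.abs(depth_sr - depth).max())
```

A sampling factor of s reduces the number of depth samples passed to the encoder by s², while the range weight lets the guide restore sharp depth edges that plain interpolation would blur.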