Objective. Respiration introduces a constant source of irregular motion that poses a significant challenge for the precise irradiation of thoracic and abdominal cancers. Current real-time motion management strategies require dedicated systems that are not available in most radiotherapy centers. We sought to develop a system that estimates and visualizes the impact of respiratory motion in 3D from the 2D images acquired on a standard linear accelerator. Approach. We introduce Voxelmap, a patient-specific deep learning framework that achieves 3D motion estimation and volumetric imaging using the data and resources available in standard clinical settings. We evaluate the framework in a simulation study using imaging data from two lung cancer patients. Main results. Using 2D images as input and 3D–3D Elastix registrations as ground truth, Voxelmap continuously predicted 3D tumor motion with mean errors of 0.1 ± 0.5, −0.6 ± 0.8, and 0.0 ± 0.2 mm along the left–right, superior–inferior, and anterior–posterior axes, respectively. Voxelmap also predicted 3D thoracoabdominal motion with mean errors of −0.1 ± 0.3, −0.1 ± 0.6, and −0.2 ± 0.2 mm along the same axes. Moreover, volumetric imaging was achieved with a mean absolute error of 0.0003, a root-mean-squared error of 0.0007, a structural similarity of 1.0, and a peak signal-to-noise ratio of 65.8. Significance. The results of this study demonstrate the feasibility of 3D motion estimation and volumetric imaging during lung cancer treatments on a standard linear accelerator.
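For readers unfamiliar with the reported quantities, the following is a minimal sketch of how such metrics are commonly computed. It is not the authors' evaluation code. The function names `volume_metrics` and `signed_error_stats` are hypothetical, voxel intensities are assumed to be normalized to the range [0, 1], and the ± values are assumed to denote the standard deviation of the signed error. Structural similarity is omitted because it requires a windowed implementation.

```python
import numpy as np


def volume_metrics(pred, ref, data_range=1.0):
    """Voxel-wise image-similarity metrics between two volumes.

    Assumption: intensities are normalized so that `data_range` is the
    largest possible intensity difference (1.0 for data in [0, 1]).
    """
    pred = np.asarray(pred, dtype=np.float64)
    ref = np.asarray(ref, dtype=np.float64)
    diff = pred - ref
    mae = np.mean(np.abs(diff))      # mean absolute error
    rmse = np.sqrt(np.mean(diff ** 2))  # root-mean-squared error
    # Peak signal-to-noise ratio in dB; infinite for identical volumes.
    psnr = np.inf if rmse == 0 else 20.0 * np.log10(data_range / rmse)
    return {"mae": mae, "rmse": rmse, "psnr": psnr}


def signed_error_stats(pred_mm, ref_mm):
    """Per-axis mean and standard deviation of the signed motion error.

    Inputs are arrays of shape (n_frames, 3) holding displacements in mm
    along three axes (hypothetical order: LR, SI, AP).
    """
    err = np.asarray(pred_mm, dtype=np.float64) - np.asarray(ref_mm, dtype=np.float64)
    return err.mean(axis=0), err.std(axis=0)
```

Reporting a signed mean together with its spread, rather than only an absolute error, shows whether the predictions are systematically biased along an axis as well as how much they vary from frame to frame.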