Volumetric reconstruction of magnetic resonance imaging (MRI) from sparse samples is desirable for 3D motion tracking and promises to improve the precision of magnetic resonance (MR)-guided radiation treatment. Data-driven sparse MRI reconstruction, however, requires large-scale training datasets for prior learning, which are time-consuming and challenging to acquire in clinical settings. We therefore investigated volumetric reconstruction of MRI from sparse samples of two orthogonal slices, aided by sparse priors from two static 3D MRI volumes through implicit neural representation learning with prior embedding (NeRP), in support of 3D motion tracking during MR-guided radiotherapy.

A multi-layer perceptron (MLP) network was trained to parameterize the NeRP model of a patient-specific MRI dataset: the network takes 4D data coordinates of voxel location and motion state as input and outputs the corresponding voxel intensity. By first training the network to learn the NeRP of two static 3D MRI volumes at different breathing motion states, prior information about patient breathing motion was embedded into the network weights through optimization. This prior information was then augmented from two motion states to 31 motion states by querying the optimized network at interpolated and extrapolated motion-state coordinates. Starting from the prior-augmented NeRP model as an initialization point, we further trained the network to fit sparse samples of two orthogonal MRI slices, and the final volumetric reconstruction was obtained by querying the trained network at 3D spatial locations. We evaluated the proposed method on 5-minute volumetric MRI time series with 340 ms temporal resolution from seven patients with hepatocellular carcinoma, acquired with a golden-angle radial MRI sequence and reconstructed through retrospective sorting. Two volumetric MRI volumes, one at inhale and one at exhale, were selected from the first 30 s of each time series for prior embedding and augmentation.
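The coordinate-to-intensity mapping at the core of this approach can be sketched as a small fully connected network queried on a grid of 4D coordinates. This is a minimal illustration, not the study's implementation: the layer sizes, initialization, and function names (`init_mlp`, `nerp_forward`) are our own assumptions.

```python
import numpy as np

def init_mlp(layer_sizes, seed=0):
    """Randomly initialize MLP weights (illustrative; in the paper these
    weights are optimized to fit the patient-specific MRI data)."""
    rng = np.random.default_rng(seed)
    params = []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        w = rng.normal(0.0, np.sqrt(2.0 / n_in), size=(n_in, n_out))
        b = np.zeros(n_out)
        params.append((w, b))
    return params

def nerp_forward(params, coords):
    """Map 4D coordinates (x, y, z, motion state) to voxel intensities."""
    h = coords
    for w, b in params[:-1]:
        h = np.maximum(h @ w + b, 0.0)  # ReLU hidden layers
    w, b = params[-1]
    return h @ w + b  # linear output: one intensity per coordinate

# Query the network on an 8x8x8 spatial grid at one motion-state coordinate;
# sweeping the fourth coordinate would yield interpolated motion states.
params = init_mlp([4, 64, 64, 1])
coords = np.stack(np.meshgrid(
    np.linspace(0, 1, 8), np.linspace(0, 1, 8),
    np.linspace(0, 1, 8), [0.5], indexing="ij"), axis=-1).reshape(-1, 4)
intensities = nerp_forward(params, coords)  # shape (512, 1)
```

Because the network is a continuous function of its inputs, querying it at motion-state coordinates between (or beyond) the two trained states is what enables the prior augmentation from 2 to 31 motion states.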
The remaining 4.5-minute time series was used to evaluate volumetric reconstruction: we retrospectively subsampled each MRI volume to two orthogonal slices and compared model-reconstructed images to ground-truth images in terms of image quality and the capability of supporting 3D target motion tracking. Across the seven patients, the peak signal-to-noise ratio (PSNR) between model-reconstructed and ground-truth MR images was 38.02 ± 2.60 dB and the structural similarity index measure (SSIM) was 0.98 ± 0.01. Throughout the 4.5-minute period, gross tumor volume (GTV) motion estimated by deforming a reference-state MRI to the model-reconstructed and ground-truth MRI showed good consistency. The 95th-percentile Hausdorff distance between GTV contours was 2.41 ± 0.77 mm, smaller than the voxel dimension. The mean GTV centroid position difference between ground truth and model estimation was less than 1 mm in all three orthogonal directions.

A prior-augmented NeRP model has been developed to reconstruct volumetric MRI from sparse samples of orthogonal cine slices. Only one exhale and one inhale 3D MRI volume were needed to train the model to learn prior information about patient breathing motion for sparse image reconstruction. The proposed model has the potential to support 3D motion tracking during MR-guided radiotherapy for improved treatment precision, and promises a major simplification of the workflow by eliminating the need for large-scale training datasets.
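Two of the reported evaluation quantities can be sketched as follows. The helper names (`psnr`, `centroid_shift`), the use of the ground-truth maximum as the PSNR peak value, and the binary-mask representation of the GTV are our illustrative assumptions, not details taken from the study.

```python
import numpy as np

def psnr(recon, truth):
    """Peak signal-to-noise ratio in dB, assuming truth.max() as peak value."""
    mse = np.mean((recon.astype(float) - truth.astype(float)) ** 2)
    return 10.0 * np.log10(float(truth.max()) ** 2 / mse)

def centroid_shift(mask_a, mask_b, voxel_size_mm):
    """Per-axis centroid difference in mm between two binary GTV masks."""
    ca = np.argwhere(mask_a).mean(axis=0)  # centroid in voxel indices
    cb = np.argwhere(mask_b).mean(axis=0)
    return (ca - cb) * np.asarray(voxel_size_mm, dtype=float)
```

Evaluating these per reconstructed volume and averaging over the time series would yield summary statistics of the kind reported above (mean ± standard deviation across patients).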