Forest stock volume (FSV) is a key indicator of forest quality and forest management capability, and a main factor in evaluating forest carbon sequestration. To estimate FSV accurately, we used data from the Ninth Beijing Forest Inventory (FID) together with Landsat 8 OLI and Sentinel-2 MSI imagery to build FSV models. We compared the performance of Landsat 8 and Sentinel-2 imagery in estimating forest volume in Huairou District, Beijing, China, and also combined the two satellite data sets to create a new data source. Two variable selection methods, linear stepwise regression (LSR) and recursive feature elimination (RFE), were used to extract feature variables. Multiple linear regression (MLR), back propagation (BP) neural network, and random forest (RF) models were then used to estimate forest volume in the study area from the feature variables obtained from both data sources. The results indicate the following: (1) the Sentinel-2-based models achieved higher accuracy than the same models based on the Landsat 8 factor set. The red-edge band of Sentinel-2 imagery is more significantly correlated with FSV than the other feature variables used, and variables derived from the red-edge band have the potential to reduce model errors; (2) selecting remote sensing feature variables with the RFE method significantly improves model estimation accuracy. RFE ranks all feature variables by importance and retains those that contribute the most to the model. In the variable group selected by RFE, the texture features and the features derived from the red-edge band, such as SenB5, SenRVI, SenmNDVIre, and SenB5Mean, contribute the most to the improvement in model accuracy.
Furthermore, in the optimal Landsat 8–Sentinel-2 RFE-RF model, which includes texture features, the rRMSE is reduced by 3.7% compared to the joint remote sensing RFE-RF model without texture features; (3) the MLR, BP, and RF models built on the Sentinel-2 modeling factor set are more accurate than those built on the Landsat 8 modeling factor set. Among them, the RF model with variables selected by RFE from Sentinel-2A imagery has the best inversion accuracy (R² = 0.831, RMSE = 12.604 m³ ha⁻¹, rRMSE = 36.411%, MAE = 9.366 m³ ha⁻¹). On the test set, the models rank as follows: RF > BP neural network > MLR. Feature variable screening with the RF-based RFE method outperforms LSR. Therefore, the RFE-RF method, applied to the joint variables from Landsat 8 and Sentinel-2 as a new remote sensing data source, offers a way to improve the estimation accuracy of FSV and provides a reference for forest dynamic monitoring.
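As an illustration of the RFE-RF workflow and the four accuracy measures reported above, the following is a minimal sketch using scikit-learn. It is not the authors' implementation: the synthetic data, the 70/30 split, the number of trees, the number of retained variables, and the helper names `evaluate` and `rfe_rf` are all assumptions made for the example. The rRMSE is assumed here to be the RMSE expressed as a percentage of the mean observed value, which is a common definition but is not stated in the text.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import RFE
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split


def evaluate(y_true, y_pred):
    """Return R2, RMSE, rRMSE and MAE for a set of predictions.

    rRMSE is assumed to be RMSE as a percentage of the mean observed value.
    """
    rmse = float(np.sqrt(mean_squared_error(y_true, y_pred)))
    return {
        "R2": float(r2_score(y_true, y_pred)),
        "RMSE": rmse,
        "rRMSE": 100.0 * rmse / float(np.mean(y_true)),
        "MAE": float(mean_absolute_error(y_true, y_pred)),
    }


def rfe_rf(X, y, n_keep, seed=0):
    """Select n_keep variables with RF-based RFE, then fit and test an RF model."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=seed  # hypothetical 70/30 split
    )
    # RFE repeatedly fits the forest and drops the least important variable.
    selector = RFE(
        RandomForestRegressor(n_estimators=50, random_state=seed),
        n_features_to_select=n_keep,
        step=1,
    )
    selector.fit(X_train, y_train)
    # Refit a forest on the retained variables only and score it on the test set.
    model = RandomForestRegressor(n_estimators=50, random_state=seed)
    model.fit(selector.transform(X_train), y_train)
    y_pred = model.predict(selector.transform(X_test))
    return selector.support_, evaluate(y_test, y_pred)


# Synthetic stand-in for plot-level remote sensing variables (not real data):
# 200 plots, 10 candidate variables, of which two actually drive the response.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = 40.0 + 8.0 * X[:, 0] - 5.0 * X[:, 3] + rng.normal(scale=2.0, size=200)

selected, scores = rfe_rf(X, y, n_keep=4)
print("retained variables:", np.flatnonzero(selected))
print(scores)
```

In practice the columns of `X` would be the spectral, vegetation index, and texture variables extracted from the imagery, and `y` the plot-level FSV from the inventory data. An LSR-MLR or BP baseline could be scored with the same `evaluate` helper to make the comparison consistent.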