In this study, visible-near-infrared (VIS-NIR) hyperspectral imaging was combined with a data fusion strategy for the nondestructive assessment of starch content in intact potatoes. Spectral data were extracted from the hyperspectral images and textural data from the transformed principal component (PC) images, and partial least squares regression (PLSR) prediction models were then established. The results revealed that low-level data fusion did not improve the accuracy of starch content prediction. To improve prediction accuracy, key variables were therefore selected from the spectral data through competitive adaptive reweighted sampling (CARS) and from the textural data through correlation analysis, and mid-level data fusion was performed on the selected variables. With a residual predictive deviation (RPD) greater than 2, the PLSR model built on the mid-level fused data achieved satisfactory prediction accuracy. This study thus demonstrated that appropriate data fusion can effectively improve the prediction accuracy for starch content and can aid the sorting of potatoes by starch content on the production line.
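For readers unfamiliar with the terms, the following is a minimal sketch of what mid-level fusion and the RPD criterion usually mean; it is not the authors' implementation. It assumes that mid-level fusion amounts to concatenating, sample by sample, only the selected variables of each data block, and that RPD is the standard deviation of the reference values divided by the root mean square error of prediction, which is the common definition but is not spelled out in the abstract. The function names `fuse_mid_level` and `rpd`, the index lists, and all numbers are hypothetical toy values, not data from the study; the CARS and correlation-based selection steps are represented only by their outputs (column indices).

```python
import math
from statistics import mean, stdev


def fuse_mid_level(spectral, textural, spectral_idx, textural_idx):
    """Concatenate only the selected columns of each block, row by row.

    spectral_idx / textural_idx stand in for the variables a selection
    step (e.g. CARS or correlation analysis) would have kept.
    """
    if len(spectral) != len(textural):
        raise ValueError("both blocks must describe the same samples")
    return [
        [s_row[i] for i in spectral_idx] + [t_row[j] for j in textural_idx]
        for s_row, t_row in zip(spectral, textural)
    ]


def rpd(y_ref, y_pred):
    """Assumed definition: sample SD of reference values / RMSE of prediction."""
    rmsep = math.sqrt(mean((r - p) ** 2 for r, p in zip(y_ref, y_pred)))
    return stdev(y_ref) / rmsep


# Toy example (made-up numbers): two samples, three spectral and two
# textural variables, of which two spectral and one textural are kept.
spectral = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
textural = [[1.0, 2.0], [3.0, 4.0]]
fused = fuse_mid_level(spectral, textural, spectral_idx=[0, 2], textural_idx=[1])

# Toy reference and predicted starch values for the RPD criterion.
y_ref = [14.0, 16.0, 18.0, 20.0]
y_pred = [14.5, 15.5, 18.5, 19.5]
meets_threshold = rpd(y_ref, y_pred) > 2
```

Low-level fusion would, under the same assumption, correspond to passing every column index of both blocks, so the regression model sees all raw variables rather than a selected subset.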