Covariance of the orbital state of a resident space object (RSO) is a necessary requirement for various space situational awareness tasks, like the space collision warning. It describes an accuracy envelope of the RSO's location. However, in current space surveillance, the tracking data of an individual RSO is often found insufficiently accurate and sparsely distributed, making the predicted covariance (PC) derived from the tracking data and classical orbit dynamic system usually unrealistic in describing the error characterization of orbit predictions. Given the fact that the tracking data of an RSO from a single station or a fixed network share a similar temporal and spatial distribution, the evolution of PC could share a hidden relationship with that data distribution. This study proposes a novel method to generate accurate PC by combining the classical covariance propagation method and the data-driven approach. Two popular machine learning algorithms are applied to model the inconsistency between the orbit prediction error and the PC from historical observations, and then this inconsistency model is used for the future PC. Experimental results with the Swarm constellation satellites demonstrate that the trained Random Forest models can capture more than 95% of the underlying inconsistency in a tracking scenario of sparse observations. More importantly, the trained models show great generalization capability in correcting the PC of future epochs and other RSOs with similar orbit characteristics and observation conditions. Besides, a deep analysis of generalization performance is carried out to describe the temporal and spatial similarities of two data sets, in which the Jaccard similarity is used. It demonstrates that the higher the Jaccard similarity is, the better the generalization performance will be, which may be used as a guide to whether to apply the trained models of a satellite to other satellites. Further, the generalization performance is also evaluated by the classical Cramer von Misses test, which also shows that trained models have encouraging generalization performance.