Abstract Deep saline aquifers are strongly heterogeneous under natural conditions, and this heterogeneity affects the migration of carbon dioxide (CO2) injected into the reservoir. Characterizing the heterogeneity of the rock mass is therefore important for understanding how CO2 migrates during CO2 storage. This paper proposes a method for constructing heterogeneous models according to whether sufficient data are available. Three models are established: a wholly heterogeneous model for cases with sufficient data, a deterministic multifacies heterogeneous model obtained by simplifying the structure through lithofacies classification, and a random multifacies heterogeneous model derived from a known formation using transfer probability theory. Numerical simulations are carried out to study the migration of CO2 injected into these three models. The results show that CO2 migration in heterogeneous deep saline aquifers exhibits pronounced fingering flow, which reflects the physical process of CO2 storage. The migration of CO2 in the deterministic multifacies heterogeneous model is similar to that in the wholly heterogeneous model, which indicates that simplifying the wholly heterogeneous structure to a lithofacies classification structure is suitable for simulating the CO2 storage process. The random multifacies heterogeneous model based on transfer probability theory is consistent with the way sedimentary formations develop and can be used to evaluate CO2 migration in unknown heterogeneous formations. In addition, a comparison of the CO2 dry-out effect in the different heterogeneous models shows that the multifacies characterization method weakens the influence of this effect in small-scale studies, because the model is locally homogenized; refining the grid and subdividing the lithofacies in key local areas is necessary to eliminate this error.
The results provide practical references and suggestions for heterogeneous modeling of areas with missing data and for the simplification of large-scale heterogeneous models.
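The idea behind the random multifacies model, deriving an unknown formation from a known one through transfer (transition) probabilities, can be illustrated with a minimal one-dimensional sketch. It is not the authors' implementation: the facies codes, the example log `known_log`, and the functions `estimate_transition_matrix` and `simulate_facies` are all hypothetical, and a simple first-order Markov chain along a single direction stands in for the full multidimensional approach.

```python
import random


def estimate_transition_matrix(sequence, n_facies):
    """Count facies-to-facies transitions between adjacent cells and normalize each row."""
    counts = [[0] * n_facies for _ in range(n_facies)]
    for current, following in zip(sequence, sequence[1:]):
        counts[current][following] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        if total:
            matrix.append([c / total for c in row])
        else:
            # Assumption: a facies never seen as a starting state gets a uniform row.
            matrix.append([1.0 / n_facies] * n_facies)
    return matrix


def simulate_facies(matrix, n_cells, start, seed=0):
    """Draw one random facies sequence from a first-order Markov chain."""
    rng = random.Random(seed)
    states = list(range(len(matrix)))
    facies = [start]
    for _ in range(n_cells - 1):
        # The next facies depends only on the current one.
        facies.append(rng.choices(states, weights=matrix[facies[-1]])[0])
    return facies


# Hypothetical facies log from a known formation: 0 = sand, 1 = silt, 2 = shale.
known_log = [0, 0, 0, 1, 1, 0, 0, 2, 2, 2, 1, 1, 0, 0, 0, 2, 2, 1, 0, 0]

P = estimate_transition_matrix(known_log, n_facies=3)
realization = simulate_facies(P, n_cells=50, start=known_log[0], seed=42)
```

Each realization respects the transitions observed in the known log: a facies pair that never occurs there (here silt followed directly by shale) has zero probability and never appears in the simulated sequence. Changing `seed` gives a different, equally plausible realization.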