Constructing accurate three-dimensional (3D) geological models is crucial for understanding subsurface structures and their evolution, particularly in complex regions such as the South China Sea (SCS). This study introduces a novel approach that integrates multimodal deep learning with multipoint statistics (MPS) to build a high-resolution 3D model of the crustal P-wave velocity structure of the SCS. The method addresses the limitations of traditional algorithms in capturing non-stationary geological features and incorporates heterogeneous data from multiple geophysical sources, including 44 wide-angle seismic crustal structure profiles acquired with ocean bottom seismometers (OBSs), gravity anomalies, magnetic anomalies, and topographic data. We compare the proposed model with existing methods, namely Kriging interpolation and MPS alone, and it reconstructs both the global and the local spatial features of the crustal structure more accurately. Integrating the diverse datasets improves the model’s accuracy, reducing errors and improving agreement with known geological information. The resulting 3D model provides a detailed representation of the SCS crust that can inform studies of tectonic evolution, resource exploration, and geodynamic processes. This work highlights the potential of combining deep learning with geostatistical methods for geological modeling and provides a framework for future applications in the geosciences. The flexibility of the approach also suggests that it can be applied to other regions and geological attributes, supporting more comprehensive, data-driven investigations of Earth’s subsurface.