Nonlinear variations in the molecular and isotopic compositions of phases in complex geosystems greatly hinder the application of geochemical proxies. This study aims to disentangle the implicit nonlinear mathematical structures embedded in geochemical datasets, and thereby to separate the overlapping geological influences that drive the intricate variations in the geochemical signatures of these phases. Using a typical hybrid petroleum system as a case study, we apply an unsupervised machine learning algorithm to visualize how source differences and distinct evolutionary processes (mixing, thermal maturation, biodegradation, and evaporative fractionation) affect the molecular compositions of crude oils. We further investigate the regression relationship between molecular composition and the bulk δ¹³C signal of petroleum. Our findings show that when the regression model is decomposed so that it reflects only a single dominant influence, it can provide a precise geological interpretation. On this basis, we unravel the subtle variations in, and the underlying mechanisms of, carbon isotopic fractionation during maturation in petroleum of different origins. Our results underscore the substantial potential of strategically applied machine learning techniques for reconstructing the geochemical evolution of complex geosystems and support their broader application.