Background
Pulmonary embolism (PE) is life-threatening and requires timely and accurate diagnosis, yet current imaging methods, such as computed tomography pulmonary angiography, have limitations, particularly for patients with contraindications to iodinated contrast agents. We aimed to develop a quantitative texture analysis pipeline using machine learning (ML) based on non-contrast thoracic computed tomography (CT) scans to discover intensity and textural features correlated with regional lung perfusion (Q) physiology and pathology, and to synthesize voxel-wise Q surrogates to assist in PE diagnosis.

Methods
We retrospectively collected 99mTc-labeled macroaggregated albumin Q-SPECT/CT scans from patients suspected of PE, including an internal dataset of 76 patients (64 for training, 12 for testing) and an external testing dataset of 49 patients. Quantitative CT features were extracted from segmented lung subregions and passed through a two-stage feature selection pipeline. The prior-knowledge-driven preselection stage screened for robust and non-redundant perfusion-correlated features, while the data-driven selection stage further filtered features by fitting ML models for classification. The final classification model, trained with the highest-performing PE-associated feature combination, was evaluated in the testing cohorts using the area under the curve (AUC) for subregion-level predictability. The voxel-wise Q surrogate was then synthesized using the final selected feature maps (FMs) and model score maps (MSMs) to investigate spatial distributions. The Spearman correlation coefficient (SCC) and Dice similarity coefficient (DSC) were used to assess the spatial consistency between FMs or MSMs and Q-SPECT scans.

Results
The optimal model achieved an AUC of 0.863 in internal testing and 0.828 in the external testing cohort. The model identified a combination of 14 intensity and textural features that were non-redundant, robust, and capable of distinguishing between high- and low-functional lung regions. Spatial consistency assessment in the internal testing cohort showed moderate-to-high agreement between MSMs and reference Q-SPECT scans, with a median SCC of 0.66 and median DSCs of 0.86 and 0.64 for high- and low-functional regions, respectively.

Conclusions
This study validated the feasibility of using quantitative texture analysis and a data-driven ML pipeline to generate voxel-wise lung perfusion surrogates, providing a radiation-free, widely accessible alternative to functional lung imaging in managing pulmonary vascular diseases.

Clinical trial number
Not applicable.
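
As an illustration of the spatial-consistency assessment described in the Methods, the following is a minimal sketch of how an SCC and a pair of DSCs might be computed between a model score map and a Q-SPECT volume. It is not the study's implementation. The function names, the lung-mask input, and especially the rule for splitting voxels into high- and low-functional regions (here, a per-map percentile threshold) are assumptions made for the example; the abstract does not state how those regions were defined.

```python
import numpy as np


def average_ranks(values):
    """Rank a 1-D array from 1..n, giving tied values their average rank."""
    values = np.asarray(values).ravel()
    _, inverse, counts = np.unique(values, return_inverse=True, return_counts=True)
    last_rank = np.cumsum(counts)        # highest rank inside each tie group
    first_rank = last_rank - counts + 1  # lowest rank inside each tie group
    return ((first_rank + last_rank) / 2.0)[inverse]


def spearman_cc(a, b):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return float(np.corrcoef(average_ranks(a), average_ranks(b))[0, 1])


def dice(mask_a, mask_b):
    """Dice similarity coefficient between two boolean masks."""
    mask_a = np.asarray(mask_a, dtype=bool)
    mask_b = np.asarray(mask_b, dtype=bool)
    total = mask_a.sum() + mask_b.sum()
    if total == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return float(2.0 * np.logical_and(mask_a, mask_b).sum() / total)


def compare_maps(score_map, perfusion, lung_mask, percentile=50.0):
    """Return (SCC, DSC_high, DSC_low) for voxels inside the lung mask.

    ASSUMPTION: each map is split into high/low-functional voxels at its own
    `percentile` value; this threshold rule is hypothetical.
    """
    lung_mask = np.asarray(lung_mask, dtype=bool)
    s = np.asarray(score_map, dtype=float)[lung_mask]
    q = np.asarray(perfusion, dtype=float)[lung_mask]
    high_s = s >= np.percentile(s, percentile)
    high_q = q >= np.percentile(q, percentile)
    return spearman_cc(s, q), dice(high_s, high_q), dice(~high_s, ~high_q)
```

Called as `compare_maps(msm, q_spect, lung_mask)` on co-registered arrays of identical shape, the function returns one SCC and two DSC values per patient, which could then be summarized across a cohort (for example, by the median).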