This study aims to identify high-emission vehicles for urban traffic management. Existing high-emission vehicle identification models typically overlook the dispersed distribution of anomalous data, and machine learning anomaly detection methods have difficulty learning decision boundaries under such conditions, which reduces detection performance. To address these issues, this study proposes a semi-supervised learning method based on data fusion and deep anomaly detection. The method integrates chassis and engine dynamometer test data with on-road remote sensing system data to obtain more comprehensive vehicle emission information. It also introduces a penalty mechanism for anomaly samples, encouraging the model to increase the dissimilarity between normal and abnormal data at the level of the latent data distribution. Experimental results on vehicle emission datasets from different regions show that the method achieves over 95% AUC, demonstrating strong applicability and accuracy.
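To make the idea of a penalty on anomaly samples concrete, the following is a minimal sketch of one common way such a penalty can be written: a Deep SAD-style objective in which normal or unlabeled samples are pulled toward a latent center while labeled anomalies are penalized by the inverse of their distance to it. The abstract does not state the exact loss used, so the function names, the `center`, `eta`, and `eps` parameters, and the choice of an inverse-distance penalty are all assumptions made for illustration.

```python
from typing import Sequence


def squared_distance(z: Sequence[float], c: Sequence[float]) -> float:
    """Squared Euclidean distance between a latent vector z and a center c."""
    return sum((zi - ci) ** 2 for zi, ci in zip(z, c))


def semi_supervised_loss(
    latents: Sequence[Sequence[float]],
    labels: Sequence[int],
    center: Sequence[float],
    eta: float = 1.0,   # hypothetical weight of the anomaly penalty
    eps: float = 1e-6,  # avoids division by zero
) -> float:
    """Assumed Deep SAD-style loss (not confirmed as the paper's objective).

    labels: 0 = normal or unlabeled sample, 1 = labeled anomaly.
    Normal samples contribute their squared distance to the center;
    labeled anomalies contribute eta / (distance + eps), which is large
    when an anomaly sits close to the center and small when it is far.
    """
    total = 0.0
    for z, y in zip(latents, labels):
        d = squared_distance(z, center)
        total += eta / (d + eps) if y == 1 else d
    return total / len(latents)


def anomaly_score(z: Sequence[float], center: Sequence[float]) -> float:
    """Score a sample by its squared distance to the center (higher = more anomalous)."""
    return squared_distance(z, center)
```

Under these assumptions, minimizing the loss compacts normal data around the center and pushes labeled anomalies away from it, which is one way to enlarge the dissimilarity between the two groups in latent space. In practice the latent vectors would come from a trained encoder over the fused emission features, which this sketch leaves out.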