The study aimed to compare the diagnostic efficacy of the machine learning models with expert subjective assessment (SA) in assessing the malignancy risk of ovarian tumors using transvaginal ultrasound (TVUS). The retrospective single-center diagnostic study included 1555 consecutive patients from January 2019 to May 2021. Using this dataset, Residual Network(ResNet), Densely Connected Convolutional Network(DenseNet), Vision Transformer(ViT), and Swin Transformer models were established and evaluated separately or combined with Cancer antigen 125 (CA 125). The diagnostic performance was then compared with SA. Of the 1555 patients, 76.9% were benign, while 23.1% were malignant (including borderline). When differentiating the malignant from ovarian tumors, the SA had an AUC of 0.97 (95% CI, 0.93-0.99), sensitivity of 87.2%, and specificity of 98.4%. Except for Vision Transformer, other machine learning models had diagnostic performance comparable to that of the expert. The DenseNet model had an AUC of 0.91 (95% CI, 0.86-0.95), sensitivity of 84.6%, and specificity of 95.1%. The ResNet50 model had an AUC of 0.91 (0.85-0.95). The Swin Transformer model had an AUC of 0.92 (0.87-0.96), sensitivity of 87.2%, and specificity of 94.3%. There was a statistically significant difference between the Vision Transformer and SA, and between the Vision Transformer and Swin Transformer models (AUC: 0.87 vs. 0.97, P = 0.01; AUC: 0.87 vs. 0.92, P = 0.04). Adding CA125 did not improve the diagnostic performance of the models in distinguishing benign and malignant ovarian tumors. The deep learning model of TVUS can be used in ovarian cancer evaluation, and its diagnostic performance is comparable to that of expert assessment.