Obstructive sleep apnea (OSA) is closely associated with the development and chronicity of temporomandibular disorder (TMD). Given the intricate pathophysiology of both conditions, comprehensive diagnostic approaches are crucial. This study aimed to develop an automatic prediction model that uses multimodal data to diagnose OSA among TMD patients. We collected multimodal data, including clinical characteristics, portable polysomnography, X-ray, and MRI data, from 55 TMD patients who reported sleep problems, and analyzed these data using machine learning techniques. Three-dimensional VGG16 and logistic regression models were used to identify significant predictors. Approximately 53% (29 of 55) of the TMD patients had OSA. Prediction performance was evaluated using logistic regression and multilayer perceptron models, with accuracy and area under the curve (AUC) as metrics. Prediction accuracy for OSA in TMD patients ranged from 80.00% to 91.43%. When MRI data were added to the model, the AUC increased to 1.00, indicating excellent discriminative capability. Only the obstructive apnea index was a statistically significant predictor of OSA in TMD patients, with a threshold of 4.25 events/h. The features learned by the convolutional neural network were visualized as heatmaps using gradient-weighted class activation mapping (Grad-CAM), revealing that the network focuses on different anatomical structures depending on the presence or absence of OSA. In OSA-positive cases, the nasopharynx, oropharynx, uvula, larynx, epiglottis, and brain region were highlighted, whereas in OSA-negative cases, the tongue, nose, nasal turbinate, and hyoid bone were highlighted. The prediction accuracy and heatmap analyses support the plausibility and usefulness of this artificial intelligence-based model for diagnosing and predicting OSA in TMD patients and provide a deeper understanding of the anatomical regions that distinguish OSA from non-OSA cases.
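To make the reported metrics concrete, the following is a minimal sketch of how a single-predictor threshold rule, accuracy, and AUC can be computed. It is not the study's pipeline. The 4.25 events/h cut-off comes from the abstract, but the function names (`predict_osa`, `auc_score`, `accuracy`), the choice of `>=` at the cut-off, and the toy values are assumptions made purely for illustration and are not study data.

```python
from typing import Sequence

# Cut-off for the obstructive apnea index reported in the abstract (events/h).
OAI_THRESHOLD = 4.25


def predict_osa(oai: float, threshold: float = OAI_THRESHOLD) -> bool:
    """Flag a case as OSA-positive when the index reaches the threshold.

    Treating a value exactly at the threshold as positive is an assumption;
    the abstract does not say which side of the cut-off is inclusive.
    """
    return oai >= threshold


def auc_score(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive case outscores a
    random negative case, counting ties as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def accuracy(labels: Sequence[int], predictions: Sequence[bool]) -> float:
    """Fraction of cases where the predicted class matches the label."""
    correct = sum(int(y == int(p)) for y, p in zip(labels, predictions))
    return correct / len(labels)


# Hypothetical toy values for illustration only (NOT data from the study).
toy_oai = [0.5, 2.0, 4.0, 4.5, 7.2, 12.0]
toy_labels = [0, 0, 1, 0, 1, 1]

toy_preds = [predict_osa(x) for x in toy_oai]
toy_auc = auc_score(toy_labels, toy_oai)
toy_acc = accuracy(toy_labels, toy_preds)
```

An AUC of 1.00, as reported after adding MRI data, corresponds to every OSA-positive case receiving a higher score than every OSA-negative case. Accuracy additionally depends on where the decision threshold is placed.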