Background
Distinguishing between middle ear cholesteatoma and chronic suppurative otitis media (CSOM) is an ongoing challenge. While temporal bone computed tomography (CT) is highly accurate for diagnosing middle ear conditions, its specificity in distinguishing cholesteatoma from CSOM is only moderate. To address this issue, we trained machine learning models to enhance the specificity of temporal bone CT in diagnosing middle ear cholesteatoma. Our database consisted of native temporal bone CT images from 122 patients diagnosed with middle ear cholesteatoma and a control group of 115 patients diagnosed with CSOM, with both groups labeled based on surgical findings. We preprocessed the native images to isolate the region of interest and then used the Inception V3 convolutional neural network to embed each image as a data vector. Classification was performed using four machine learning models: support vector machine (SVM), k-nearest neighbors (k-NN), random forest, and neural network. Performance was assessed using classification accuracy, precision, recall, F1 score, the confusion matrix, the area under the receiver operating characteristic curve (AUC), and a FreeViz diagram.

Results
Our training dataset comprised 5390 images, and the testing dataset included 125 different images. The neural network, k-NN, and SVM models performed significantly better than the random forest model in terms of classification accuracy, precision, and recall. For instance, the F1 scores were 0.974, 0.987, and 0.897, respectively, for the former three models, in contrast to 0.661 for the random forest model.

Conclusion
The performance metrics of the presented trained machine learning models suggest that they are promising as potentially clinically useful diagnostic aids.
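
For readers less familiar with the reported metrics, the following minimal sketch shows how classification accuracy, precision, recall, and F1 score can be derived from a binary confusion matrix. It is an illustration under stated assumptions, not the study's implementation: the `ConfusionMatrix` class is a hypothetical helper, cholesteatoma is assumed to be the positive class, and the abstract does not say whether the reported scores were computed per class or averaged across classes. The counts in the example are made up and are not the study's results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """Binary confusion matrix; cholesteatoma is assumed to be the positive class."""

    tp: int  # cholesteatoma images predicted as cholesteatoma
    fp: int  # CSOM images predicted as cholesteatoma
    fn: int  # cholesteatoma images predicted as CSOM
    tn: int  # CSOM images predicted as CSOM

    def accuracy(self) -> float:
        # Fraction of all images that were classified correctly.
        total = self.tp + self.fp + self.fn + self.tn
        return (self.tp + self.tn) / total if total else 0.0

    def precision(self) -> float:
        # Of the images predicted as cholesteatoma, the fraction that truly are.
        denom = self.tp + self.fp
        return self.tp / denom if denom else 0.0

    def recall(self) -> float:
        # Of the true cholesteatoma images, the fraction that were detected.
        denom = self.tp + self.fn
        return self.tp / denom if denom else 0.0

    def f1(self) -> float:
        # Harmonic mean of precision and recall.
        p, r = self.precision(), self.recall()
        return 2 * p * r / (p + r) if (p + r) else 0.0


# Hypothetical counts for illustration only; these are NOT the study's results.
example = ConfusionMatrix(tp=8, fp=2, fn=2, tn=8)
print(example.accuracy(), example.precision(), example.recall(), example.f1())
```

Because F1 is the harmonic mean of precision and recall, a model scores well only when both are high, which is why it serves as a compact single-number summary when comparing the four classifiers.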