Abstract

BACKGROUND Metastatic dissemination via the cerebrospinal fluid contributes to the poor prognosis of many medulloblastoma patients. Because such dissemination has high prognostic relevance, its accurate detection is critical for adequate therapy stratification. However, cytological examination is difficult, time-consuming, and frequently unreliable. We therefore propose a machine-learning pipeline to address this diagnostic challenge.

METHODS A dataset of 303 digitized images of cerebrospinal fluid preparations from 81 medulloblastoma patients was assembled. From these images, over 50,000 digitized objects, including more than 5,000 patches with tumor cells, were classified by experts into 13 clinically relevant diagnostic categories and used for training. Classification was performed using convolutional neural networks trained with nested cross-validation and different transfer learning strategies.

RESULTS Classification into the 13 diagnostic categories was feasible, with an overall accuracy of 75% on the test set and a tumor-specific F1 score of 89%. Most misclassifications occurred between morphologically similar cell types, mainly lymphocytes and activated lymphocytes, monocytes and activated monocytes, and tumor cells and atypical cells with borderline cytomorphology. Merging these cytologically and/or biologically closely related categories increased the overall accuracy to 91%. Pretraining on published cytological datasets of cerebrospinal fluid and bone marrow further increased classification accuracy.

CONCLUSIONS Deep learning can reliably detect tumor cells in digitized cytological images of cerebrospinal fluid preparations in medulloblastoma. Implementing explanation methods, providing an easy-to-use user interface, and validating on additional cohorts will further strengthen the robustness of the presented approach.
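The three evaluation quantities named in the results (overall accuracy, a class-specific F1 score, and accuracy after merging closely related categories) can be illustrated with a minimal sketch. It is not the authors' code: the label names, the `GROUPS` mapping, and the toy labels are hypothetical and chosen only to show why merging confusable categories raises accuracy. They do not reproduce the study's data or results.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_for_class(y_true, y_pred, label):
    """F1 score (harmonic mean of precision and recall) for one class."""
    tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
    fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
    fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def merge_labels(labels, groups):
    """Map each label to its merged group; unlisted labels stay unchanged."""
    return [groups.get(label, label) for label in labels]


# Hypothetical mapping of fine-grained labels to merged groups (assumption).
GROUPS = {
    "activated_lymphocyte": "lymphocyte",
    "activated_monocyte": "monocyte",
    "atypical_cell": "tumor_cell",
}

# Toy labels for illustration only, not data from the study.
y_true = ["lymphocyte", "activated_lymphocyte", "monocyte", "activated_monocyte",
          "tumor_cell", "tumor_cell", "atypical_cell", "erythrocyte"]
y_pred = ["activated_lymphocyte", "activated_lymphocyte", "monocyte", "monocyte",
          "tumor_cell", "tumor_cell", "tumor_cell", "erythrocyte"]

fine_accuracy = accuracy(y_true, y_pred)
tumor_f1 = f1_for_class(y_true, y_pred, "tumor_cell")
merged_accuracy = accuracy(merge_labels(y_true, GROUPS),
                           merge_labels(y_pred, GROUPS))

print(f"fine-grained accuracy: {fine_accuracy:.3f}")
print(f"tumor-cell F1:         {tumor_f1:.3f}")
print(f"merged accuracy:       {merged_accuracy:.3f}")
```

In this toy example every error is a confusion within one of the merged pairs, so all errors disappear after merging. With real predictions, only the within-group confusions are removed and errors between groups remain.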