Abstract

Despite the relative ease of locating organs in the human body, automated organ segmentation has been hindered by the scarcity of labeled training data. Due to the tedium of labeling organ boundaries, most datasets are limited to either a small number of cases or a single organ. Furthermore, many are restricted to specific imaging conditions unrepresentative of clinical practice. To address this need, we developed a diverse dataset of 140 CT scans containing six organ classes: liver, lungs, bladder, kidney, bones and brain. For the lungs and bones, we expedited annotation using unsupervised morphological segmentation algorithms, which were accelerated by 3D Fourier transforms. Demonstrating the utility of the data, we trained a deep neural network which requires only 4.3 s to simultaneously segment all the organs in a case. We also show how to efficiently augment the data to improve model generalization, providing a GPU library for doing so. We hope this dataset and code, available through TCIA, will be useful for training and evaluating organ segmentation models.
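The Fourier acceleration mentioned above rests on the convolution theorem: a binary dilation (or erosion) can be computed as a convolution of the mask with its structuring element, followed by a threshold. The sketch below illustrates that general technique in 3D with NumPy and SciPy; it is an illustrative example only, and the function name, spherical structuring element, and radius parameter are our own assumptions rather than the paper's implementation.

    # Illustrative sketch only: 3D binary dilation via FFT-based convolution.
    # The function name and parameters are assumptions, not the paper's code.
    import numpy as np
    from scipy.fft import fftn, ifftn

    def fft_dilate(mask, radius):
        """Dilate a 3D binary mask with a spherical structuring element,
        using the convolution theorem instead of a spatial-domain sweep."""
        # Spherical structuring element of the requested radius (in voxels).
        r = int(np.ceil(radius))
        zz, yy, xx = np.mgrid[-r:r + 1, -r:r + 1, -r:r + 1]
        ball = (zz ** 2 + yy ** 2 + xx ** 2) <= radius ** 2

        # Pad to the full linear-convolution size to avoid circular wrap-around.
        shape = [m + b - 1 for m, b in zip(mask.shape, ball.shape)]
        f_mask = fftn(mask.astype(np.float32), shape)
        f_ball = fftn(ball.astype(np.float32), shape)

        # Multiply in the Fourier domain, invert, and threshold: any voxel
        # reached by the structuring element becomes foreground.
        conv = np.real(ifftn(f_mask * f_ball))
        dilated = conv > 0.5

        # Crop the centre region back to the input shape.
        crop = tuple(slice(r, r + n) for n in mask.shape)
        return dilated[crop]

With this approach the cost is dominated by the transforms, so it grows with the volume size rather than with the structuring-element radius, which is what makes large 3D structuring elements practical.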

Highlights

  • Machine learning has the potential to transform healthcare by automating those tasks which defy mathematical description

  • Fully-convolutional neural networks (FCNs) have far surpassed what was previously thought possible in semantic segmentation across all imaging domains[1,2]

  • We describe our experiments validating the suitability of our data for deep learning tasks

Background & Summary

Machine learning has the potential to transform healthcare by automating those tasks which defy mathematical description. However, with limited training data these models will overfit, failing to generalize to unseen examples. This is especially important for medical images, which are often expensive and laborious to annotate, even when the task is intuitive. Gibson et al. provide 90 labeled cases, comprising data from the Beyond the Cranial Vault and Pancreas-CT datasets[4,6,8]. These annotations are manually cropped around a specific region of the abdomen. In contrast, our data exhibit a wide variety of imaging conditions, collected from various medical centers, to ensure generalizability of the trained models. To our knowledge, this is the largest publicly available multi-organ dataset. Our experiments show that the dataset suffices to train a deep neural network, which outperforms the morphological algorithms from which it was trained.
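That comparison presupposes a voxel-overlap metric; the Dice similarity coefficient is the standard choice for organ segmentation. Below is a minimal sketch of a per-organ Dice computation, assuming integer label maps with 0 as background and 1-6 as the organ classes; the label convention and function name are illustrative assumptions, not details from the paper.

    # Illustrative sketch only: per-organ Dice similarity coefficient.
    # Label convention (0 = background, 1..6 = organs) is an assumption.
    import numpy as np

    def dice_per_organ(pred, truth, num_classes=6):
        """Return a Dice score for each foreground label in two label maps."""
        scores = {}
        for label in range(1, num_classes + 1):
            p = pred == label
            t = truth == label
            denom = p.sum() + t.sum()
            # If an organ is absent from both volumes, count it as perfect.
            scores[label] = 1.0 if denom == 0 else 2.0 * np.logical_and(p, t).sum() / denom
        return scores

Calling dice_per_organ(network_labels, reference_labels) then yields one score per organ, with values near 1 indicating close agreement.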
