CEM500K, a large-scale heterogeneous unlabeled cellular electron microscopy image dataset for deep learning.

Ryan Conrad, Kedar Narayan

doi:10.7554/elife.65894

Abstract

Automated segmentation of cellular electron microscopy (EM) datasets remains a challenge. Supervised deep learning (DL) methods that rely on region-of-interest (ROI) annotations yield models that fail to generalize to unrelated datasets. Newer unsupervised DL algorithms require relevant pre-training images; however, pre-training on currently available EM datasets is computationally expensive and shows little value for unseen biological contexts, as these datasets are large and homogeneous. To address this issue, we present CEM500K, a nimble 25 GB dataset of 0.5 × 10^6 unique 2D cellular EM images curated from nearly 600 three-dimensional (3D) and 10,000 two-dimensional (2D) images from >100 unrelated imaging projects. We show that models pre-trained on CEM500K learn features that are biologically relevant and resilient to meaningful image augmentations. Critically, we evaluate transfer learning from these pre-trained models on six publicly available and one newly derived benchmark segmentation task and report state-of-the-art results on each. We release the CEM500K dataset, pre-trained models and curation pipeline for model building and further expansion by the EM community. Data and code are available at https://www.ebi.ac.uk/pdbe/emdb/empiar/entry/10592/ and https://git.io/JLLTz.

Journal: eLife	Publication Date: Apr 8, 2021
Citations: 29	License type: CC0 1.0

Similar Papers

CEM500K – A large-scale heterogeneous unlabeled cellular electron microscopy image dataset for deep learning.
Ryan Conrad ... Kedar Narayan
Microscopy and Microanalysis | VOL. 27
30 Jul 2021

Scalable and interactive segmentation and visualization of neural processes in EM datasets.
Won-Ki Jeong ... Hanspeter Pfister
IEEE transactions on visualization and computer graphics | VOL. 15
01 Nov 2009

EM-stellar: benchmarking deep learning for electron microscopy image segmentation.
Afshin Khadangi ... Thomas Boudier
Bioinformatics (Oxford, England) | VOL. 37
08 Jan 2021

Segmentation in large-scale cellular electron microscopy with deep learning: A literature survey.
Anusha Aswath ... Ben N.G Giepmans
Medical Image Analysis | VOL. 89
01 Oct 2023
