Abstract

Deep learning models for image segmentation have significantly improved the accuracy of pixel-wise labeling of scientific images, which is critical for many quantitative image analyses. This improvement has been made possible by U-Net and related convolutional neural network architectures. Although these models have been widely adopted, their training data pools and hyperparameters have mostly been chosen by educated guesses and trial and error. In this study, we present observations of how training data volume, data augmentation, and patch size affect deep learning performance within a limited data set. We study U-Net model training on four different samples of x-ray CT images of fiber-reinforced composites. Because the training process is not deterministic, we replicated each experimental condition seven times to avoid under-sampling and to observe the variance in model training. Unsurprisingly, we find that greater training data volume strongly benefits individual models’ final accuracy and learning speed while reducing variance among replicates. Importantly, data augmentation has a profound benefit to model performance, especially when ground truth is scarce, and we conclude that high coefficients of data augmentation should be used in semantic segmentation models for scientific imaging. Future work to describe and measure image complexity is warranted and is likely to ultimately guide researchers on the minimum training data volume required for particular scientific imaging deep learning tasks.

