Abstract

Deep learning applications are advancing rapidly in seismic processing and interpretation tasks. However, most approaches subsample data volumes and restrict model sizes to minimize computational requirements. Subsampling the data risks losing vital spatiotemporal information that could aid training, whereas restricting model sizes can impact model performance or, in some extreme cases, render more complicated tasks such as segmentation impossible. We have determined how to tackle the two main issues in training large neural networks (NNs): memory limitations and impractically long training times. Typically, training data are preloaded into memory prior to training, a particular challenge for seismic applications in which the data format is commonly four times larger than that used for standard image processing tasks (float32 versus uint8). Based on an example from microseismic monitoring, we evaluate how more than 750 GB of data can be used to train a model via a data generator approach, which stores in memory only the data required for the current training batch. Furthermore, efficient training of large models is illustrated through the training of a seven-layer U-Net with input data dimensions of [Formula: see text] (approximately [Formula: see text] million parameters). Through a batch-splitting distributed training approach, training times are reduced by a factor of four. The combination of data generators and distributed training removes any need for data subsampling or restriction of NN sizes, offering the opportunity to use larger networks and higher resolution input data, or to move from 2D to 3D problem spaces.
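
To make the data generator idea concrete, the following is a minimal sketch in Python using the Keras Sequence API. The paper does not publish its code, so the framework choice, the class name SeismicBatchGenerator, and the one-.npy-file-per-sample storage layout are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
import tensorflow as tf


class SeismicBatchGenerator(tf.keras.utils.Sequence):
    """Reads samples from disk on demand, one batch at a time,
    so the full (e.g., 750 GB) dataset never has to fit in memory."""

    def __init__(self, sample_paths, label_paths, batch_size=8):
        super().__init__()
        self.sample_paths = sample_paths  # paths to input .npy files (assumed layout)
        self.label_paths = label_paths    # paths to matching label .npy files
        self.batch_size = batch_size

    def __len__(self):
        # Number of batches per epoch.
        return int(np.ceil(len(self.sample_paths) / self.batch_size))

    def __getitem__(self, idx):
        lo = idx * self.batch_size
        hi = lo + self.batch_size
        # Only this batch's files are loaded into memory.
        x = np.stack([np.load(p) for p in self.sample_paths[lo:hi]])
        y = np.stack([np.load(p) for p in self.label_paths[lo:hi]])
        return x.astype("float32"), y.astype("float32")


# Usage: model.fit(SeismicBatchGenerator(x_paths, y_paths), epochs=10)
```

Because each call to __getitem__ touches only one batch of files, peak memory use is set by the batch size rather than by the dataset size.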
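The batch-splitting distributed training can be sketched in the same spirit. The snippet below uses TensorFlow's MirroredStrategy, in which every batch is split evenly across the visible GPUs and gradients are averaged before each update; this is an assumed setup for illustration, and the tiny stand-in network is hypothetical, not the paper's seven-layer U-Net.

```python
import tensorflow as tf

# One model replica per visible GPU; each batch is split evenly
# across replicas and gradients are averaged before each update.
strategy = tf.distribute.MirroredStrategy()

# Keep the per-GPU batch size fixed and grow the global batch
# with the number of replicas, so adding GPUs shortens an epoch.
per_gpu_batch = 4
global_batch = per_gpu_batch * strategy.num_replicas_in_sync

with strategy.scope():
    # Tiny stand-in network; the paper's seven-layer U-Net would
    # be constructed here instead.
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(128, 128, 1)),
        tf.keras.layers.Conv2D(8, 3, padding="same", activation="relu"),
        tf.keras.layers.Conv2D(1, 1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy")

# model.fit(train_generator, epochs=10)  # e.g., the generator sketched above
```

With four replicas, each device processes a quarter of every batch, which is consistent with the roughly fourfold reduction in training time reported in the abstract.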
