Abstract

We present improvements over our previous approach to automatic winter hydrometeor classification by means of convolutional neural networks (CNNs), using more data and improved training techniques to achieve higher accuracy on a more complicated dataset than we had previously demonstrated. As an advancement of our previous proof-of-concept study, this work demonstrates the broader usefulness of deep CNNs by using a substantially larger and more diverse dataset, which we make publicly available, drawn from many more snow events. We describe the collection, processing, and sorting of this dataset of over 25,000 high-quality multiple-angle snowflake camera (MASC) image chips, split nearly evenly between five geometric classes: aggregate, columnar crystal, planar crystal, graupel, and small particle. Raw images were collected over 32 snowfall events between November 2014 and May 2016 near Greeley, Colorado, and were processed with an automated cropping and normalization algorithm to yield 224 × 224 pixel images containing possible hydrometeors. From the bulk set of over 8,400,000 extracted images, a smaller dataset of 14,793 images was sorted by image quality and recognizability (Q&R) using manual inspection. A presorting network trained on the Q&R dataset was then applied to all 8,400,000+ images to automatically collect a subset of 283,351 good snowflake images. Roughly 5,000 representative examples were then collected manually from this subset for each of the five geometric classes. With a greater emphasis on in-class variety than in our previous work, the final dataset yields trained networks that better capture the imperfect cases and diverse forms that occur within the broad categories studied, achieving an accuracy of 96.2% on a vastly more challenging dataset.