Abstract

While convective storm mode is explicitly depicted in convection-allowing model (CAM) output, subjectively diagnosing mode in large volumes of CAM forecasts can be burdensome. In this work, four machine learning (ML) models were trained to probabilistically classify CAM storms into one of three modes: supercells, quasi-linear convective systems, and disorganized convection. The four ML models were a dense neural network (DNN), logistic regression (LR), a convolutional neural network (CNN), and a semisupervised CNN–Gaussian mixture model (GMM). The DNN, CNN, and LR were trained with a set of hand-labeled CAM storms, while the semisupervised GMM used updraft helicity and storm size to generate clusters, which were then hand labeled. When evaluated using storms withheld from training, the four classifiers had similar ability to discriminate between modes, but the GMM had worse calibration. The DNN and LR had similar objective performance to the CNN, suggesting that CNN-based methods may not be needed for mode classification tasks. The mode classifications from all four classifiers successfully approximated the known climatology of modes in the United States, including a maximum in supercell occurrence in the U.S. Central Plains. Further, the modes occurred in environments recognized to support the three different storm morphologies. Finally, storm mode provided useful information about hazard type (e.g., storm reports were most likely with supercells), further supporting the efficacy of the classifiers. Future applications, including the use of objective CAM mode classifications as a novel predictor in ML systems, could potentially lead to improved forecasts of convective hazards.

Significance Statement

Whether a thunderstorm produces hazards such as tornadoes, hail, or intense wind gusts is determined in part by whether the storm takes the form of a single cell or a line. Numerical forecasting models can now provide forecasts that depict this structure. We tested several automated machine learning algorithms for extracting this information from forecast output. All of the automated methods were able to distinguish between a set of three convective types, with the simple techniques providing classifications about as skillful as those from the complex approaches. The automated classifications also successfully discriminated between thunderstorm hazards, potentially leading to new forecast tools and better forecasts of high-impact convective hazards.