Abstract

As artificial intelligence (AI) methods are increasingly used to develop new guidance intended for operational use by forecasters, it is critical to evaluate whether forecasters deem the guidance trustworthy. Past trust-related AI research suggests that certain attributes (e.g., understanding how the AI was trained, interactivity, and performance) contribute to users perceiving the AI as trustworthy. However, little research has examined the role of these and other attributes for weather forecasters. In this study, we conducted 16 online interviews with National Weather Service (NWS) forecasters to examine (i) how they make guidance use decisions and (ii) how the AI technique used, the model's training, input variables, performance, and developers, as well as interaction with the model output, influenced their assessments of the trustworthiness of new guidance. The interviews pertained to either a random forest model predicting the probability of severe hail or a 2D convolutional neural network model predicting the probability of storm mode. Taken as a whole, our findings illustrate that forecasters’ assessment of AI guidance trustworthiness is a process that occurs over time rather than automatically or at first introduction. We recommend that developers center end users when creating new AI guidance tools, making end users integral to their thinking and efforts. This approach is essential for the development of useful and used tools. The details of these findings can help AI developers understand how forecasters perceive AI guidance and can inform AI development and refinement efforts.

Significance Statement

We used a mixed-methods quantitative and qualitative approach to understand how National Weather Service (NWS) forecasters 1) make guidance use decisions within their operational forecasting process and 2) assess the trustworthiness of prototype guidance developed using artificial intelligence (AI). Taken as a whole, our findings illustrate that forecasters’ assessment of AI guidance trustworthiness is a process that occurs over time rather than automatically, and they suggest that developers must center the end user when creating new AI guidance tools to ensure that the developed tools are useful and used.