U.S. National Weather Service (NWS) forecasters assess and communicate hazardous weather risks, including the likelihood of a threat and its impacts. Convection-allowing model (CAM) ensembles offer potential to aid forecasting by depicting atmospheric outcomes, including associated uncertainties, at the refined space and time scales at which hazardous weather often occurs. Little is known, however, about what CAM ensemble information is needed to inform forecasting decisions. To address this knowledge gap, participant observations and semistructured interviews were conducted with NWS forecasters from national centers and local weather forecast offices. Data were collected about forecasters’ roles and their forecasting processes, uses of model guidance and verification information, interpretations of prototype CAM ensemble products, and needs for information from CAM ensembles. Results revealed forecasters’ needs for specific types of CAM ensemble guidance, including a product that combines deterministic and probabilistic output from the ensemble as well as a product that provides map-based guidance about timing of hazardous weather threats. Forecasters also expressed a general need for guidance to help them provide impact-based decision support services. Finally, forecasters conveyed needs for objective model verification information to augment their subjective assessments and for training about using CAM ensemble guidance for operational forecasting. The research was conducted as part of an interdisciplinary research effort that integrated elicitation of forecasters’ CAM ensemble needs with model development efforts, with the aim of illustrating a robust approach for creating information for forecasters that is truly useful and usable.
When there is a risk of hazardous weather, U.S. National Weather Service (NWS) forecasters characterize and communicate the potential threat and its impacts with the fundamental goal to reduce harm. Forecasters draw on their meteorological expertise and on-the-job experiences, assess available observations, and interpret deterministic and ensemble numerical weather prediction (NWP) guidance (Murphy and Winkler 1971a,b; Roebber and Bosart 1996a,b; Bosart 2003; Doswell 2004; Roebber et al. 2004; Morss and Ralph 2007; Novak et al. 2008). NWP guidance available to forecasters has evolved tremendously over the last several decades due to advances in computing capabilities, understanding of meteorological processes, observational datasets, data assimilation, model parameterizations, and postprocessing techniques (see Benjamin et al. 2019 for a review). One major development has been convection-allowing models (CAMs), which resolve finescale spatial and temporal processes, including more accurate depictions of convection and its evolution (Weisman et al. 2008) and of orographically influenced processes (Mass et al. 2002; Schwartz 2014; Gowan et al. 2018). The first CAM became operational in the United Kingdom in 2002, and operational implementation expanded in the subsequent years, including to the United States in 2007 (Benjamin et al. 2019, see their Tables 13–7 and 13–8). The years since then have seen development of CAM ensembles, which explicitly characterize uncertainty of weather hazards, and which offer potential to help forecasters assess and communicate weather risks (Roebber et al. 2004; Novak et al. 2008; Kain et al. 2013; Stensrud et al. 2013; Rothfusz et al. 2018; Benjamin et al. 2019).
Translating CAM ensemble output into useful and usable information is especially important given the new emphasis on NWS forecasters providing impact-based decision support services (IDSS). In this role, forecasters “connect forecasts and warnings to decisions made,” and they “emphasize expert interpretation, consultation, and communication of forecasts and their impacts” (NWS 2019, p. 7). These forecaster responsibilities focus particularly on supporting NWS “core partners,” which include members of the emergency management communities, water resources communities, other government partners, and electronic media (NWS 2018a). Forecasters’ use of cutting-edge science and technology to help their users make better decisions is not a new concept (Stuart et al. 2006; Novak et al. 2008). However, CAM ensemble development, the capabilities it offers, and NWS’s focus on IDSS create a new context for operations-to-research and research-to-operations (O2R/R2O) efforts (Jirak et al. 2010; Kain et al. 2013; Evans et al. 2014; Sobash et al. 2016; Gallo et al. 2016; Clark 2017; Greybush et al. 2017; Schwartz et al. 2019; Wilson et al. 2019).
With this new context comes the new—or perhaps more accurately, renewed—need to develop and provide model guidance that NWS forecasters can readily use to help them characterize and convey hazardous weather threats and impacts, including associated uncertainties. As Roebber et al. (2004) explained, “where high resolution model data are available, it is critical that resources be devoted to improving the use of the information rather than simply increasing the supply. The output from such models must be tailored to the needs of the forecasters” (p. 941, emphasis in original). These user-oriented ideas reflect the tenets of risk communication research wherein iterative dialogue with users to understand their decision space—including their goals, values, barriers, needs, experiences, and other factors—is essential for developing information that is useful to them (NRC 1989; Fischhoff 1995; Árvai 2014; Árvai and Campbell-Árvai 2014). This approach recognizes that users’ decision-making context is complex and that risk information is a factor, not the only factor, in managing risk. Risk information that considers this multifaceted decision context can then be developed accordingly.
This risk communication approach underpins the goal of the social science research presented here, which is to understand NWS forecasters’ IDSS-focused decision contexts in order to identify their needs for new and improved CAM ensemble information. Our research builds on a foundation of past work that has investigated public- and private-sector operational meteorologists’ forecast processes, information use and interpretations, and needs (Murphy and Winkler 1971a,b; Stewart et al. 1997; Doswell 2004; Homar et al. 2006; Morss and Ralph 2007; Novak et al. 2008; Demeritt et al. 2010; Daipha 2012, 2015; Evans et al. 2014; Wilson et al. 2019).
For our research, we collected in-depth, qualitative data from NWS forecasters at national forecast centers and local weather forecast offices (WFOs) using two methods: participant observations and semistructured interviews. The data collection focused on forecasters’ processes and decisions, including observations and model guidance used for forecasting and communication with partners. In addition, we developed prototype plots of different CAM ensemble output to represent, hypothetically and conceptually, the kinds of information that could be derived, and we elicited forecasters’ feedback on them. The guiding research questions of the work presented here are as follows:
What are NWS forecasters’ key forecast challenges and information needs?
What CAM and CAM ensemble guidance do NWS forecasters interrogate for different hazardous weather types and scenarios? How do they interpret and use the different guidance?
What CAM ensemble guidance do NWS forecasters want for assessing and communicating different hazardous weather types and scenarios, particularly based on their partners’ needs?
In what ways do NWS forecasters think about the skill of CAM ensemble guidance?
How can the knowledge gained by investigating the above research questions inform development of CAM ensemble information?
The path of O2R and R2O in the United States is iterative and involves multiple steps that evolve from initial foundational work to early conceptual prototyping to experimental testing to full deployment and operationalization (NOAA 2017). At the time of the interviews, no CAM ensemble guidance was available through NWS forecasters’ Advanced Weather Interactive Processing System (AWIPS) workstations,1 which is an NWS requirement for it to be deemed operational. Experimental CAM ensemble guidance was available through web-based platforms from different U.S. research laboratories and universities. R2O evaluations of products, which include different CAM ensemble output, often are done with forecasters at the experimental phase through NWS testbeds (Barthold et al. 2015; Gallo et al. 2016, 2017; Clark et al. 2012; Wilson et al. 2019). The research reported here represents an earlier part of the O2R/R2O path, where foundational work with forecasters was conducted iteratively alongside CAM ensemble model development efforts, including early conceptual prototyping of model output.
CAM ensemble development, capabilities, and use are multifaceted topics that involve interconnected issues ranging from ensemble system design, calibration, and limitations (Roebber et al. 2004; Benjamin et al. 2019) to philosophies about the forecaster’s role in an increasingly automated environment (Snellman 1977; Bosart 2003; Stuart et al. 2006, 2007; Novak et al. 2014; Henderson 2019). The research conducted here was designed to be agnostic to the CAM ensemble prediction system, so that the results could be applied to the NWS’s future operational system. Thus, this paper does not address the design or merits of one CAM ensemble system versus another, nor does it advocate what the role of an NWS forecaster ought to be with respect to model output as the “final forecast” versus “as guidance” (Novak et al. 2008, p. 1079, emphasis in original). Furthermore, this paper does not advocate the use of CAM ensembles in preference to other available information or in a given forecast situation. Rather, the purpose of this research is to recognize that CAM ensemble information is being developed for operational forecast use and to help guide development of information that has the potential to be most useful based on forecasters’ perspectives. The results synthesize forecasters’ feedback on and needs for types of CAM ensemble products (section 3), for information relevant to their IDSS roles (section 4), and for model verification and training to make use of the guidance (section 5). The needs that emerge are not all immediately viable for operational implementation, yet establishing them can be useful for guiding future R2O efforts.
a. Research design and data collection
The multimethod social science research approach employed here reflects the iterative nature of the project. In the first year of the project, data were collected from forecasters through a qualitative research method termed “participant observations.” The lead author was in the forecast environment to unobtrusively watch the forecast process (including information interrogated, communications, products issued, and so forth) and to occasionally ask follow-up questions about what was observed (Cresswell 2013; Merriam and Tisdell 2016). Seven randomly chosen days of observation were conducted of forecast operations at two NWS national forecast centers. In addition, 10 days of observation of the forecast process were conducted during two NWS Hydrometeorology Testbed experiments during which experimental ensemble guidance from coarse- and convection-allowing models was used and evaluated (NOAA 2020). The testbed observations were conducted across three weeks of the testbed experiments, which are held when the weather phenomena of interest climatologically occur. The participant observations were conducted between October 2015 and July 2016, and more than 20 forecasters were observed. Real-time and reflexive field notes were taken, and some observation periods were audio recorded for later reference.
The participant observations provided knowledge about the existing and experimental model guidance that forecasters examine and use across different forecast scenarios. This knowledge then guided semistructured interviews with WFO forecasters, which were conducted in the second year of the project.
Semistructured interviews include a set of open-ended questions that serve as a guide to elicit information, but they offer the interviewer flexibility to ask follow-up questions to delve deeper into a topic (Cresswell 2013; Merriam and Tisdell 2016). The interview guide, which simply is “a list of questions that you intend to ask in an interview” (Merriam and Tisdell 2016, p. 124), was developed collaboratively among the research team based on what was learned from the participant observations and based on ideas about CAM ensemble information that could be developed.
Forecasters first were asked in the interview to provide background about their job position and core duties. Then, they were asked to select from among severe weather,2 winter weather, or heavy rainfall and flash flooding as a hazardous weather focus for the interview, and they were asked to think about forecasting in the short term, from 0 to 24 h out. With these weather and timeframe foci, forecasters were asked questions about 1) their forecast processes, including a synthesis of observational and model data used; 2) details of coarse-scale ensemble, convection-allowing deterministic, and experimental CAM ensemble products accessed and used; 3) their ideas and needs for CAM ensemble guidance, including different parameters, thresholds, verification, and other information about the model or output, for different types of weather scenarios; 4) their interpretations of, feedback on, and potential use of six prototype CAM ensemble products, discussed in the next paragraph; and 5) their processes of and needs for communicating hazardous weather information with their partners. The interview guide is available from the authors upon request.
Prototype products were utilized in the interview in order to test ideas about information that could be derived from a CAM ensemble. Our aim with the prototypes was not that these versions would become operationalized and used by forecasters, but rather that they were initial product ideas to meet forecasters’ needs that could be further developed if found to be potentially useful. Because the research presented here was early in the O2R/R2O path (see introduction), the prototypes were model agnostic and were mock creations. Thus, the prototypes were not applicable to forecast operations on the day of the interview.
Six types of products, along with short descriptions of each, were developed by the NOAA/Earth System Research Laboratory (ESRL) research team and provided during the interviews. The products were map-based plots of the following:
point probabilities of exceeding a threshold,
neighborhood probabilities of exceeding a threshold in a 40-km radius,
paintballs (or paint splats) of where ensemble members exceed a threshold,
a “combination” of the ensemble control run and the 10th, 50th, and 90th percentile neighborhood probability contours,
mean onset time, showing the ensemble mean hour of the day when a threshold is first exceeded at a point, and
mean duration time, showing the ensemble mean number of hours that a threshold is exceeded at a point.
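The three probabilistic products at the top of this list can be illustrated with a minimal sketch. It is not the code used to generate the prototypes, and several things in it are assumptions made for illustration: the function names, the array layout (member, y, x), the 3-km grid spacing, and the neighborhood definition, under which a member counts as a "hit" at a grid point if it exceeds the threshold anywhere within the radius and the probability is the fraction of members with a hit.

```python
import numpy as np


def point_probability(ens, threshold):
    """Fraction of members exceeding the threshold at each grid point."""
    return (ens >= threshold).mean(axis=0)


def neighborhood_probability(ens, threshold, radius_km=40.0, dx_km=3.0):
    """Fraction of members exceeding the threshold within radius_km of a point.

    Assumed definition: a member is a hit at a grid point if any grid point
    within the radius exceeds the threshold in that member.
    """
    exceed = ens >= threshold                      # (member, y, x) booleans
    r = int(radius_km // dx_km)                    # radius in grid points
    _, ny, nx = exceed.shape
    padded = np.pad(exceed, ((0, 0), (r, r), (r, r)),
                    mode="constant", constant_values=False)
    hit = np.zeros_like(exceed)
    for dj in range(-r, r + 1):
        for di in range(-r, r + 1):
            # Keep only offsets that fall inside the circular neighborhood.
            if (dj * dx_km) ** 2 + (di * dx_km) ** 2 <= radius_km ** 2:
                hit |= padded[:, r + dj:r + dj + ny, r + di:r + di + nx]
    return hit.mean(axis=0)


def paintball_masks(ens, threshold):
    """One boolean mask per member; each would be drawn in its own color."""
    return ens >= threshold
```

With a single member exceeding the threshold at one grid point, the point probability is nonzero only at that point, whereas the neighborhood probability spreads the same value over a disk of roughly 40-km radius. This is one way to see why the two products can look quite different for isolated events.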
The point, neighborhood, and paintball prototypes emulated contemporary ways of portraying probabilistic ensemble information. The combination, onset, and duration prototypes were developed based on forecasters’ informational needs that emerged during the participant observations and are further discussed in section 3. The products were created for each of the heavy rainfall, winter, and severe weather scenarios to show rainfall amounts and rates, snowfall amounts and rates, and updraft helicity, respectively. No mean duration plot was generated for the severe weather case because it is uncommon for large magnitudes of updraft helicity to be sustained at a given point for more than one hour. Example sets of the prototype plots are shown in Figs. 1–3 for the heavy rainfall, winter, and severe weather scenarios, respectively. Figure 1 also includes the short descriptions that accompanied the prototype plots when they were shown to the forecaster. The same descriptions were provided for each scenario. To allow the interviewer to show forecasters prototypes that were relevant to their forecast area, the prototype products were generated for four different regions of the country: the Southeastern United States (depicted in Fig. 1), the Northeastern United States (depicted in Fig. 2), the Ohio Valley (depicted in Fig. 3), and the southwestern states of California, Nevada, Arizona, and Utah (not shown).
The prototype plots were shown to the forecaster after they had already answered interview questions pertaining to topics 1–3 described above. Paper copies of the prototype plots along with their associated descriptions were shown to the forecaster in the order presented in Figs. 1–3. For each prototype, forecasters were asked to discuss their interpretation of the information, their preferences for additional or different information (e.g., thresholds, fields), and their potential use of such information if it were operationally available.
The semistructured interviews were conducted with 31 forecasters from 12 WFOs across all 4 NWS regions in the continental United States. A total of 27 interviews were conducted in person, and 4 were conducted by phone. The first and second interviews were conducted to pretest the interview guide with a focus on the question ordering, wording, and length. No significant changes to the interview guide were made after these interviews, and thus both interviews were included in the final dataset. All interviews were conducted between February 2017 and December 2017. By the final interview, “saturation” of ideas was reached, meaning no key insights were mentioned that had not been discussed in earlier interviews; this indicates that the sample size is sufficient to generate robust results (Merriam and Tisdell 2016). The median interview length was 76 min (mean = 79 min; range: 42–124 min). Data on the forecasters’ gender, years of experience working in the NWS, current job position, and type of hazardous weather discussed in the interview are provided in Table 1.
b. Qualitative data analysis and reporting
The aim of qualitative research is to understand “how people make sense of their world and the experiences they have” (Merriam and Tisdell 2016, p. 15). We selected this research approach due to the limited state of knowledge about NWS forecasters’ needs for CAM ensemble information. Qualitative research focuses “on meanings rather than on quantifiable phenomena” and involves “collection of many data on a few cases rather than a few data on many cases” (Schutt 2012, p. 324). Qualitative data are richly descriptive, with quotes used to depict complex themes that cannot be validly represented through bits of data.
All of the interviews were conducted by the first author, and they were audio recorded and transcribed verbatim. The data were analyzed with a focus on the research goals using a reflexive thematic analysis approach (Braun and Clarke 2006; Braun et al. 2019). Through inductive, iterative analysis, we identified themes related to how forecasters interpret and use model guidance and to their critical forecast challenges and CAM ensemble needs. The themes identified can capture both “implicit ideas ‘beneath the surface of the data’” and more explicit ideas, which in turn incorporate both the “essence and spread of meaning” (Braun et al. 2019, p. 3), much like ensemble output has an average and a distribution.
Human subjects approval for the observations and interviews was obtained from NCAR’s Human Subjects Committee, and all forecasters consented to participating. Per common ethical human subjects research practices and the human subjects approval obtained for this study, we committed to maintaining forecasters’ anonymity to the extent possible so that they could freely express their thoughts and opinions. Thus, we do not identify the national forecast centers, weather forecast offices, or individuals who participated in this study. All quotes are anonymized and referenced as interviewee number and, for context, type of hazardous weather discussed. For example, the thirty-first interviewee discussed severe weather and thus is referred to as “No. 31-severe.”
The interview data are rich and nuanced due to forecasters’ expertise and roles. Most of the (sometimes lengthy) quotes are presented in tables with alphanumeric references in the manuscript text. This data presentation approach is intended to facilitate manuscript readability while also preserving the forecasters’ “voices” and the richness of the information they provided.
Last, the data collected, analyzed, and reported here represent the forecasters’ interpretations, perceptions, and experiences as they are from their perspectives—termed an emic focus (Schutt 2012)—not as others might believe they ought to be. Although a reader might not understand or agree with a perspective, understanding the state of knowledge, beliefs, and practice can help identify where improvements might be made.
3. Forecasters’ needs for specific CAM ensemble guidance
A range of needs emerged from the participant observation and interview data about specific CAM ensemble guidance that forecasters would like to have. These needs emerged directly and indirectly as the forecasters performed their forecast processes and discussed their context, roles, and goals.
From the participant observations, two common themes arose about guidance needs: 1) information to help forecasters transition from utilizing deterministic guidance to probabilistic guidance, and 2) map-based guidance about timing of hazardous weather threats. The former represents a class of needs that different types of products might fulfill, while the latter represents a need for a specific type of product. These two themes from the participant observations informed development of three prototype products (the combination plot and the two timing plots) that were used in the interviews. Sections 3a and 3b discuss these two themes and the associated prototype products in greater depth. Recall that the first part of the interview guide asked about participants’ forecast processes, guidance used, and CAM ensemble guidance needs, and then the prototype products were shown and discussed. Thus, the text below includes forecasters’ mentions of information needs before they were shown the products as well as their subsequent feedback about the products. Forecasters expressed CAM ensemble guidance needs beyond those discussed in sections 3a and 3b; these needs are summarized in section 3c.
a. Facilitating the deterministic-to-probabilistic transition: The “combination” plot
It is sometimes stated in the weather community—by forecasters about the public and by model developers about forecasters—that people do not want or use probabilistic forecast information because they prefer simpler, single-valued forecasts or cannot understand uncertainty information (e.g., Hirschberg et al. 2011). Such thinking places the burden on the recipient. However, lack of uptake of any forecast product suggests the information is not useful for some reason; the need to understand those reasons shifts the burden back to the information developer.
Most forecasters who were observed and interviewed understand and believe, both theoretically and practically, that ensemble-based guidance confers benefits that deterministic guidance does not. Still, the transition for forecasters and forecast offices from using deterministic guidance to ensemble-based guidance can be challenging, particularly for some types of information.
When postprocessed probabilistic guidance (e.g., neighborhood probability of exceeding some parameter threshold) was shown or discussed during the participant observations, many of the forecasters described barriers to using the information. Forecasters are used to thinking and working spatially and to assessing specific atmospheric features in three and four dimensions. Most current forms of probabilistic guidance do not map, literally or figuratively, onto how forecasters view the atmosphere (although there is ongoing work to address this, e.g., Rautenhaus et al. 2018). Forecasters also expressed that, for them, postprocessed probabilistic products are a “black box,” with no easy way to understand what data went into the product or how the resulting information was generated. Moreover, forecasters know that models have limitations and inaccuracies, and thus part of their forecast process includes assessing critical model errors (see section 4). Doing so is made more difficult with probabilistic guidance, especially guidance that is postprocessed. Because forecasters are scientists, they inherently want to understand how things work. When they cannot easily understand the workings of a probabilistic product or evaluate its accuracy, their trust in the information and their willingness to use it are reduced.
To help forecasters overcome the barriers described, the ESRL research team developed a prototype product that we termed the “combination” plot, which includes probability contours derived from postprocessing plotted over output from a single deterministic member. In Fig. 1d, an example combination plot is provided that has the 10th, 50th, and 90th percentile neighborhood probability contours of 3-h rainfall exceeding flash flood guidance (FFG)3 overlaid on deterministic 3-h rainfall amounts from the control member. Similar plots for winter and severe weather are shown in Figs. 2d and 3d. Such plots could be generated in multiple ways, such as with contours of either point or neighborhood probabilities overlaid, or with an ensemble mean, member, or maximum underlain.
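The layering idea behind the combination plot can be sketched as follows. This is an illustrative example under stated assumptions, not the ESRL team's plotting code: the function names are hypothetical, the control run is assumed to be stored as member 0, the probability field is assumed to be a fraction between 0 and 1 that is supplied by the caller, and the contour levels are drawn at 10%, 50%, and 90%.

```python
import numpy as np


def select_underlay(ens, choice="control"):
    """Pick the deterministic field drawn beneath the probability contours.

    ens has shape (member, y, x); the control is assumed to be member 0.
    choice may be "control", "mean", "max", or an integer member index.
    """
    if choice == "control":
        return ens[0]
    if choice == "mean":
        return ens.mean(axis=0)
    if choice == "max":
        return ens.max(axis=0)
    return ens[int(choice)]


def plot_combination(ens, prob, choice="control", levels=(10, 50, 90)):
    """Shade the underlay and overlay probability contours (in percent)."""
    import matplotlib.pyplot as plt  # imported lazily; only needed to plot

    fig, ax = plt.subplots()
    shaded = ax.pcolormesh(select_underlay(ens, choice), cmap="Blues")
    fig.colorbar(shaded, ax=ax, label="underlay field")
    contours = ax.contour(100.0 * prob, levels=list(levels), colors="k")
    ax.clabel(contours, fmt="%d%%")
    return fig
```

Passing an integer member index as `choice` corresponds to swapping the underlay from one member to another, while `"mean"` and `"max"` correspond to the alternative underlays mentioned above.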
When shown the combination plot, most of the forecasters interviewed expressed that they liked it and found it useful (Table 2). Forecasters’ favorable comments indicated that the combination plot helped them better understand and have more confidence in the probabilities (Quotes 2A–2B) and it helped them recognize lower probability risks in some areas (Quotes 2C–2D). Some of the forecasters indicated they would like the ability to scroll among all the different ensemble members to see each of them as an underlay (Quote 2E), and some forecasters liked being able to compare the probabilities against the ensemble control member, as in the prototype plot developed (Quotes 2E–2G).
However, not all forecasters liked the combination plot. One common reason was that it comprises “way too much information” (No. 21-winter) with “too much going on” (No. 24-winter) and thus would require a lot of time for the forecaster to comprehend. A second reason, expressed by a few forecasters, related to their perceived disconnect between looking at a single piece of guidance and an envelope of guidance from an ensemble (Quotes 2H–2I). For instance, Forecaster No. 11 (-winter) indicated, “I don't care what members are showing what. If they all have equal probability of occurrence, then it doesn't matter.” These forecasters do not seem to need a product to transition from using deterministic to probabilistic information because of how they understand and value information from CAM ensembles.
b. Developing new CAM ensemble guidance: Map-based threat timing information
A forecaster’s job is not only to identify whether and where hazardous weather will occur but also when it will occur. The timing of hazardous weather at different locations is inextricably linked to the risk it poses and thus to how a forecaster assesses and communicates the risk to core partners and other users. For instance, forecasters evaluate the risks from precipitation by evaluating amounts over some time frame. This can include assessing multiple waves of heavy precipitation to determine whether there is a threat of flooding or flash flooding (Table 3, Quotes 3A–3B) or assessing whether there is sufficient snowfall to warrant a watch, warning, or advisory. Forecasters also evaluate if hazardous weather might occur at times when people are particularly vulnerable to harm, such as when people are exposed outside with few protective options (Quote 3C), or when weather interacts with transportation systems to amplify negative consequences (Quote 3D).
Because of the importance of the timing of weather at different locations, our data suggest that forecasters often seek information about the timing of hazardous weather represented spatially. To ascertain and convey such information, forecasters interrogate observations (e.g., radar reflectivity), deterministic guidance from sequential runs valid at a given time [i.e., d(prog)/dt or time lagged] or from multiple models (i.e., poor man’s ensemble), and ensemble plume diagrams, which provide guidance about a particular parameter (e.g., rainfall amount) over time at a geographical point (Quotes 3E–3F). Although these approaches work for forecasters, missing from their toolkit is map-based ensemble-derived information about threat timing that could more directly meet their needs.
To evaluate the possible benefits of providing this kind of information, the ESRL research team developed two prototype products to illustrate different types of threat timing information that could be derived from an ensemble. One product, termed the “onset” plot, maps the ensemble mean hour of the day when a threshold is first exceeded at a point, with blank areas indicating that no ensemble member exceeds the threshold. The prototype plot shown in Fig. 1e is for the ensemble mean onset time that rainfall rates exceed a threshold. Snowfall rates were shown for the winter weather scenario (Fig. 2e), and updraft helicity values were shown for the severe weather scenario (Fig. 3e). These prototype figures have the data plotted over an 11- or 12-h window, but ensemble mean onset time could be generated over shorter windows to address multiple rounds of hazardous weather. The other product, termed the “duration” plot, maps the ensemble mean number of hours that a threshold is exceeded at a point. The prototype plot shown in Fig. 1f is for the ensemble mean number of hours that rainfall rates exceed a threshold. Snowfall rates were shown for the winter weather scenario (Fig. 2f), but no duration plot was generated for the severe weather scenario due to the isolated and transient nature of severe convection. These prototype figures have the data plotted over shorter (4 h, Fig. 1f) and longer (11 h, Fig. 2f) windows. The information in Figs. 1f and 2f does not necessarily represent the sequential number of hours that a threshold is exceeded, but such information could be generated. Moreover, the prototype duration plots do not convey the number of ensemble members that contribute to the average values shown, which could be misleading in instances where the information plotted is from only one or few members; thus future prototypes could combine the duration information with probability of occurrence.
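A minimal sketch of how onset and duration fields of this kind might be derived is given below. It is not the prototype code and rests on several assumptions: hourly output stored as an array with shape (member, time, y, x), onset expressed in forecast lead hours rather than clock hours (which avoids averaging across midnight), and means taken only over members that exceed the threshold at least once. The function name is hypothetical. The count of contributing members is returned alongside the means so that averages based on only one or a few members can be identified.

```python
import numpy as np


def onset_and_duration(ens, threshold, lead_hours):
    """Ensemble mean onset and duration of threshold exceedance.

    ens        : array (member, time, y, x) of hourly model output
    lead_hours : forecast lead hour of each time step
    Returns (mean_onset, mean_duration, n_members). Points where no member
    exceeds the threshold are NaN in both mean fields (blank on a map).
    """
    hours = np.asarray(lead_hours, dtype=float)
    exceed = ens >= threshold                     # (member, time, y, x)
    ever = exceed.any(axis=1)                     # member exceeds at all?
    first = exceed.argmax(axis=1)                 # index of first exceedance
    onset = np.where(ever, hours[first], np.nan)
    # Count of exceeding hours; these need not be consecutive.
    duration = np.where(ever, exceed.sum(axis=1), np.nan)
    n_members = ever.sum(axis=0)
    with np.errstate(invalid="ignore", divide="ignore"):
        mean_onset = np.nansum(onset, axis=0) / n_members
        mean_duration = np.nansum(duration, axis=0) / n_members
    return mean_onset, mean_duration, n_members
```

Dividing `n_members` by the ensemble size gives a point probability of occurrence, which is one way the duration information could be paired with probability, as suggested above.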
For both the onset and duration plots, the timing of other parameters or thresholds could be extracted and mapped. Additionally, the time windows over which the output is generated could be a customizable setting of the interface through which the forecaster analyzes and displays model data.
Most of the forecasters interviewed liked the map-based threat timing information, and they discussed ways the guidance could be useful to them (Table 4). The forecasters mentioned, for example, ways that the mean onset and duration products could help them make a decision about the start time of a warning (Quote 4A) or whether flooding is a risk (Quotes 4B–4C). Many forecasters described how the timing products could help with their messaging to the public and to their core partners (Quotes 4D–4G). It also was mentioned how the timing products, particularly the onset information, could help determine staffing needs for severe weather operations (Quotes 4H–4I).
The few forecasters who were less enthusiastic about the map-based timing guidance gave different reasons for their views. One forecaster described the ensemble mean onset timing product as “a little non-intuitive […] because you’re representing time as space” (No. 15-severe). This same forecaster also questioned whether convection that occurs later in time is being covered up by early convection, a sentiment that also was raised by another forecaster (Quote 4J). A third forecaster expressed confusion about whether long tracks of updraft helicity represent the longevity of a single, strong storm or storm regeneration (Quote 4K).
c. Additional needs for specific CAM ensemble guidance
Beyond the types of information represented in the combination and timing plots, the forecasters interviewed discussed a number of additional types of information they would like to have from CAM ensembles, as well as ways that they would like to be able to manipulate or customize that information. Many forecasters volunteered these needs early on in the interview, but viewing the prototype CAM ensemble plots (Figs. 1–3) spurred additional ideas.
We do not discuss these additional suggestions in this article due to space considerations, but a synthesis is provided in Table 5, clustered into two categories. The category “model parameters and outputs” comprises forecasters’ requests for extraction of different characteristics from the ensemble distribution (e.g., earliest onset time of a hazard from the ensemble) and for additional model fields (e.g., ensemble output of precipitable water). The category “statistical and post-processing” comprises forecasters’ desires to be able to interrogate the ensemble system on-the-fly for criteria that are relevant to their forecast problem of the moment (e.g., query-able exceedance thresholds). The list of needs is not meant to be exhaustive but rather is meant to offer insight into some of the additional types of information and ways of interrogating it that forecasters suggested may be useful to them.
4. Forecasters’ needs for CAM ensemble guidance in support of their shifting role toward IDSS
Coincident with the provision of CAM guidance are ongoing changes in the NWS forecasting environment. Chief among these is a change toward the “partner and customer-centric service delivery model” (NWS 2019) of providing impact-based decision support services. IDSS “requires forecasters to ‘go above and beyond the forecast’ to deliver improved service to government agencies” (NWS 2017b) through the “provision of relevant information and interpretive services to enable core partners’ decisions when weather, water, and climate have a direct impact on the protection of lives and livelihoods” (NWS 2018a). The forecasters interviewed regularly referenced IDSS, including how they serve in this role and what challenges they encounter. Because IDSS involves decisions at local levels with NWS core partners and because there is uncertainty in hazardous weather at those spatial and temporal scales, CAM ensemble guidance has the potential to meet forecasters’ IDSS needs.
In discussing the provision of IDSS, some forecasters explicitly emphasized it as a shift from the past in how they do their job, including in how they spend their time and how they think (Table 3: Quotes 3B, 3D, Table 6: Quotes 6A–6B). Forecaster No. 28 (-rain) articulated this shift through an example of a rainfall event that may only produce a moderate amount of rain but that can be high impact in certain circumstances and thus “causes partners all sorts of grief” (Quote 6B). Several other forecasters described their IDSS role through examples of the kinds of information their partners need. Often, these needs pertain to the timing of the risks posed by hazardous weather, as noted in section 3b. Forecaster No. 30 (-rain) succinctly explained that, for significant weather events, “core partners want to know when will it start, when will it be at its worst, and when will it end?” This sentiment was echoed by many of the other forecasters who participated in this study. More specific partner needs for information about threat timing include timing of waves of precipitation in order to gauge flood threats (Quote 3B); timing of convective weather, especially lightning, in order to protect people during outdoor events (Quote 3C); and whether and when snowfall, especially high precipitation rates, will occur in order for departments of transportation to plan for plowing (Quotes 3D, 4G, 6C). As part of their IDSS role, multiple forecasters also discussed providing to partners the most likely scenario for hazardous weather coupled with goalposts, typically in the form of a best-case and worst-case scenario (Quotes 6D–6E; see also Novak et al. 2008, 2014). Forecaster No. 28 (-rain) explained that when he provides such scenarios to his partners, he draws on ensemble guidance in order to determine how to qualitatively express forecast uncertainty through words and tone, a process he characterized as nuancing the deterministic solution (Quote 6F).
A related theme that commonly emerged as the forecasters discussed providing IDSS is that of conveying forecast confidence (Table 7). Many forecasters discussed how they use (or see the potential to use) CAM guidance to assess and communicate confidence. One way forecasters discussed this topic was using probabilistic output from CAM ensembles to shape their own confidence. For instance, forecasters mentioned that higher probabilities (e.g., of 80% or greater of a parameter that is significant for their forecast process) would increase their confidence, which they may share with users (Quotes 7A–7B). Additionally, Forecaster No. 15 (-severe) discussed how probabilistic guidance can help convey confidence to some partners for low-likelihood, high-consequence risks (Quote 7C). The other way forecasters discussed using CAM guidance to inform their confidence is by looking at model trends for signals of consistency over time (Quotes 7D–7E), monotonic changes (Quote 7F), or sharpening of the forecast (Quote 7G). Forecaster No. 30 (-rain) also explained that she assesses model consistency by examining ensemble member clustering, or lack thereof, as well as continuity among different models (Quote 7D). Forecaster No. 19 (-severe) extended the idea of determining confidence by evaluating model trends from deterministic output to CAM ensemble output, indicating that he would look for trends in hourly modeled probabilities of updraft helicity to determine if and how they are changing in space and time (Quote 7H).
Although these quotes illustrate that forecasters have developed some understanding about and strategies for providing IDSS, several forecasters discussed challenges for serving in this role (Table 8). One challenge that was raised by a few forecasters is the variability among partners in what constitutes relevant information and interpretive services (Quotes 8A–8B). For some, this challenge is exacerbated by the shift from a long period of the NWS focusing on quantitative, verification-based performance to a more recent focus on impacts, which are more qualitative (Quote 8B). For instance, one forecaster explained that the official winter storm warning criteria for his area of responsibility are 8 in. in 12 h or 12 in. in 24 h, but that “moving forward into more of a DSS world, those thresholds are stretched up and down” (No. 2-winter).
Forecasters also discussed multiple challenges specific to using CAM ensemble and other guidance to provide IDSS. One challenge is how forecasters can interpret and communicate a hazardous weather threat in a way that is meaningful for partners when there is significant uncertainty, such as when there is substantial spread in the ensemble solutions (Quote 8C) or when there is a low probability of a weather event with significant societal impacts (Quote 8D). Related to the difficulty of communicating uncertainty to partners is the challenge that arises when probabilistic products are not calibrated to be reliable (see section 5 for further discussion), especially when those errors have significant implications, such as money spent on hazard mitigation (Quote 8E) or a visible false alarm to thousands of people (Quote 8F). A final model-based challenge for IDSS is when a marginal event occurs that was not captured by the models but that still is high-impact to partners, such as the snow event example discussed by Forecaster No. 2 (-winter) that can lead to flight diversions at the local airport (Quote 8G). Precisely because the models are imperfect and are tools to aid forecasters in better predicting threats, some forecasters emphasized the need to retain their meteorological skills and blend those skills with guidance in order to effectively communicate risks to their users to reduce harm (Quotes 8G–8I).
5. Forecasters’ needs for model verification and training
In addition to needs for specific CAM ensemble information (section 3) and information relevant to their IDSS role (section 4), the forecasters interviewed expressed multiple needs for model verification and calibration and for being better trained to use new guidance. As with previous results, these needs emerged both implicitly and explicitly, and in reference to model guidance generally and CAM ensembles specifically.
a. Needs for CAM ensemble verification and calibration
Forecasters know that numerical weather prediction models are imperfect. They discussed many examples of model errors that they have learned about experientially, for example, with precipitation amount, duration, and placement or with certain synoptic situations (Table 9, Quotes 9A–9D). Many forecasters also discussed a broad, general desire to have objective information about model biases and about when a model does and does not perform well (Quote 9D).
In addition to a general need to be informed about model skill, forecasters discussed a number of more specific model verification needs. Some forecasters indicated that in order to use model output, they need to know what it “means.” In the context of probabilistic output, this is one way they expressed wanting to know if the probabilities were calibrated to be reliable (Quotes 9E–9F). This is also implied by Quote 8E from section 4 where Forecaster No. 14 (-winter) indicated that “My biggest problem with [percent probabilities] is 90 doesn’t always mean 90.” This quote further reflects that a forecaster can have low confidence in a high-probability forecast. Other forecasters more directly expressed a desire to have reliable probabilities (Quote 9G). These quotes also reflect forecasters’ need for training about how to interpret probabilistic guidance, which is discussed further in the next subsection.
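The notion that “90 doesn’t always mean 90” can be illustrated with a simple reliability check, in which forecast probabilities are grouped into bins and the mean forecast probability in each bin is compared with how often the event was actually observed. The sketch below is a generic illustration under stated assumptions, not a verification method used in this study. The function name, the equal-width binning, and the sample data are hypothetical.

```python
def reliability_table(forecast_probs, observed, n_bins=10):
    """Compare forecast probabilities with observed frequencies (hypothetical helper).

    forecast_probs: probabilities between 0 and 1, one per forecast case.
    observed: 1 if the event occurred for that case, else 0.
    Returns one row per non-empty bin.
    """
    counts = [0] * n_bins
    prob_sums = [0.0] * n_bins
    hits = [0] * n_bins
    for p, o in zip(forecast_probs, observed):
        i = min(int(p * n_bins), n_bins - 1)  # a probability of 1.0 goes in the top bin
        counts[i] += 1
        prob_sums[i] += p
        hits[i] += o
    return [
        {
            "mean_forecast": prob_sums[i] / counts[i],
            "observed_frequency": hits[i] / counts[i],
            "cases": counts[i],
        }
        for i in range(n_bins)
        if counts[i] > 0
    ]


# Made-up data: ten 90% forecasts verify 6 times; ten 10% forecasts verify once.
probs = [0.9] * 10 + [0.1] * 10
obs = [1] * 6 + [0] * 4 + [0] * 9 + [1]
table = reliability_table(probs, obs)
```

For perfectly reliable guidance, the observed frequency in each bin would match the mean forecast probability. In this made-up sample the 90% forecasts verify only 60% of the time, whereas the 10% forecasts are reliable. Stratifying such a table by weather scenario would require far more cases than shown here.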
Forecasters also expressed wanting to know what the verification “means” with respect to what dataset the metrics were calculated over, including over what timeframes and in what locations (Quotes 9H–9I). Related to this, Forecaster No. 13 (-severe) indicated that he wants model verification to have “more honesty,” meaning done and shown for a collection of events versus only for a few successful events (Quote 9J). Forecasters also want to be able to stratify the data so that they can assess model performance for the different situations that they forecast for, mapped onto the scenario-based ways that they think when forecasting. These scenarios include different atmospheric forcing mechanisms (Quotes 9K–9L). They also include weather situations that are high-impact in their forecast area, such as when more than one or two inches of rain is forecast (Quote 9L) or when snow is forecast with high winds; in other words, the situations that Forecaster No. 24 (-winter) describes as “when it really matters” (Quote 9M). These weather scenarios are high-impact because they can significantly affect society and partners’ decisions (Morss et al. 2008), but they are not statistically extreme events, for which model verification would be challenging at best. In the absence of being provided with scenario-based objective verification information, forecasters aim to calibrate guidance for themselves (Quotes 9N–9O). But many forecasters acknowledged how difficult it is to do their own verification due to lack of time and lack of ability to store model data for analysis (Quotes 9P–9Q).
The forecasters discussed a variety of other types of model verification that could help them with utilizing CAM ensembles (and other model guidance) in their forecast process. Examples include measures that compare current output to the model’s climatology (Quote 9R), measures of model run-to-run consistency and trends (Quote 9S), and measures that quantify spatial or temporal errors (Quote 9T).
b. Needs for CAM ensemble-specific training
The need for training on how to interpret and use model guidance emerged pervasively from the forecasters, revealing the important and multifaceted nature of this issue. Forecasters’ opinions were based on their experiences with on-the-job training about forecast tools (footnote 4) generally and on their thoughts on training about CAM ensemble guidance specifically.
A common theme from the forecasters when discussing training about models, including CAM ensembles, was the need for training that is “speaking, very much, the language of a Weather Service employee” (Quote 9A). This means providing less detail (but not zero details) about the “guts of the model” and more content that is clearly and directly relevant to forecast operations, specifically how and when model guidance applies to and benefits their forecast processes (Table 10, Quotes 10A–10C). Forecaster No. 18 (-severe) summarized this as needing to know “what is this tool, how can I use it, when can I use it, what are the strengths, and what are the limitations of it [because] when there are so many tools that are out there, it’s easy to get lost” (Quote 10B). For ensemble output, some forecasters specifically mentioned the importance of training to address the challenges they experience with interpreting and using probabilities (Quotes 10D–10G), particularly neighborhood probabilities, which Forecaster No. 14 (-winter) described as having “two qualifiers,” the probability and the area, making it difficult to understand (Quote 10F). Correspondingly, Forecaster No. 1 (-winter) observed that forecasters do not look at neighborhood probability products very often and thus would need to be trained on how to understand them if they are to be provided (Quote 10G).
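The “two qualifiers” of a neighborhood probability can be seen in a small sketch. In one common formulation, assumed here for illustration and not necessarily the one used in the products that forecasters discussed, the value at a grid point is the fraction of ensemble members that exceed a threshold anywhere within a surrounding area. The function name, the square neighborhood, and the toy grids are hypothetical.

```python
def neighborhood_probability(member_grids, threshold, row, col, radius):
    """Fraction of members exceeding the threshold within `radius` grid cells
    of (row, col). Hypothetical helper; a square neighborhood is assumed."""
    hits = 0
    for grid in member_grids:
        n_rows, n_cols = len(grid), len(grid[0])
        # Does this member exceed the threshold anywhere in the neighborhood?
        found = any(
            grid[r][c] > threshold
            for r in range(max(0, row - radius), min(n_rows, row + radius + 1))
            for c in range(max(0, col - radius), min(n_cols, col + radius + 1))
        )
        hits += found
    return hits / len(member_grids)


# Two hypothetical members on a 3 x 3 grid; only member A exceeds the
# threshold, and only in the upper-left corner.
member_a = [[9, 0, 0], [0, 0, 0], [0, 0, 0]]
member_b = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
grids = [member_a, member_b]

point_prob = neighborhood_probability(grids, 5, row=1, col=1, radius=0)
area_prob = neighborhood_probability(grids, 5, row=1, col=1, radius=1)
```

At the center point, the probability is 0 when only that grid cell is considered but 0.5 when the neighborhood extends one cell in each direction. The same ensemble therefore yields different probabilities depending on the area qualifier, which is the interpretive difficulty that the forecasters described.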
The forecasters had varying opinions about the best mechanisms for training, especially in light of their workloads. Regardless of how a forecaster learns about a new product, they noted that it can be helpful to have readily available and succinct ways of recalling what a given piece of guidance means, especially given the vast number of model products available coupled with the fact that a forecaster may utilize a product only intermittently. An interactive feature, such as a question mark on a web page that a forecaster could click on to obtain a brief description of each product, is one simple refresher mechanism that was suggested (Quote 10H).
In addition to discussing formal training, forecasters described a more general need for improved communication about CAM ensembles. They noted that there are multiple communication sources and channels for learning about new or updated model information, and that this communication is sometimes fragmented, unreliable, slow, or ineffective (Quotes 10I–10K). The ramification of such ineffective communication is that forecasters sometimes are left unaware about existing tools that could aid their forecast process. In other words, even if new CAM ensemble guidance is developed that could be useful to forecasters, the process of operationalizing it breaks down if they are unaware of it.
Understanding forecasters’ needs for training and communication about model information, including CAM ensembles, was not a focus of the research conducted here. Nevertheless, we found that addressing these needs is a critical component of enabling more effective use of CAM ensembles by forecasters. Our results also suggest that there is not a one-size-fits-all approach—for either the population of forecasters to be trained or for the content of the training. Thus, more focused work on this topic is needed to advance the development of useful and usable CAM ensemble guidance for forecasters.
6. Summary and discussion
This paper develops knowledge about NWS forecasters’ practices, perspectives, and decision contexts in order to identify their needs for CAM ensemble information. The research is based on qualitative data collected from over 50 NWS forecasters through participant observations and semistructured interviews.
We found that forecasters’ feedback clusters into three areas, which represent different scales of needs for CAM ensemble information: 1) needs for specific types of products, 2) needs for information that can support forecasters’ roles in providing IDSS, and 3) needs for accompanying model verification information and training.
Needs for specific types of CAM ensemble guidance emerged explicitly and implicitly from the forecasters (section 3). Probabilistic guidance can be perceived by forecasters as a black box that is difficult to interrogate to understand the output, including its errors in any given forecast situation. This finding echoes that from Novak et al. (2008) who found that forecasters prefer to interact with ensemble guidance rather than have black-box output. This suggests a need for CAM ensemble information that bridges deterministic and probabilistic forecast representations. The prototype “combination” plot (Figs. 1d, 2d, 3d) developed by the research team is an example of a way to build this bridge. Some forecasters believed this type of information could help them better understand and evaluate ensemble guidance, but it was too complex for other forecasters. Thus, refining and prototyping additional “bridge” products for forecasters may be useful. A second type of needed CAM ensemble guidance is for information about the potential timing of hazardous weather. Such information can help forecasters better predict and communicate weather risks (e.g., whether precipitation will last long enough to cause flooding or whether snowfall may occur during rush hour), and thus better support partners’ decision-making. CAM ensembles are well-suited to providing threat timing information (including forecast uncertainty, such as the earliest possible onset of hazardous weather or exceedance of a threshold) in a map format for efficient use by forecasters. When shown the prototype timing plots (Figs. 1e,f, 2e,f, 3e), most forecasters indicated that these types of information could be useful. This suggests that more work should focus on deriving, verifying, and providing such output from CAM ensembles. Forecasters articulated additional specific CAM ensemble needs, synthesized in Table 5. These needs include different parameters from those discussed here as well as a desire for a dynamic system that would allow forecasters to define and generate their own type of model-derived information based on the forecast scenario rather than relying on predefined, static outputs.
Needs for information that can help forecasters provide IDSS emerged as a second key area (section 4). The change in the forecasters’ role toward providing IDSS is significant and salient. Forecasters are increasingly asked to provide information about the possibility of different high-impact weather threats. Such information requests commonly are for timing information and for specific scenarios, and they pertain to partners’ decision-making at refined space and time scales. CAM ensemble output could help forecasters support these IDSS needs. This utility is already being realized to some degree, but there is room for more development of different CAM ensemble parameters and associated forecast uncertainty information to meet these growing needs. Moreover, forecasters discussed that CAM ensemble guidance can help them assess and convey confidence to their partners, a finding that Evans et al. (2014) also reported. However, it is unclear how CAM ensemble guidance shapes their confidence, which suggests that future work should be done to investigate forecasters’ thinking in this regard.
Needs for CAM ensemble-specific verification and training were conveyed by many forecasters (section 5). Objective model verification information helps forecasters understand how much to rely on different model guidance, where, and when. Such needs discussed by forecasters included understanding model biases; knowing whether probabilistic guidance is calibrated to be reliable; knowing over which cases, timeframes, and geographies verification was conducted; and having ways to stratify the verification statistics. There has been an increased focus on augmenting traditional verification metrics with metrics that are meaningful to users (e.g., Davis et al. 2006a,b; Gilleland et al. 2010; Sobash and Kain 2017; Wolff et al. 2014). The needs expressed here reinforce and expand on this notion for additional verification metrics, for CAM ensembles and beyond, that are forecaster-oriented such that they are closely aligned with forecast processes and situations.
In order for forecasters to be able to use CAM ensembles, they need to know what information is available, how to access it, and what it means. This need for better and more CAM ensemble-specific training emerged strongly from forecasters. This result is all the more noteworthy given that no interview questions explicitly asked about training. Forecasters indicated that training content can be disproportionately heavy on how the model was developed (i.e., the research) and light on how to use the model output (i.e., the operations). These forecaster training needs mirror their verification needs in that there is a need for the information to be packaged in a way that is forecaster-oriented. Thus, a shift toward training about the basics of guidance coupled with more information about how output can be used in operations, with forecasting examples, would be beneficial for forecasters. This finding mirrors that from Novak et al. (2008) from over a decade ago, which suggests that this critical need of the forecasters requires more attention.
There are limitations to the research conducted here. The research was simplified in that it did not elicit forecasters’ feedback about the prototype products during their actual, real-world forecast process or relative to the existing observational and model data that they have. Although we took care to gather data from NWS forecasters from across the United States about multiple types of hazardous weather, the results cannot be generalized to all forecasters in all areas for all hazard situations. Some CAM ensemble guidance may be of limited use in some situations, such as neighborhood-smoothed probabilities where there are tight gradients (e.g., areas of complex topography, lake-effect snow, bands of heavy snow or rainfall, flash flood guidance values), or when the number of ensemble members is so large that paintball plots become unintelligible. As such, we do not suggest that the CAM ensemble products discussed here, or any product for that matter, serve all forecaster needs all of the time. Furthermore, we recognize that meteorological research will yield more complex CAM ensemble systems in the future, with more members and finer resolution. Also, new approaches to postprocessing are needed because of limitations of neighborhood techniques; related work is ongoing (Blake et al. 2018; Dey et al. 2014, 2016), and future work could leverage these efforts through additional product development for and evaluation by forecasters. Despite these limitations, we believe that many of the results reported here will apply in the future, particularly the results about forecasters’ needs for information about hazardous weather timing; for forecast platforms to dynamically interrogate ensemble guidance; for types of information that help provide IDSS; and for model verification, calibration, and training. The forecaster needs summarized here represent part of the CAM ensemble information development process, and we propose that such research with forecasters should continue in conjunction with future model development efforts.
The research conducted here has important implications for efforts to develop new CAM ensemble information. Such efforts often are motivated by a desire to reduce forecasters’ information overload by getting them to adopt new tools that are intended to streamline their forecast process. These motivations are well intentioned. However, the real challenge to address is arguably the continued creation and provision of new products without in-depth understanding of the forecasters’ point of view. In other words, we propose that instead of asking how to get forecasters to adopt new tools, the question that ought to be asked is how to effectively create and transition products that forecasters actually want, need, and can use.
Our approach to doing this was to integrate social science research into the traditional R2O process. We employed a risk communication research approach wherein developing new CAM ensemble guidance that is useful and usable to forecasters starts with understanding their job and decision contexts, that is, the roles and functions they perform, the knowledge and information that they do (and do not) possess to carry out those responsibilities, and their experiences, values, and cultures that comprise the backdrop in which they operate. Utilizing this approach yielded information about forecasters’ perspectives and needs beyond their direct feedback about specific products and the concrete informational needs that they can articulate. Robustly incorporating such social science research allowed us to look more broadly and listen more deeply in ways that reveal critical mismatches, gaps, and potential solutions. Some of these pertain to a specific, needed piece of information, such as the combination plot or hazard timing information. Others are broader, such as our findings about the relevance of CAM ensembles in forecasters’ complex and evolving role to effectively provide IDSS, and the importance of verification and training for forecasters to effectively utilize the guidance that is available.
The social science research reported here was part of a bigger project and thus was informed by the complementary research into and expertise pertaining to CAM ensemble modeling and postprocessing, verification, and operational forecasting. Such interdisciplinary efforts for improving human weather forecasting, including with probabilistic information, have been advocated for several decades (Murphy and Winkler 1984; Doswell 2004; Demuth et al. 2007; NAS 2017). Although interdisciplinary research can prove difficult to do meaningfully (Morss et al. 2018), the benefits are worth the effort.
The authors kindly thank all of the NWS forecasters who shared their invaluable time, expertise, and perspectives for this research. We also thank Lance Bosart, reviewer Steve Zubrick, and two anonymous reviewers for providing helpful comments on the manuscript. This research is supported by National Oceanic and Atmospheric Administration (NOAA), U.S. Weather Research Program (USWRP) Research to Operations (R2O) Award EA-133W-16-CQ-0051. The National Center for Atmospheric Research is sponsored by the National Science Foundation.
Publisher’s Note: This article was revised on 29 June 2020 to include an additional affiliation for co-author Jankov that was missing when originally published.
The High-Resolution Ensemble Forecast (HREF) v2 became available in AWIPS in late 2017.
The U.S. National Weather Service defines “severe weather” only as tornadic storms or thunderstorms that produce hail greater than or equal to 2.54 cm (1 in.) and/or winds greater than or equal to 50 kt (1 kt ≈ 0.51 m s−1; 58 mph) (NWS 2018b). Winter weather and heavy rainfall are not considered to be severe weather.
Flash flood guidance (FFG) is a spatially variable, two-dimensional field. FFG is defined as “a numerical estimate of the average rainfall over a specified area (or pre-defined grid) and time interval required to initiate flooding on small streams” (NWS 2017a, p. 9; see also Clark et al. 2014; Schmidt et al. 2007).
Many forecasters had completed GOES-R training shortly before the interviews were conducted, and thus this experience with learning about the new satellite platform and the observational data provided by it influenced their views.