Freezing precipitation, including freezing rain, freezing drizzle, and ice pellets, presents substantial hazards to transportation, energy, and infrastructure. Most quantitative climate-length (>30 years) datasets for freezing precipitation are obtained from in situ instrumentation, such as National Weather Service (NWS) automated observing systems or the NWS Cooperative Observer Program, that can be spatially and temporally inhomogeneous. This work investigates whether the 32-km North American Regional Reanalysis (NARR) is a viable dataset for developing a comprehensive and high-resolution spatially gridded dataset for freezing precipitation and its associated meteorological environment, with a focus on the south-central United States. NARR includes categorical precipitation type as a variable; to permit a translatable method across other gridded multidimensional reanalyses, however, a multialgorithm approach is also used to extract environmental conditions, event counts, and liquid water equivalent (LWE) for freezing precipitation, defined as total accumulated freezing rain and ice pellets. The resulting datasets are evaluated spatially and temporally against hourly and daily event counts and LWE compiled from 13 first-order NWS stations, the National Centers for Environmental Information Storm Data product, and “meteorological phenomena identification near the ground” (mPING) observations. Very good statistical agreement is evident for many of the station sites, and NARR is able, in most cases, to reproduce years with heavy-ice events and the spatial extent of such events. Climatological freezing precipitation tends to be underestimated in the western subdomain and overestimated in the south. It is concluded that the derived datasets could be a useful tool for climatological research and hazard analysis, with some caveats.
Freezing precipitation is an infrequent but potentially damaging hazard in the south-central United States. The accretion of freezing rain on power lines, structures, and roadways during ice storms causes significant economic damage and travel disruption (Call 2009). Ice storms in this region have resulted in hundreds of millions of dollars in damages over the past decade (e.g., Grout et al. 2012). During the winter of 2013/14, several cold air outbreaks in the southern plains and southeastern United States produced widespread mixed winter precipitation. Numerous power outages and hundreds of traffic accidents, road closures, and delays were reported [National Centers for Environmental Information (NCEI) Storm Data]. Around the nation, approximately three-quarters of state departments of transportation overspent their planned budgets on winter maintenance because of the frequency of winter weather (Slone 2014).
Freezing precipitation—especially, freezing rain—is difficult to measure. During heavy icing, automated weather stations may experience power disruption and freezing of moving instrumentation (e.g., anemometers), leading to missing data. Nonetheless, National Weather Service (NWS) Automated Surface Observing System (ASOS) stations, which became the primary in situ weather-observing system in the mid–late 1990s, have an icing sensor that is able to report freezing rain (Jones et al. 2004; Ramsay and Laster 1995). Freezing precipitation was previously reported manually by observers, such as those in the the Cooperative Observer Program (COOP), which continues today. Despite ASOS automation, ice pellets are not recorded automatically and must be manually augmented. Not all ASOS stations are capable of the level of staffing necessary for accurate and timely augmentation, leading to underestimated ice-pellet reports (Elmore et al. 2015). Despite these limitations, climatological-length data for freezing precipitation are based on the in situ observations from NWS ASOS and COOP stations, which are of fairly high density but are spatially and temporally inhomogeneous. First-order stations, situated close to major cities and airports, possess the most complete records (e.g., Changnon 2004). Spatial freezing-precipitation climatological normals from in situ data have been developed for the contiguous United States and various subregions by Gay and Davis (1993), Rauber et al. (2001), Changnon and Karl (2003), and Cortinas et al. (2004). Changnon developed a freezing-rain dataset that compiles numerous station-based hourly and daily ice counts extending from the early twentieth century through 2000 (Changnon 2004). Changnon also developed a 50-yr dataset using insurance data and railroad ice-accretion estimates (Hay 1957; Changnon 2003). Spatial information for major ice storms has been more recently developed by the Cold Regions Research and Engineering Laboratory (2015). The storms can be examined using a GIS interface, showing the approximate spatial extent of 1-in-50-yr return periods for ice accretion.
These resources are useful to a range of applications, but it is advantageous to have a dataset that is of high spatial and temporal resolution and is spatially gridded, providing consistent spacing between data points. In addition, products that include both occurrence of freezing precipitation and the attendant meteorological parameters provide users with a greater range of applications and opportunities for data analysis and decision-making. This could include the analysis of physical, environmental, and meteorological contexts of freezing precipitation or deriving spatial information that is not resolved by the coarse station network. Blunden and Arndt (2011) pioneered using reanalysis data to examine the veracity of spatially gridded freezing precipitation—specifically, freezing rain. They selected the high-resolution (32 km) North American Regional Reanalysis (NARR) (NOAA/NCEP 2004; Mesinger et al. 2006) because it includes estimates of precipitation type using a categorical yes/no binary “flag.” Their research showed promising results as based on comparisons with stations sited in Oklahoma and Arkansas, but the study was not pursued further at the time.
Here, this line of inquiry is extended by collecting and analyzing NARR-based freezing precipitation, defined as the total accumulation of ice pellets and freezing rain, to develop a validated, subdaily dataset of freezing-precipitation frequency and accumulation from January 1979 to April 2014 across the south-central United States. We apply two approaches to determine precipitation type: 1) we use the native NARR-categorical estimates calculated from application of the NCEP algorithm applied to the Eta Model, and 2) we implement three common precipitation-type algorithms (partial thickness, Bourgouin, and Baldwin and Contorno; these algorithms are described in more detail in section 2), applied separately to NARR and modified slightly for the regional environment. Both methods are subsequently compared with NWS station observations, along with Storm Data publications and crowdsourced “meteorological phenomena identification near the ground” (mPING; https://mping.ou.edu) reports (Elmore et al. 2014) for select case studies, to determine the reanalysis spatial and temporal veracity, specific event precision, and freezing-precipitation intensity.
The multialgorithm method was developed primarily to be used in other gridded datasets for which precipitation type is not available. In particular, we intend to establish an approach that could be applied to subdaily regional climate projections. Validation of separate algorithms applied to NARR, in addition to the available categorical precipitation output, clarifies the representativeness of both with respect to observations and ascertains whether use of more than one algorithm is able to improve upon the native NARR estimates. Section 2 describes the methods used; section 3 details the results of various validations of NARR trends, seasonal distribution, extremes, and spatial coverage; and section 4 discusses key conclusions and planned future research.
2. Data and methods
a. Freezing-precipitation definition
Freezing precipitation was defined as occurrence and accumulation of both ice pellets and freezing rain. This broad definition was necessary given the large uncertainties that are inherent in algorithm classification, especially for these phase types (e.g., Reeves et al. 2014). The impacts of freezing rain and ice pellets are different, with freezing rain accreting and ice pellets collecting (Lackmann 2011, p. 225). Nonetheless, at this time, it is not possible to extract the two types separately with confidence. In many winter storms, these precipitation types occur in close spatial and temporal proximity, owing to complex microphysical processes related to the thermal profile, heat transfer, hydrometeor size, and vertical velocity.
b. Reanalysis data and algorithms
The NARR is a gridded meteorological dataset that extends from 1979 to the present time. It uses the NCEP–NCAR global reanalysis (Kalnay et al. 1996) to drive the regional Eta Model, which has a horizontal grid spacing of 32 km and 45 vertical (sigma) layers. Observations, including hourly precipitation from rain gauges, radiosonde temperature, moisture and wind data, radiances, snow cover, and surface measurements, are assimilated into the model at regular intervals. The National Oceanic and Atmospheric Administration (NOAA) Earth Systems Resources Laboratory file transfer protocol portal (ftp://ftp.cdc.noaa.gov) served the variables required to estimate precipitation type. They included 3-hourly temperature (surface/2 m and aloft), precipitation, precipitable water, geopotential height, and surface air pressure. Data were extracted for the domain encompassing 25°–40°N and 110°–89°W (Fig. 1) and were regridded using bilinear interpolation from the NARR-native curvilinear grid (2D latitude/longitude coordinates) to a rectilinear grid of approximately 30 km using the NCAR Command Language (NCL). Available NARR data had 29 vertical pressure levels, with 16 in the lowest 500 hPa. Intervals of 25 hPa between 1000 and 700 hPa and 50 hPa between 700 and 500 hPa were linearly interpolated to 20 hPa up to the 600-hPa level and to 25 hPa thereafter to 500 hPa, increasing the density of levels available for the precipitation-type calculation.
Freezing precipitation was first obtained directly from NARR output (hereinafter “NR-Cat”). The NARR contains estimates of precipitation type, including snow, freezing rain, and ice pellets. The calculation of precipitation phase in the Eta Model is based on the Baldwin and Contorno (1993), or NCEP, algorithm. These estimates are expressed as the categorical presence of a given precipitation type for each output time step over a given reforecast interval, with values between 0 (no) and 1 (yes). A value of 1 implies that every output time step over a period predicted the given precipitation type. For 3-hourly data, this value is typically ~1 if a given precipitation type was predicted; however, for categorical freezing precipitation (the sum of categorical ice pellets and freezing rain), a lower threshold of 0.1 was applied, capturing the majority of instances when the model simulated freezing precipitation.
Separate estimates for freezing precipitation were calculated as the mean of three precipitation-type algorithms (hereinafter “NR-Alg”): 1) partial thickness, 2) that of Bourgouin (2000), and 3) that of Baldwin and Contorno (1993). The selected algorithms were chosen on the basis of their ability to be computationally efficient and implementable using data with a variety of vertical resolutions. Use of multiple algorithms may be superior to relying on any one approach (Cortinas et al. 2002), and different algorithms may also perform better or worse in different subregions. A discussion of each algorithm used and some simple adjustments are provided in sections 2b(1), 2b(2), and 2b(3), and the algorithms and adjustments are summarized in Table 1.
For NR-Alg, winter precipitation, whether freezing precipitation or snow, was assumed when the given algorithm/categorical-precipitation-type (NARR) condition was satisfied, surface temperatures were <0.25°C, and precipitation accumulation was ≥0.1 mm in the prior 3 h. The surface air temperature was permitted to slightly exceed 0°C to account for possible model surface warm bias, observed in NARR during some freezing-precipitation case studies—for example, 9–11 December 2007 (Mullens 2014). Conversely, for NR-Cat, the surface temperature threshold as defined in the Eta Model NCEP algorithm was not adjusted.
To identify possible bias in NARR 2-m temperatures, daily mean temperature Tave, maximum temperature Tmax, and minimum temperature Tmin were obtained and were compared with a gridded observational dataset developed by B. Livneh and colleagues and described in Livneh et al. (2013). The Livneh data were interpolated to the NARR grid, and monthly mean differences (1979–2011) were extracted for temperature and number of days with temperature ≤0°C, shown in Figs. 1 and 2 of the online supplemental material. NARR Tmin was warmer than that of the Livneh method for all months evaluated, but Tave and Tmax showed smaller and more variable differences, but with warm biases in portions of the southern high plains and Texas and cool biases for northeastern Oklahoma and into Arkansas and Missouri. Calculation of the mean temperature difference for all station sites in Fig. 1 (not shown) revealed NARR to be nearly 2°C warmer on average for Tmin but closer to 0.15°–0.35°C for warmer Tave and Tmax, respectively. The selected surface temperature threshold for NR-Alg reflects the mean of the latter. The usefulness of this threshold adjustment will vary by region. For the southern subdomain, despite some warm temperature bias, the number of days below freezing showed negligible to slightly positive differences for NARR for Tmax and Tave (but a large underestimate for Tmin). All regions showed seasonal variation in the magnitude of these differences. The potential influence of disparate observed versus modeled (2 m) measurement heights was not explored.
Precipitation accumulations of ≥0.1 mm were used as the threshold for precipitation versus no precipitation, because probability distributions of NARR versus Climate Prediction Center unified gauge gridded observations (24-h accumulation) identified that NARR tended to overproduce precipitation below this threshold (not shown). In addition, NARR can underestimate the magnitude of heavy-precipitation events. Although the representation of precipitation benefits from data assimilation, the grid spacing does not fully resolve subgrid-scale convection (e.g., Becker et al. 2009; Bukovsky and Karoly 2007).
1) Partial thickness
The partial-thickness method (e.g., Cantin and Bachand 1993) empirically identifies critical thickness criteria for two layers, 1000–850 hPa (TH850) and 850–700 hPa (TH700), that support frozen, freezing, or liquid precipitation. Critical thicknesses are derived from radiosonde soundings and are often region specific. The method is typically ineffective for higher elevations (e.g., >1 km above sea level), where the majority, if not all, of the lowest layer intersects the surface (Bourgouin 2000).
Critical thicknesses for the south-central United States (online supplementary Fig. 3) were established from 135 soundings (35 with snow, 27 with ice pellets, and 72 with freezing rain) from eight station locations. A well-defined cutoff between freezing and frozen precipitation occurs near TH700 = 1540 m. Pronounced similarities in thickness values for ice pellets and freezing rain complicate any delineation of these phase types using this method, however. For TH850, freezing precipitation (snow) was most common above TH850 = 1250 m. Critical thicknesses are summarized in Table 1.
In the elevated western portions of the domain (e.g., New Mexico, western Texas, and the Oklahoma Panhandle) where the method is unreliable, an elevation adjustment was developed that estimates freezing precipitation and snow using a set of simple conditions that are based on the surface station pressure at each grid point (Table 1). If the surface pressure Psfc ≥ 900 hPa, then critical thicknesses were used. If Psfc was between 900 (Plower) and 850 hPa (Pupper), then the 800–850-hPa layer must have at least one level with a wet-bulb temperature Tw ≥ 1°C to qualify as freezing precipitation. Even a shallow layer above freezing may promote partial melting of a frozen hydrometeor (Stewart and King 1987), but 1°C rather than 0°C was used to better discriminate freezing precipitation from frozen precipitation. For snow, each 50-hPa layer between 850 and 700 hPa must exhibit maximum Tw ≤ 0°C. The above approach is replicated for surface pressures up to 700 hPa, and the layer intervals are adjusted proportionately. The use of wet-bulb temperature is typically a better measure than air temperature, because the latter may give misleading results in dry layers where evaporational cooling could reduce ambient wet-bulb temperature to below freezing, with implications for hydrometeor size (because of evaporation) and phase.
2) Bourgouin (Bourgouin 2000)
The Bourgouin algorithm calculates areas of lower-tropospheric positive and negative energy from all pressure levels above the surface (derived from soundings or numerical models). The areas are determined from the equation given in Table 1. These energies are proportional to the mean temperature of a layer above or below freezing and the depth of that layer. Hydrometeor phase is assumed to be sensitive to both of these factors, with melting likely with prolonged residence time in above-freezing conditions. The terminal velocity of each hydrometeor is assumed to be constant, as is the vertical velocity. The total positive and negative area energies are the accumulated sum over all warm and refreezing layers in the lower troposphere. Phase type was determined from criteria given in Table 1 [modified from Bourgouin (2000)], and for all conditions for which both the positive areas (PA) and negative areas (NA) exceed zero. The method was adjusted here to calculate PA and NA using the average wet-bulb temperature of the layer (rather than air temperature) for reasons mentioned previously. Snowfall was categorized by a total column PA < 13 J kg−1 and surface air temperature < 1°C, including the possibility of snow and mixed rain/snow.
3) Baldwin and Contorno (Baldwin and Contorno 1993)
The Baldwin and Contorno scheme uses a decision-tree approach to determine precipitation phase. First, the scheme identifies the highest level (at least 70 hPa above the surface) in the lower troposphere that is near saturated, defined as a dewpoint depression ≤ 6°C. The initial phase of the hydrometeors is assumed to be ice crystals if the temperature at this level is less than −4°C and to be liquid if the temperature at this level is greater than −4°C. Second, the algorithm subsequently determines the total magnitude of warm (melting) and cold (refreezing) layers by summing the depths of the individual layers multiplied by their respective average wet-bulb temperatures. When the initial hydrometeor phase type is ice crystals, the warm and refreezing layer magnitudes are calculated with respect to a baseline −4°C isotherm. Snow is anticipated when this area is less than 3000°C m, and values exceeding 3000°C m suggest freezing precipitation or rainfall when the surface temperature is above freezing (0.25°C herein). For layer areas closer to the surface, a baseline of 0°C is typically used. To avoid incorrect classification of snow to freezing rain in deep near-0°C isothermal layers, the algorithm was modified to include a counter that determines how many levels between the surface and 500 hPa have a wet-bulb temperature above 0°C, Snowfall is output when no levels meet this criterion and the area is greater than 3000°C m and is also output in all cases in which the area is less than 3000°C m. Freezing precipitation is output when the algorithm calculation calls for both ice pellets and freezing rain, and with at least one vertical level with wet-bulb temperature >0°C. Additional criteria are summarized in Table 1.
Observations were from 13 first-order NWS stations in the domain (shown in Fig. 1) with subdaily records extending from 1979 to 2014. Three-hourly weather-type observations and 1-h accumulated liquid water equivalent (LWE) data were extracted from NCEI monthly local climate summaries for November–April of each year (Local Climatological Data Publication; http://www.ncdc.noaa.gov/IPS/lcd/lcd.html), and ice counts and LWE were compiled for each month. Snowfall observations were available, but at this stage the algorithm-derived snowfall was not evaluated because it was not the primary focus of this research. LWE served as a proxy for ice accumulation since these data sources did not include regular estimates of freezing-rain and ice-pellet accumulations. The accumulation of ice on roadways and structures is a function of the surface itself and includes factors such as air and surface temperature, wind speed, precipitation rate, and surface heat fluxes. Warmer near-surface temperature and higher precipitation rates can limit ice accretion (Jones et al. 2004). Ice loading on elevated surfaces may be modeled using equations from Jones et al. (2004) or Pytlak et al. (2010). Although LWE is a good indicator of the potential severity of a freezing-precipitation event, two events with the same LWE may not have the same impact on vegetation and infrastructure.
The estimation of 3-hourly and daily ice counts by month involved determining each occasion of freezing rain (abbreviated by NWS as FZRA or ZR), freezing drizzle (FZDR or ZL), ice pellets (IP or PL), or a mixture thereof. When one of these phase types was indicated, it was counted as one 3-h period, and the presence of one or more occurrences in a 24-h period (ending 0000 UTC) counted as one ice day. These counts were then summed by month, by cool-season total (November–April), and by boreal winter [December–February (DJF)]. Because LWE was estimated hourly whereas weather type was logged at 3-h intervals in this dataset, simple objective criteria were employed to ascribe LWE to phase types. LWE accumulation was calculated as follows: 1) If ice is observed at time t, sum the prior three hours of LWE and 2) exclude LWE hours with temperatures above 33°F (0.5°C). In addition, in some cases in which ice was the only precipitation logged on a given day and temperatures were consistently near or below freezing then the daily total LWE was assumed to be ice. The major assumption in this estimate was a constant precipitation type during the prior three hours before the observation. This assumption breaks down during periods of rapid phase transitions but remains relatively robust for multihour, long-duration ice events. The dataset was chosen because the temporal interval was consistent with NARR; nonetheless, it was recognized that alternate datasets may provide better estimates of ice LWE (e.g., hourly surface NWS weather-type observations; Blunden and Arndt 2011).
The method used to estimate LWE accumulations was assessed by comparing the above approach with an alternate whereby LWE was accumulated over the three hours following an observation of ice (online supplementary Fig. 4). Differences were typically small, on the order of 0–0.1 in. (1 in. = 2.54 cm), but with a few exceptions, implying some degree of uncertainty in observed estimated LWE, which should be borne in mind with the validations shown in section 3.
Because the climatological statistics of freezing-precipitation frequency vary across the domain, station sites were aggregated for three subdomain regions, denoted R1–R3, as depicted by the color coding in Fig. 1. Station groups were delineated on the basis of the degree of covariability, the observed seasonal climatological behavior at each station, and the cross correlations of the seasonal time series of a given site with each other site (not shown). The orientation of these groups is representative of the typical orientation of the axis of freezing precipitation during winter storms in this region (e.g., Sanders et al. 2013).
The choice of algorithms to apply (NR-Alg) reflected a need to use computationally efficient techniques. Other alternate algorithms may be superior in distinguishing among phase types, but the consolidation of freezing rain and ice pellets for this work mitigates the requirement for a more sophisticated approach. In addition, use of the Baldwin and Contorno algorithm was replicated in NR-Cat, although differences are likely to arise between the two because of our modifications, any updates to the specific form of the algorithm in the Eta Model, and the direct application of the algorithm in the Eta Model at each time step versus our use of precipitation-type identification on the output data. Also, some computational uncertainty may have been introduced as a result of the conversion of Fortran-based algorithm code into NCAR Command Language.
In addition to the caveats resulting from estimating LWE, human error may be introduced through both the collection and the aggregation of the data. To mitigate the latter, the calculation of monthly ice counts and LWE was repeated three times to check for consistency of estimates. For the former, however, the switch from human to automated observations was apparent at each station, occurring at varying times between December 1992 and January 1994. The station was resituated, typically close to its original site, and this move, along with changes to instrumentation, yielded some inconsistency in the record (e.g., Schrumpf and McKee 1996). An apparent artifact of this change was a reduction in the number of freezing-drizzle observations, particularly in the western subdomain where freezing drizzle occurs more frequently (Cortinas et al. 2004). It is plausible that the human observer could note freezing drizzle when it was insufficiently heavy for an automated sensor to measure.
a. Multialgorithm differences
Differences in the climatic annual average freezing precipitation between individual algorithms in NR-Alg and NR-Cat are shown in Fig. 2 for 3-hourly counts and in Fig. 3 for LWE. Both figures also include the distribution of annual average counts and LWE, respectively, for each approach (Figs. 2f and 3f) for stations grouped into subdomains R1–R3. In general, the algorithms do not vary greatly from each another. Greater discrepancies would be expected if freezing rain and ice pellets were evaluated separately (e.g., Wandishin et al. 2005). The largest difference is in a reduction in annual counts with NR-Cat. In the northeastern region where freezing precipitation is most frequent, NR-Cat counts are typically 30%–50% lower than those for NR-Alg (Fig. 2e). In contrast, NR-Cat and NR-Alg show similar LWE, also maximizing in the northeast. The reason for the lower counts in NR-Cat requires further study, since in many cases the NCEP algorithm can overrepresent freezing-precipitation frequency (Wandishin et al. 2005). The Baldwin and Contorno algorithm applied separately tended to provide similar-to-higher counts and LWE than the multialgorithm mean (Figs. 2b and 3b). We suspect that these differences may, in part, result from the variations that were discussed in section 2d(1). The Bourgouin (2000) algorithm is generally more conservative in its estimates of freezing precipitation (Figs. 2c and 3c), whereas the partial-thickness method (Figs. 2d and 3d) increased counts over portions of the north and east. This method also overproduced freezing precipitation in the Rocky Mountains, which suggests that the elevation adjustment is limited for complex terrain. Differences between each method were typically smallest in central/southeastern Oklahoma and northern Texas. Further work would be needed to establish the degree to which each individual algorithm captures and improves the year-to-year magnitudes and variability of freezing precipitation as compared with the full ensemble (NR-Alg and NR-Cat).
b. Temporal correlations
We examined the ability of the NR-Alg and NR-Cat to replicate the temporal variability of the observations by month, boreal winter (DJF), and cool season for each of the 13 station sites by using simple linear Pearson correlation. The results are depicted by color-coded tabulations (on the basis of strength of correlation) in Fig. 4 for 3-hourly counts and in Fig. 5 for LWE. This measure indicates the degree to which the observations and NARR-derived data covary. Observed outliers were assumed to represent ice counts and/or accumulations associated with significant ice storms. For all station-to-NARR-grid comparisons, described here and in later sections, small spatial displacements are permitted, by defining a small spatial neighborhood that consists of the nearest latitude/longitude point to the station site ±0.3° (approximately 1 grid point in each direction in x and y coordinates). Using this neighborhood, two definitions are applied: 1) the “grid average,” (average ice count/accumulation over the neighborhood) and 2) the “grid maximum,” (maximum ice accumulation/count over the neighborhood). The latter estimates the maximum NARR value proximal to the observation station site. This is particularly useful for freezing-precipitation zones that are narrow and/or convective, where the grid average would penalize against slight spatial displacement in NARR.
For counts (Fig. 4), NR-Alg performs best during December and January—typically the climatological peak months for freezing precipitation. More than 50% of the sites exhibit strong (≥0.7) cold-season and boreal-winter correlations, including locations such as Oklahoma City, Oklahoma; Tulsa, Oklahoma; and Springfield, Missouri (R2), and the southern portion of R1. Correlations are more mixed for R3, although an interesting result is that highly infrequent late-winter (March) ice counts appear to be well represented in this subregion. For many stations, correlations for NR-Cat are lower than those for NR-Alg. These differences were most evident during the early and late cool-season months for stations such as Wichita, Kansas; Dallas, Texas; and Tulsa. Nonetheless, in most cases, these differences were not statistically distinct, as determined by a two-tailed unpaired test of correlations. Boreal-winter correlations degraded for Amarillo, Texas; Dallas; and Waco, Texas, and improved for Oklahoma City and Fort Smith, Arkansas, using NR-Cat.
For LWE (Fig. 5), correlations of both NR-Alg and NR-Cat with observations were generally stronger for most stations when compared with the same months and seasons for counts. The majority of R1 and R2 stations had strong correlations through the peak of boreal winter. Differences between NR-Cat and NR-Alg showed mixed results, with only small and insignificant changes for most station sites. In addition, for both counts and LWE the grid maximum and average assumptions for ice did not yield notable differences in correlation. The results suggest that both NR-Alg and NR-Cat generally have comparable variability on aggregate over multimonth periods but diverge (particularly in counts) for some sites during individual months.
c. Subdomain climatic behavior
1) R1 (northwest)
Figure 6 shows the R1 time series of 3-hourly counts (Fig. 6a) and LWE (Fig. 6c), annually by boreal winter, in comparison with the observations for NR-Alg only (time series for NR-Cat are provided in online supplementary Figs. 5–7 for R1–R3); Figs. 6b and 6d show the box-plot distributions of all data points in R1 for observations, NR-Alg, and NR-Cat. The time series observations of 3-hourly counts are aggregated into total freezing precipitation, consisting of observed freezing drizzle, ice pellets, and freezing rain, and into freezing precipitation excluding freezing drizzle. For both variables, NR-Alg grid maximum and average freezing precipitation is a good reproduction of the observed interannual variability in total freezing precipitation. This is particularly true for LWE (correlation coefficient squared R2 = 0.82). NR-Cat also reproduces the time series but with reduced magnitude and a slightly lower R2 (0.79). In some years, the grid-maximum approach provides a better magnitude, and in other years the grid average is better—potentially a result of the swath and orientation of freezing precipitation with respect to a given station. The decreasing trend in annual 3-h-period counts for observed total freezing precipitation (including freezing drizzle) is likely related to a switch in instrumentation between 1992 and 1994, as discussed in section 2d. By comparison, there appears to have been little trend in the concurrent annual frequency of freezing precipitation excluding freezing drizzle, as well as the magnitude of LWE (Fig. 6c). The distributions of all nonzero occurrences of these variables accumulated over each year (November–April) and station (Figs. 6b,d) suggest that the NR-Alg (and NR-Cat in particular) grid average on aggregate underestimates peak counts and LWE while the grid maximum produces more representative magnitudes and spread. The magnitudes of NR-Alg yield better results than did NR-Cat for R1, particularly for counts.
The monthly and seasonal climatological results (1979–2013) in Fig. 7 reflect some of the observed biases in NARR-derived ice statistics. To assist the interpretation, bootstrapped confidence intervals (significance p value = 0.05) from 500 replications of the sample mean are depicted as error bars. When compared with observed counts (Fig. 7a), the grid-maximum assumption best represents the mean magnitude for both NR-Cat and NR-Alg, with the smallest bias for the latter. No month or season is statistically distinct from the observed counts, with the exception of cool-season and boreal-winter frequencies for grid-average NR-Cat, which are significantly lower than observed. For LWE (Fig. 7b), the grid average of NR-Alg gives the best magnitudes for each month.
2) R2 (central)
The R2 time series and distributions of monthly and seasonal climatological results are shown in Figs. 8 and 9. Once again, the time series of counts (Fig. 8a) from NR-Alg (and NR-Cat, not shown) reproduces the observed variability and trends. The R2 values remain high and increased relative to R1 for both NR-Alg and NR-Cat counts while remaining similar for LWE. Nonetheless, this covariability does break down in some years, 1981–85 and 1998, for example. The latter case may be related to NARR simulation of freezing drizzle, because NR-Alg more closely follows the magnitude of freezing-rain and ice-pellet counts. In R2, the grid maximum overestimates counts in most years. The LWE estimates (Fig. 8c) show a better fit in magnitude and replicate the observed (although not statistically significant) increasing trend and high interannual variability, particularly after 2000. The LWE grid maxima overestimate observed magnitudes for most years with generally low ice accumulations but match almost exactly the magnitudes for seasons with high ice accumulations, such as 1987/88, 2000/01, and 2007/08. The winter-season distributions for counts (Fig. 8b) show the grid average of NR-Alg and grid maximum of NR-Cat to have dataset ranges and interquartile ranges that are similar to those observed. For LWE (Fig. 8d), both NR-Cat and NR-Alg grid averages have similar distributions but slightly reduced interquartile ranges relative to observations. The grid-maximum approaches tend to overestimate spread in the distribution for both variables, particularly outlier (heavy-ice month) values using NR-Cat.
The monthly and seasonal climatological results (Fig. 9) reproduce magnitude and seasonality well for peak counts (Fig. 9a) and LWE (Fig. 9b) throughout the winter, notably for NR-Alg (grid average), although no month or season for either approach was significantly distinct from each other or observations. The grid-maximum assumption overestimated counts for NR-Alg and LWE for both NR-Alg and NR-Cat.
3) R3 (southeast)
The most marked result for R3 was the substantial overestimate of ice counts when compared with observations, particularly with the grid-maximum assumption (Fig. 10). This count bias was most pronounced prior to 1985 and during the winter of 2000/01, shown for NR-Alg (Fig. 10a). NR-Cat (online supplementary Fig. 5) reduced this high bias but did not remove it. The observations suggested typically low numbers of ice counts for most years and little difference between estimates that included and excluded freezing drizzle (with certain exceptions, such as 1981–85 and 1991), implying that the majority of R3 ice events take the form of freezing rain and/or ice pellets. There is no strong evidence of a trend in ice frequencies over the 35-yr record, but certain years exhibited large positive departures in ice frequency. One such example was the winter of 2000/01, when major ice storms impacted northeastern Texas through Arkansas on 12–13 December 2000, 24–28 December 2000, and late February 2001 (NCEI Storm Data). NR-Alg and NR-Cat showed a greater frequency of ice during this winter by a factor of 3 when compared with observations. This was primarily associated with NARR shifting the axis of heaviest freezing precipitation westward into the Dallas–Fort Worth metropolitan area during the 24–28 December event (e.g., Fig. 12, described below). Some of the discrepancy could also result from the observation being based on a singular temporal measurement, which may actually underestimate true duration, particularly if an observation was not recorded (which is a possibility with adverse conditions). In NARR, ice occurring between two forecast times can be identified from the accumulated 3-hourly precipitation between two time points in the presence of subfreezing conditions. Given this assumption, overestimates for ice frequency and magnitude are likely, possibly enhanced by allowing the surface temperature threshold to slightly exceed zero, which would have more substantial implications for precipitation in regions near the reanalysis 0° isotherm. Region R3 may be particularly susceptible because of its propensity for infrequent but long-duration ice events (e.g., Changnon 2003; Changnon and Karl 2003) in a location where the depth of cold air is often shallow. In contrast, the magnitude of NARR-derived LWE (Fig. 10c) is better constrained for the 2000/01 winter but is notably overestimated for 1981–85 (less so for NR-Cat). The meteorological conditions during that period and occurrence of freezing precipitation were not examined and remain a subject for further evaluation. The distributions of counts and LWE for stations within this subdomain (Figs. 10b,d) suggest that grid-average NR-Cat provides superior estimates of the spread and skew for both parameters when compared with NR-Alg.
Monthly and seasonal climatological results for R3 (Fig. 11) again show good reproduction of the observed seasonal timing of freezing precipitation, albeit with higher model-based counts (Fig. 11a) and LWE (Fig. 11b) in each case. The overestimated counts are statistically distinct from observations for cool-season and boreal-winter grid maximum. For LWE, NR-Cat, and NR-Alg, grid-maximum amounts are significantly greater than were observed for the cool season and boreal winter. The substantial overestimates using a grid-maximum approach imply that the typical morphology of simulated ice events may be spatially expansive and less sensitive to displacements in location. Broad storm morphology may be aided by abundant moisture and low-level warmth from the nearby Gulf of Mexico (e.g., Changnon 2003; Mullens et al. 2016). The grid average still markedly overestimates, implicating the reanalysis thermal and moisture parameters in the oversimulation of mixed-phase precipitation.
d. Spatial distribution
NARR-gridded freezing precipitation may be used in specific case studies and impact analyses, and therefore it is important to determine the degree to which this dataset can reproduce the spatial location and intensity of specific winter-storm events. A first test compiled a county-based map of ice coverage for two highly impactful winter storms—24–28 December 2000 and 8–11 December 2007—using reports from NCEI Storm Data, shown respectively in Figs. 12a and 12c. Storm Data archives notable weather events, including duration, NWS county zones, a qualitative description of impacts, and estimates of accumulations. We included counties that were affected by “glaze,” “ice storm,” “winter storm,” and “winter weather” when freezing rain or drizzle was mentioned in their description. Using the same events, an additional source of spatial information was the significant-event reports and graphics compiled from the affected local offices of the NWS. Reporting inconsistencies were possible in each case (e.g., Kovacik et al. 2010), but it is hoped that they are minimized by use of multiple sources and applicable keywords.
Figures 12b and 12d show the spatial coverage of these two events derived from NR-Alg (color shaded) and NR-Cat (black line). All grid points with at least one 3-h interval with ice were included. The observed LWE accumulations from station sites and estimated from NWS reports were overlaid, schematically representing the extent of freezing precipitation. Comparison of observed and NARR-derived spatial coverage showed good agreement in both cases, but reanalysis spatial coverage was more expansive for the December 2000 event, and the axis of heaviest freezing rain was shifted west over northern Texas (over 2 in. in Dallas–Fort Worth vs 0.5 in. observed). Farther northeastward, the magnitude of freezing precipitation is apparently underestimated by NARR. The December 2007 ice storm shows excellent agreement in the spatial extent of icing, and NR-Alg better captures the possible presence of ice in the Texas Panhandle. The NARR appears to have overestimated the extent of ice over far western Kansas, underestimated the magnitudes over central Oklahoma (particularly NR-Cat), and overestimated the magnitudes over eastern and central Kansas (primarily NR-Alg).
The second test of spatial performance for NARR-derived data used mPING crowdsourcing data as the observation for winter precipitation during 2013. Collected by NOAA’s National Severe Storms Laboratory, mPING gathers data primarily through a freely available reporting application for cellular “smart” telephones (Elmore et al. 2014). The public can submit observations of precipitation type, date/time, and location. Despite the possibility of erroneous precipitation typing by participants, Elmore et al. (2014) noted during quality control that precipitation types identified by the public agreed well with professional observations, showing temporal and spatial consistency. Although it would be useful to consider additional cases from other years, substantially fewer reports were available prior to 2013. Using event-total freezing-rain and ice-pellet (and mixtures thereof) reports, we estimated the spatial extent of three winter-weather episodes during 2013 and overlaid this information with NR-Alg (Fig. 13, shaded) and NR-Cat (Fig. 13, contoured) spatial estimates. Note that, while an mPING report is evidence that freezing precipitation occurred, the absence of a report at a location for which NARR indicated ice does not necessarily invalidate the reanalysis.
Figure 13 shows good agreement in the location of freezing precipitation for both approaches. NR-Cat shows more southerly extension for the 5–7 December event (Fig. 13b) and is more spatially constrained (expansive) for the 20–21 December (10 April) event (Fig. 13a). In both cases, for Fig. 13a, the NARR-derived data fail to show freezing precipitation over central and southern Oklahoma. The error resulted from the more northward position of the freezing isotherm in NARR surface temperatures (not shown). Timing and spatial errors observed in mesoscale models can be exacerbated for events with fast-moving, shallow, and sharp arctic fronts (e.g., Leatham et al. 2010), which was likely the case for this event. Simulation of cloud cover and advection may also influence whether the simulated subfreezing layer is erroneously eroded (e.g., Lackmann 2011, 210–214).
From this subset of case studies, it seems that the spatial accuracy of NARR freezing precipitation varies by event. For those shown, however, all but one evidenced reasonable spatial agreement with observations, and the overall swath and axis of accumulations were well constrained, albeit with differences in peak magnitudes. Events of differing intensities would need to be further examined for a more complete evaluation of spatial representativeness.
To conclude, we constructed a spatial climatological average for the south-central domain that was based on the combined average of NR-Cat and NR-Alg, as shown in Fig. 14. NR-Cat occasionally produced spurious counts of freezing precipitation over the southern United States (not shown), and therefore for this figure NR-Cat data were set to missing in areas where NR-Alg showed no ice counts. The spatial climatological distribution of annual average counts (Fig. 14a) agrees well with prior studies, including Changnon (2004), Changnon and Karl (2003), and Schultz (2001), that interpolated in situ observations of freezing rain. The estimates of annual average LWE (Fig. 14b) cannot be readily compared, since other datasets do not include this variable. LWE is also of limited utility to some users because of its lack of simple transference to ice accretion. The ratio of ice to snow events (Fig. 14c) follows a similar general pattern to that of Cortinas et al. (2004), where freezing precipitation is increasingly the dominant, albeit infrequent, winter-weather type toward the Gulf of Mexico coast.
e. Heavy-ice events
A final test of the veracity of the dataset is in its ability to capture the extremes of freezing precipitation, which typically accumulate in economically damaging, and sometimes fatal, ice storms. To examine these extremes, daily (ending at 0000 UTC) ice LWE was estimated for each month in the top 20% of observed monthly total accumulations, and these magnitudes were compared with those from the same set of months from NARR (Little Rock, Arkansas, data were not included because of incomplete subdaily records of LWE). We assumed that damaging ice events were positively associated with higher LWE. LWE does not equate to ice accretion, however, and there were exceptions, such as an April 2013 ice event across portions of Oklahoma (Fig. 13c) that had only minor impacts despite having nearly an inch of LWE. Air temperatures were close to freezing, and seasonal warmth preceded the event, limiting ice accumulation on structures and roads. Differences in temporal precision and magnitude between observed and simulated events were identified by calculating 2 × 2 contingency-table metrics commonly used for forecast verification [e.g., the glossary of forecast verification metrics from National Weather Service (2007)]. For accumulation thresholds ranging from 0 to 1 in. day−1, correct simulations (“hits”), missed events (“misses”), and incorrect simulations (“false alarms”) of freezing precipitation were compiled for each station and are depicted as color-coded frequency-bias tables and hit rates in Fig. 15
. Because these events are among the rarest in the distribution, the sample size for a given station was typically low, particularly at the highest thresholds. Nonetheless, this approach can detect instances of over- or undersimulation of these events or displacements in their timing.
Oversimulation of freezing precipitation at the lowest threshold was evident (bias > 1.5), particularly in R2 and R3 in both grid-average and grid-maximum data, but these biases were reduced in NR-Cat relative to NR-Alg (Fig. 15). NARR, particularly NR-Alg (Figs. 15a,c), produced more days with freezing precipitation (often very light accumulation), leading to high false-alarm rates for some stations in addition to high hit rates. The best discrimination for the majority of stations was the 0.1-in. threshold. The grid maximum (Figs. 15b,d) also improved magnitudes of simulated events for most stations up to the 0.25-in. threshold. Above this threshold, individual ice-event magnitudes were generally undersimulated (bias < 1), and in some cases NARR produced events that were not reflected in observations or departed in magnitude significantly. Often (~50%), this was within 24 h of the observed event, where NARR spread accumulations over a preceding or succeeding day, indicating issues with timing (not shown). Sometimes, however, freezing-precipitation days in NARR were well separated from any observed event, potentially resulting from spatial displacement of a nearby event in the model. Toward the upper end of precipitation magnitude (>0.75 in.), fewer stations had any data, but, for those that did, the NARR struggled to capture the timing of these events (low hit rates). NARR-Cat (grid maximum; Fig. 15d) performed best in this category. NR-Cat outperformed NR-Alg for R1 and R2 in particular, in its linear association with observations (higher R2 values) as based on aggregating all stations in each region. Some of the more fundamental biases (timing, location) were intrinsic to NARR and were reflected in both approaches. Much of the improvement for NR-Cat was in the reduction of false alarms. The application of precipitation-type algorithms to more temporally coarse output data in NR-Alg may have resulted in overestimates of events associated with rapid temperature changes. NR-Cat avoids this issue because precipitation type is assessed directly within the model.
4. Discussion and summary
We performed a detailed evaluation of the performance of a multialgorithm method to resolve freezing precipitation using NARR data, comparing and contrasting its performance with several first-order station sites, spatial data, and NARR-categorical freezing precipitation. A multialgorithm method to derive freezing precipitation was applied in addition to validating the NARR-categorical estimates. The methods used, while adjusted in part to suit the NARR and its specific biases, were intended to be largely transferable to other model and reanalysis datasets. The use of more than one algorithm to form a “miniensemble” incorporates different ways of representing precipitation type. The most prominent differences between these methods were likely to be in the partitioning of freezing rain and ice pellets, which was not explicitly investigated. The NARR-categorical approach is available as part of the NARR suite of variables and had not been comprehensively evaluated, although a prior study (Blunden and Arndt 2011) set the framework for analysis that was presented here.
On the basis of our results, both approaches produce good agreement in the climatological normal, time series, and trends of freezing precipitation for 1979–2014. They also replicate the magnitude and spatial range of selected high-impact ice storms. Differences between the approaches were mixed and were generally insignificant, with no clear “winner.” This result implies that NR-Alg is able to reliably determine the presence of freezing precipitation, in most cases with accuracy that is similar to that of NR-Cat. The largest differences with respect to observations resulted from the reanalysis representation of surface temperatures and precipitation accompanying winter weather, as based on the consistent sign of the bias for all algorithms. This bias tended to yield underestimated ice counts and LWE in the northwest, making a transition to overestimates in the southeastern subdomain. The NARR-categorical data better resolved the magnitude of heavy-ice days, particularly in the northwest and central subregions. The multialgorithm mean represented the temporal variability of counts and their magnitude better for several stations, in particular in R1. The use of a specific form of this dataset will ultimately depend on the application. Categorical precipitation type is readily available via the Earth System Research Laboratory website. The data compiled for this work are likely to be made available via an Internet portal (under development).
It is notable that NARR does not resolve the freezing-drizzle maxima over the southern high plains (e.g., Cortinas et al. 2004; Schultz 2001). The microphysical mechanisms that support freezing drizzle often differ from those for freezing rain (e.g., Bernstein 2000; Rauber et al. 2000, 2001; Huffman and Norman 1988). Freezing drizzle can form through supercooled warm-rain collision–coalescence in the absence of a warm layer, produces little or no measurable precipitation, and is associated with weak vertical motion. The ability of the NARR Eta Model to capture these types of environments is limited; A. Nichols et al. (2015, personal communications) examined the performance of the NARR-categorical approach to several light-icing events and found very little skill. This dataset subsequently cannot be used reliably in analysis of freezing-drizzle environments or their climatological characteristics.
Future work should examine the performance of the individual algorithms used. This research would include sensitivities of the freezing-precipitation data to changes in certain parameters (e.g., surface temperature threshold or critical thickness) and also would look at a disaggregation of freezing precipitation to freezing rain and ice pellets. The planned application of NR-Alg will be to evaluate whether regional climate models, such as those of the North American Regional Climate Change Assessment Program (Mearns et al. 2009), are able to accurately represent the spatial and seasonal climatological statistics (compiled from NARR) over their historical period, with the intention to ascertain future projections of freezing precipitation in the south-central United States. This work has shown the utility of the multialgorithm approach, but its performance is ultimately constrained by model-simulated thermodynamics and moisture. Some of the applied modifications that were based on NARR biases may be counterproductive for other models, limiting the generalizability of some aspects of our method.
Other planned work includes utilizing the dataset to determine possible statistical relationships between large-scale circulations regimes and freezing-precipitation variability. The snowfall counts and statistics estimated by the NARR-algorithm and NARR-categorical methods will also be validated. It is expected that this dataset will be helpful for weather and climate research and for transportation, energy, and other sectors that require climatological information for hazard assessment and planning.
This work is supported by grants from the Southern Plains Transportation Center (SPTC-105336600), the Vice President for Research at the University of Oklahoma, and the South Central Climate Science Center (USO‐122799700). General information on the precipitation-type algorithms was obtained from guidance materials developed by the NWS Warning Decision Training Division. The authors acknowledge Addison Nichols, Paul Goree, and Meaghan Allen for their contributions to the spatial freezing-precipitation maps in Fig. 12.
Supplemental information related to this paper is available at the Journals Online website: http://dx.doi.org/10.1175/JAMC-D-16-0180.s1.