Lake-effect snow (LES) is a cold-season mesoscale convective phenomenon that can lead to significant snowfall rates and accumulations in the Great Lakes region of the United States. While limited-area numerical weather prediction models have shown skill in prediction of warm-season convective storms, forecasting the sharp nature of LES precipitation timing, intensity, and location is difficult because of model error and initial and boundary condition uncertainties. Ensemble forecasting can incorporate and quantify some sources of forecast error, but ensemble design must be considered. This study examines the relative contributions of forecast uncertainties to LES forecast error using a regional convection-allowing data assimilation and ensemble prediction system. Ensembles are developed using various methods of perturbations to simulate a long-lived and high-precipitation LES event in December 2013, and forecast performance is evaluated using observations including those from the Ontario Winter Lake-Effect Systems (OWLeS) campaign. Model lateral boundary conditions corresponding to weather conditions beyond the Great Lakes region play an influential role in LES precipitation forecasts and their uncertainty, as evidenced by ensemble spread, particularly at lead times beyond one day. A strong forecast dependence on regional initial conditions was shown using data assimilation. This sensitivity impacts the timing and intensity of predicted precipitation, as well as band location and orientation assessed with an object-based verification approach, giving insight into the time scales of practical predictability of LES. Overall, an assimilation-cycling convection-allowing ensemble prediction system could improve future lake-effect snow precipitation forecasts and analyses and can help quantify and understand sources of forecast uncertainty.
In the Great Lakes region of the United States, lake-effect snow (LES) is a common cold-season mesoscale phenomenon, often accompanied by intense snowfall rates and accumulations, which accounts for a large portion of the seasonal precipitation (Jiusto and Kaplan 1972; Niziol 1987; Kristovich and Steve 1995; Veals and Steenburgh 2015). LES occurs when moisture and sensible heat flux from a warm lake surface destabilize a cold Arctic air mass above (Kristovich and Laird 1998; Markowski and Richardson 2010), which can result in shallow convection. This convection takes many forms, but very often it develops into banded features characterized by the wind direction, lake orientation, and large-scale environment (Niziol et al. 1995). These bands result in sharp gradients of precipitation that prove challenging to forecast accurately at short- to medium-range lead times.
The synoptic environment is a significant influence on LES events; in the eastern Great Lakes, LES is commonly triggered during cold-air outbreaks following the passage of wintertime cyclones (Wiggin 1950; Niziol 1987). Some of the earliest forecast guidelines that incorporate the synoptic environment, such as the temperature difference between the lake surface and the 850-hPa level (Niziol 1987), were moderately successful at predicting the likelihood of an LES event. More recently, though, the shape, structure, and precipitation of these storms have been shown to have strong sensitivities to surface and smaller-scale features. For instance, surface wind speeds and lake fetch can very often predict the morphology of an LES event (Laird et al. 2003; Laird and Kristovich 2004). Ice cover strongly affects these storms by controlling heat fluxes from the lake (Cordeira and Laird 2008; Gerbush et al. 2008) and by modifying the low-level winds through frictional effects (Wright et al. 2013), changing the precipitation field. Regions of complex topography can affect LES events by altering wind fields and creating local orographic lift (Onton and Steenburgh 2001; Alcott and Steenburgh 2013). This can result in localized precipitation enhancement (Veals and Steenburgh 2015), although the mechanisms associated with this are still under investigation (Minder et al. 2015; Campbell et al. 2016). It follows that an accurate LES precipitation forecast needs to account for large-scale synoptic forcing, as well as local features and mesoscale variables.
As a result, regional mesoscale models have been successfully used for some time in real-data cases to simulate LES events (e.g., Ballentine et al. 1998), benefiting from advanced physics and higher model resolution. However, model error often hinders forecast accuracy. For example, Ballentine and Zaff (2007) observed a consistent southern bias in their simulations of eastern Great Lakes LES events that appeared to be linked to the choice of dynamical core. Different choices of model physical parameterization schemes, such as microphysics (Theeuwes et al. 2010; Reeves and Dawson 2013; McMillen and Steenburgh 2015) and boundary layer schemes (Conrick et al. 2015) show significant differences in the locations and amounts of precipitation in short-term LES forecasts. Thus, there is considerable forecast uncertainty associated with LES stemming from multiple sources that cannot be conveyed by a deterministic forecast alone.
Regional convection-allowing model ensembles have been shown to have skill in short-range forecasts of warm-season convective events (e.g., Clark et al. 2009; Schwartz et al. 2010), with the benefit of providing probabilistic guidance based on forecast uncertainties as well as the advantages of higher model resolution. The introduction of data assimilation to these ensemble forecasts has also shown to improve skill by reducing initial condition errors (Romine et al. 2013; Sobash and Stensrud 2015; Schwartz et al. 2015a,b). However, the optimal design of these ensembles is strongly dependent on the phenomenon simulated as well as the background forcing (Stensrud et al. 2000; Clark et al. 2008). Arnott (2010) concluded that an ensemble prediction system may be a useful forecast tool for a lake-effect event but that ensemble design should be more thoroughly explored. With this in mind, this paper designs and evaluates a regional convection-allowing ensemble for lake-effect snow, with the goal of improving forecast accuracy while capturing and understanding the sources of forecast uncertainty.
In the first part of this paper, several ensembles are designed and then the corresponding error, spread, and precipitation forecast characteristics are compared for a Lake Ontario LES event. In the second part, the impact of data assimilation is investigated in order to provide some insight into the predictability of LES, the forecast sensitivity of these storms to initial conditions, and the practical forecasting challenges of using an ensemble prediction system for this unique mesoscale event.
2. Data and methods
Lake-effect snowstorms have lifetimes ranging from hours to several days. To evaluate the predictability of LES at these time scales, we choose to simulate a relatively long-lived event that impacted the eastern shores of Lake Ontario and the Tug Hill Plateau region of upstate New York during 10–12 December 2013. The event initiated after the passage of a trough eastward across the lower Midwest and Great Lakes set up colder west-northwesterly flow over the eastern Great Lakes; a more detailed overview of the synoptic and mesoscale evolution of this event can be found in Campbell et al. (2016). This particular snowstorm was characterized by a long-lake-axis parallel (LLAP) snowband (Fig. 1) that produced snowfall from early on 10 December until early 12 December. It was a high-precipitation event that was well observed and documented, as it occurred during the field campaign part of the NSF-sponsored Ontario Winter Lake-Effect Systems (OWLeS; Kristovich et al. 2017) project. In situ observations at North Redfield on the western Tug Hill Plateau recorded average snow accumulations of 3–5 cm h–1 over several consecutive 6-h periods during the height of the storm and snowfall totals exceeding 80 cm (Minder et al. 2015; Campbell et al. 2016).
b. Simulations and ensemble design
The model used is the Weather Research and Forecasting (WRF; Skamarock et al. 2008) Model, version 3.7, with the ARW dynamical core. Three one-way nested domains are used (Fig. 2): a 27-km coarse outer grid that includes the entire Great Lakes region, a 9-km grid, and 3-km inner grid centered on Lake Ontario, all with 43 levels in the vertical with a model top of 50 hPa. Grell 3D cumulus parameterization is used on all but the innermost domain. Lake surface temperature lower boundary conditions are taken from the National Centers for Environmental Prediction (NCEP) 0.083° real-time global sea surface temperature analysis product. Lake Ontario had little ice at this time, so lake ice is not prescribed beyond that provided by the initial condition source. Forecasts are initialized 1200 UTC 9 December and run until 1200 UTC 12 December; as the most significant precipitation began on 11 December, this allows us to examine ensemble characteristics at longer (36+ h) lead times. Different ensemble designs, discussed below, are summarized in Table 1.
First, one ensemble (Table 1) is created using initial condition (IC) perturbations generated by the “CV3” background error covariance option in the WRF data assimilation (WRFDA) system (Barker et al. 2012). Here, initial perturbation amplitude and correlation length scales are related to climatological forecast error and variance derived using the National Meteorological Center (NMC) method (Parrish and Derber 1992), resulting in horizontal length scales O(1000) km. Perturbations are centered around initial and lateral boundary conditions derived from GFS deterministic forecast output. The WRF lateral boundary conditions are not perturbed in time. Physical parameterizations used in the model are the same for each of the 21 ensemble members and include the Thompson double-moment microphysics scheme (Thompson et al. 2008) and the MYJ boundary layer scheme (Janjić 1994).
Next, an ensemble is created by using perturbed initial and boundary conditions (IC/BC). While some methods exist for generating boundary condition perturbations for limited-area models (LAM) that vary in time and space (Torn et al. 2006), the simplest method is to use values from a global model ensemble. Here, each IC/BC WRF member receives initial and boundary conditions from one member of the NCEP Global Ensemble Forecast System (GEFS) for a total of 21 members. This incorporates different initial condition perturbations than the IC ensemble and provides flow-dependent boundary condition perturbations that evolve with time. All WRF members share the same physics configuration as the IC ensemble. It should be therefore noted that IC and IC/BC compare different methods of ensemble generation beyond the addition of perturbed boundary conditions.
Physics parameterizations can introduce model biases and errors as well. By varying one or more combinations of parameterization schemes, such as convective, boundary layer, surface layer, microphysics, and radiation schemes, an ensemble can be generated that can help understand the magnitude of the forecast uncertainty stemming from model error and lend insight into critical physical processes. These “multiphysics” ensembles have shown skill in forecasts of warm-season convective events, in some cases producing more spread in the short-term forecast than an ensemble with perturbed initial conditions alone (e.g., Stensrud et al. 2000; Clark et al. 2008, 2010). Model simulations of lake-effect events have shown strong sensitivities to the choice of microphysical and boundary layer schemes and combinations thereof (Reeves and Dawson 2013; McMillen and Steenburgh 2015; Conrick et al. 2015). In this PHYS ensemble, seven microphysics schemes [Goddard, WSM6, WRF double-moment 6-class microphysics scheme (WDM6), Thompson, Morrison, Milbrandt–Yau, and NSSL] and three boundary layer (with associated surface layer) schemes [Mellor–Yamada–Nakanishi–Niino 2.5 (MYNN2.5), MYJ, and Yonsei University (YSU)] are mixed to create 21 total members [see Skamarock et al. (2008) for information and references regarding physics schemes]. The initial and boundary conditions are taken from the same GFS deterministic forecast as the IC ensemble but are not perturbed. As a final component to this experiment, each unique physics combination is given to one member of the IC/BC ensemble to create a new “IC/BC PHYS” ensemble to examine any nonlinear contributions to ensemble spread.
c. Data assimilation
Data assimilation (DA) has been used for decades as a tool to improve numerical model forecasts by reducing initial condition error and involves statistically combining model forecasts and observations (Kalnay 2003). In particular, the ensemble Kalman filter (EnKF; Evensen 1994) is a DA technique that uses a flow-dependent background error covariance from a forecast model ensemble, which often improves the final analysis compared to other methods, such as three-dimensional variational data assimilation (3DVAR; e.g., M. Zhang et al. 2011; Miyoshi et al. 2010; Buehner et al. 2010), and has been successfully applied at global, regional, and even convection-allowing scales. The technique can also be used as a tool to understand forecast sensitivities, predictability time scales, and contributions of initial condition error to forecast confidence. In this experiment, after ensemble performance is evaluated and compared, one ensemble (from Table 1) is selected for EnKF to create a regional analysis system. This system will be used to lend insight into forecast and predictability questions that arise during this LES event.
The DA system used for this experiment is the Pennsylvania State University (PSU) WRF-EnKF system (Zhang et al. 2006; Meng and Zhang 2007, 2008a,b; F. Zhang et al. 2009, 2011; Weng and Zhang 2012). The filter used is a serial square root variant of EnKF (Whitaker and Hamill 2002). Ensemble perturbations are not inflated before assimilation, but ensemble analysis spread is relaxed to 80% of the prior spread. A fixed observation localization radius of influence is used to minimize spurious analysis increments as a result of undersampling (e.g., Greybush et al. 2011). The localization function applied to the background error covariance follows that of Gaspari and Cohn (1999) and uses a horizontal radius of influence of 500 and 1000 km for surface and upper-air observations, respectively. In the vertical, this radius is 15 model levels. Observations assimilated are those conventionally available from the NOAA Meteorological Assimilation Data Ingest System (MADIS), and include surface-, aircraft-, and radiosonde-based temperature, wind, pressure, and humidity observations. No radar products or satellite brightness temperatures are used. Observations from OWLeS are not assimilated in order to be used as an independent verification dataset. Details about the EnKF ensemble design and assimilation cycles are discussed in section 4a.
d. Verification datasets and metrics
Ensemble forecast performance is evaluated using several different metrics. The root-mean-square error (RMSE) of each ensemble is used to assess overall forecast accuracy. RMSE is evaluated using conventional observations from MADIS; specifically, surface and upper-air temperature and wind are examined because of their importance in LES morphology and forecasting. Ensembles are also compared with respect to spread characteristics, normally computed as the ensemble variance or standard deviation. Typically, spread is used as a measure of forecast confidence; for example, higher spread among ensemble members indicates larger forecast uncertainty. However, this measure can also be used to evaluate and compare forecast performance. An ideal ensemble should exhibit consistency (Wilks 2006), in which it correctly reproduces the actual forecast probability distribution. To reach this condition, mean RMSE should be roughly similar to the sum of ensemble spread and observation error (Desroziers et al. 2005). Likewise, as forecast uncertainty generally increases at increasingly longer lead times, so too should ensemble spread increase with forecast hour.
Quantitative precipitation forecast (QPF) performance is evaluated using both point and gridded observations. The NCEP/Environmental Modeling Center (EMC) Stage IV multisensor precipitation analysis (Lin and Mitchell 2005) is used for storm-total QPF evaluation across the spatial domain. Stage IV has limitations owing to the shallow depth of LES and poor detection of low levels from the local Montague, New York (KTYX), radar (Brown et al. 2007), as well as an underestimation of sometimes half the total precipitation during this event likely due to poor calibration of the reflectivity–precipitation rate relation (Welsh et al. 2016). However, Stage IV does provide some estimates of precipitation location, area, timing, and relative intensity with which to compare the spatial QPF.
Additionally, for this particular event, in situ manual snow depth and snow water equivalent (SWE) accumulation observations are available from OWLeS campaign researchers. Manual SWE observations at the North Redfield and Sandy Creek snow study stations were taken every 6 h, beginning late on 10 December and continuing until the end of the event, early on 12 December (Steenburgh et al. 2014a,b). These observations are used later to assess some aspects of the predictability of the sometimes intense precipitation documented on the Tug Hill during this event. Here, predictability is described as the ability to provide a confident and accurate forecast at a given lead time, in which an ideal ensemble forecast initially detects and then finally converges on a solution for an event; further discussion of predictability horizons for winter storms can be found in Greybush et al. (2017). The limits of predictability can be understood as having two parts: practical predictability, imposed by the limits of model and initial condition error; and intrinsic predictability, in which forecasts diverge even when given a nearly perfect model and the initial state is known nearly perfectly (Lorenz 1996; Melhauser and Zhang 2012). This study focuses on the practical predictability of LES precipitation, using a convection-allowing ensemble DA and forecasting system.
3. Ensemble forecast results
a. Ensemble design and performance
First, the free ensemble forecasts (no DA) are assessed for their performance in simulating the environment of this wintertime convective event. The time evolution of error (ensemble RMSE) and spread during the event for each ensemble is shown in Fig. 3. For clarity, spread is shown as ensemble standard deviation rather than total spread, as large observation error can mask important temporal characteristics. Forecasts are verified against conventionally available temperature and u-component wind observations, both critical quantities in being able to accurately forecast LES. Verification is separated between surface-level and lower-tropospheric (900–600 hPa) observations; the lower atmosphere is highlighted because lake-effect cloud is typically found at these levels.
In terms of error characteristics, no free run ensemble appears to perform consistently better than the others across all variables throughout the forecast period. Interestingly, all forecasts show sudden error growth in surface variables and winds around 0000 UTC 11 December, which coincided with an observed sudden enhancement in storm intensity and organization and the passage of a 500-hPa shortwave trough (Campbell et al. 2016). While there is some difference among error characteristics in the surface variables, that difference is also less clear aloft; the IC and PHYS ensembles appear similar, as do both ensembles with GEFS IC/BC. This is likely the result of the global model data used for initial and boundary conditions, as well as the choice of perturbations, which is highlighted in the ensemble spread characteristics below.
In terms of spread (Fig. 3), there are unique patterns among the ensembles. For instance, while IC was initialized with large variance, the ensemble spread rapidly declines during the event, unlike IC/BC, PHYS, and IC/BC PHYS. This collapse can be linked to the initial condition perturbations as well as to the use of unperturbed boundary conditions, the domain size, and the large-scale environmental forcing. The random initial condition perturbations introduced by the WRFDA CV3 option in IC (not shown) were of larger variance and had a different spatial structure than the perturbations in IC/BC; if inconsistent with the background dynamics, ensemble members could be expected to return toward the attractor (Lorenz 1996) and the perturbations would not be preserved in time (e.g., Gutierrez et al. 2008). This evolution points to the importance of introducing information about the “errors of the day” (Kalnay 2003) and tuning perturbation parameters accordingly for medium-range LES forecasts. Later, other limited-area model considerations became additional factors limiting the ensemble spread in IC. At the time of the event, a strong zonal midlevel jet with winds exceeding 50 m s−1 had moved into the domain, a synoptic situation common during LES. As a result, advective time scales across even the largest domain were shortened considerably. Back trajectories from the HYSPLIT model (not shown) show that tracers in the vicinity of the Ontario snowband had originated west of Lake Michigan, over 1000 km away, only 24 h prior. This strong boundary forcing meant that the homogeneous lateral boundary conditions from GFS could sweep through the domain (Warner et al. 1997), effectively wiping the initial condition perturbations, further retarding ensemble spread growth. These results suggest that for longer lead times beyond the short term, perturbed boundary conditions may be an important consideration in a limited-area model ensemble forecast during an LES event, and also indicate that large-scale flow from beyond the Great Lakes region plays an important role in defining the lake-effect environment and its uncertainty for lead times beyond one day.
While IC/BC, PHYS, and IC/BC PHYS retain and sometimes grow ensemble spread, there are distinct differences in both the vertical distribution and the temporal evolution between them. The three ensembles showed comparable amounts of forecast spread near the surface through the first 48 forecast hours, but this equivalence breaks down strongly in the lower troposphere. This is also shown in Fig. 4, which compares rank histograms (constructed as in Hamill 2001) using lower-tropospheric u-component wind observations over the forecast period. Ideally, a histogram should follow a uniform distribution (shown by the red line), indicating that the observations are predicted as equiprobable members of the ensemble (Wilks 2006). The root-mean-square deviation (RMSD) and chi-squared (X2) metrics are different measures of deviation from the expected value for each bin in a uniform distribution, and with larger values indicating increasing departure from “flat.” In the histograms, all ensembles show a peak in the lowest value, potentially indicating a high wind speed bias. However, the IC and PHYS ensemble histograms appear more right skewed and have much larger X2 values than IC/BC and IC/BC PHYS, suggesting that the latter two may have improved bias characteristics.
The IC/BC spread as a whole increases in time (Fig. 3), which is expected as forecast confidence decreases at increasingly longer lead times. Conversely, while the spread in PHYS is at some points initially comparable to IC/BC, there is minimal spread growth, especially away from the surface. The difference in ensemble spread characteristics between IC/BC and PHYS shown in both Figs. 3 and 4 is likely due to the conflation of several factors. The choice of physics perturbations (microphysics and boundary layer) in PHYS would have limited vertical scope in this case, as vertical mixing was capped by a strong wintertime inversion around 3 km above ground level; these perturbations would likely generate little variance above this level. Similarly, the synoptic signal largely controlled the evolution of this event; model simulations typically show less sensitivity to parameterization schemes when the large-scale forcing is strong (e.g., Stensrud et al. 2000). Thus, it was necessary to incorporate initial and boundary condition perturbations representative of the large-scale uncertainty to grow effective ensemble spread within our model domain during this event.
b. Precipitation evaluation
While the ensembles have differing environmental characteristics, it is not immediately clear whether their perturbations should lead to significant differences in the precipitation forecast. Figure 5 shows ensemble mean forecasts of liquid water equivalent precipitation on the east shore of Lake Ontario, accumulated from 0000 UTC 10 December to 1200 UTC 12 December. The Stage IV analysis is included as an estimate of the observed precipitation (Fig. 5f), and a deterministic WRF forecast initialized using the 1200 UTC 9 December GFS forecast is provided as reference (Fig. 5c). Overall, all forecasts have a precipitation field and a maximum that resemble the observed pattern, indicating that this was well predicted as an LES event. However, there are indeed differences. Most noticeable is the divide between the data sources used to initialize the ensembles: IC and PHYS have similar precipitation fields and carry the total precipitation maximum farther south than IC/BC and IC/BC PHYS. The IC/BC ensemble produces a spatially larger field of precipitation, and the addition of physics perturbations in IC/BC PHYS does not largely change this, although it does reduce the maximum value. While all ensembles forecast precipitation in the vicinity of the observed, it is unclear whether one ensemble QPF is significantly better or worse, owing to the limitations in the verification Stage IV product (particularly the underestimation bias observed during this event). However, it is apparent that initial and boundary conditions were influential in determining the distribution of the general shape and location of the ensemble mean precipitation field for this event.
The reason for the differences in the mean storm-total precipitation fields can be linked to ensemble dispersion. Figure 6 shows ensemble spread (standard deviation) in the storm-total liquid equivalent precipitation forecast. Most notable is the limited spread in IC (Fig. 6a), which would appear to be a confident forecast. However, PHYS (Fig. 6b) and IC/BC (Fig. 6c) illustrate the additional contribution to QPF uncertainty from sources not accounted for in IC. In PHYS (Fig. 6b), the largest values of spread are located near the ensemble mean maximum (Fig. 5b) on the western face of the Tug Hill Plateau, which could be the result of a bimodality among schemes regarding precipitation rate and spatial distribution in addition to disagreement about localized precipitation enhancement in that region. The results here from PHYS suggest that physics uncertainties still have impacts on the QPF by modulating precipitation intensity. In contrast, IC/BC (Fig. 6c) shows a wider swath of spread along the eastern lake shore as a result of differences in band location. There is an enhancement of spread south toward Oswego, New York, indicative of disagreement about the southernmost extent of the storm. The addition of physics diversity in IC/BC PHYS (Fig. 6d) captures the additional spread on the western Tug Hill that is apparent in PHYS (Fig. 6b) but does not largely alter the meridional extent of spread from IC/BC. Figure 6 shows that in terms of the QPF, both model error (sampled in the form of physics diversity) and synoptic-scale error contributed considerable uncertainty for this particular event.
4. Data assimilation and IC sensitivity
a. Ensemble selection and EnKF performance
Ensemble design is a critical concern for EnKF DA, as having incorrect or insufficient ensemble spread can strongly hinder the analysis. Using a global model ensemble for initialization provided ensemble spread growth coincident with a growth of large-scale uncertainty at longer lead times, which is important information for an ensemble DA system. PHYS provided consistent ensemble spread in environmental variables early in the forecast period, and it showed there is considerable uncertainty in how the model approaches QPF. However, it is difficult to investigate potential biases in multiphysics ensembles that may make physical interpretation of the forecasts more unclear, and members of a multiphysics ensemble may not be equally likely, which challenges assumptions made by EnKF. For these reasons, the IC/BC ensemble framework was selected to test the impact of regional EnKF DA and the influence of regional initial condition error on LES forecasts.
As background for the PSU WRF-EnKF system, 21 ensemble members are initialized at 1200 UTC 9 December 2013, drawing initial and boundary conditions from the NCEP GEFS. After an initial 12-h spinup, data are assimilated hourly starting from 0000 UTC 10 December until 1200 UTC 12 December. Observations within 30 min of the analysis time are included, and assimilation is performed on all three domains. While the domain and physics configuration are the same as in IC/BC, boundary conditions in this system are updated with those from the most recent GEFS forecast available at analysis time, which arrive every 6 h beginning 0000 UTC 10 December. As a comparison, we repeat this ensemble setup but instead do not perform updates with EnKF, allowing only the latest GEFS boundary conditions to update the ensemble (“BC update”) so that we can examine solely the impact of regional DA.
During EnKF cycling, the DA ensemble analysis generally shows reduced error at both the surface and the lower troposphere compared to the BC update ensemble (Table 2), indicating a positive impact. Improvements appear largest in surface variables, and although the impact is less substantial in the lower troposphere, comparisons using independent OWLeS radiosondes (Fig. 7) show improvements in temperature throughout much of the troposphere, including the reduction of a general warm bias (Fig. 7a). While EnKF improves near-surface wind errors when compared to these radiosondes (Figs. 7c and 7d), the effect is unclear aloft; less substantial changes here may be due to strong boundary forcing or the upper-air observation network, mainly consisting of aircraft data with prescribed observation error that may be artificially high (Benjamin et al. 1999). While prior work has shown improvements in analysis error by altering this value (e.g., Gao et al. 2012), this was not investigated during this experiment.
b. IC error and LES predictability
The predictability of this LES event is first considered by examining forecast uncertainty as a function of initial condition error and lead time. After every EnKF analysis cycle, a deterministic forecast is initialized using the analysis mean for initial conditions and taking boundary conditions from the control member of the most recent GEFS forecast available. These individual forecasts are then aggregated into a time-lagged ensemble, where each member represents a different initialization time. Here, predictability is evaluated as a number of members and thus lead times converge on a solution. Increasing ensemble spread would indicate reduced predictability resulting from initial condition error. To view precipitation forecast predictability, hourly accumulated precipitation on the Tug Hill Plateau is integrated for each member over a region shown by the box in Fig. 2. The temporal evolution from each forecast is shown in Fig. 8, illustrating differences in total precipitation depending on the forecast initialization time. Stage IV is included in Fig. 8 as an estimate of the temporal evolution rather than the absolute magnitude of precipitation.
First, we examine the start of the main organized LES event that occurred just after 0000 UTC 11 December. In Fig. 8, there is a convergence of solutions on the sudden rise in precipitation at this time, even from forecasts launched up to 24 h prior. The timing of the start of the event had considerably high predictability; however, older forecasts appear to initiate the event with a greater intensity. The end of the event around 0600 UTC 12 December had more limited predictability, as only forecasts initialized after approximately 0000 UTC 11 December capture the sharp decline. However, during the period from 0000 UTC 11 December through 0000 UTC 12 December, forecasts show considerable diversity depending on initialization time. The precipitation maximum around 1800 UTC 11 December is not well recognized by older forecasts, and forecasts initialized within 24 h prior to that event show large variation in the intensity of precipitation around this time, suggesting that predictability of the amount of precipitation during this period was limited. These results show that while this particular LES event may have been foretold at least 48–72 h in advance, variation in precipitation timing and intensity may have far shorter predictability time scales.
While this time-lagged ensemble is a useful tool, it does not necessarily display the confidence of any single forecast or the underlying, perhaps unknown, sources of uncertainty surrounding an event. In the limit of practical predictability, ensemble forecasts initialized closer to an event should grow increasingly more confident around the solution. To further examine this for our case, we initialize several ensemble forecasts from EnKF analyses at several lead times prior to the peak of the event on 11 December. Beginning 1200 UTC 10 December and continuing every 6 h through 1200 UTC 11 December, an ensemble forecast is initialized after the analysis cycle; each member draws initial conditions from its respective EnKF analysis and boundary conditions from the latest GEFS forecast. To explore the impact of the regional EnKF system, at each time a second ensemble is initialized using solely the latest GEFS for initial and boundary conditions without regional DA.
Figure 9 presents forecast 6-h accumulated liquid equivalent precipitation at Redfield (Fig. 9a) and Sandy Creek (Fig. 9b), New York, from these ensembles as compared to in situ observations from the OWLeS field campaign. The bar heights show the ensemble mean forecast, while the black error bars indicate the 90th (upper) and 10th (lower) percentile members. In each frame representing a 6-h accumulation period ending at the specified time, ensembles are grouped by their respective initial conditions: GEFS (blue) and EnKF (red). In each group, ensemble forecasts are ordered from oldest (left; earliest initialized 1200 UTC 10 December) to latest (right) and are included as they become available. BC update and EnKF analysis are ordered last; EnKF mean precipitation is calculated as the sum of the hourly forecasts between analysis cycles. The Stage IV estimate is given along with the observed, illustrating the large underestimation in the Tug Hill region during periods of more intense precipitation.
As seen in Fig. 9, during the middle of the event (periods ending 1200 and 1800 UTC 11 December), there is fairly consistent run-to-run QPF, and forecasts at Redfield (Fig. 9a) also appear to capture the trend in observed precipitation. Barring the spinup of the most recent GEFS-initialized ensemble, both GEFS- and EnKF-initialized forecasts appear to converge toward their respective analyses. EnKF forecasts perform better early in the period (ending 0600 UTC 11 December) at Redfield but overforecast at Sandy Creek; however, there is sizable ensemble spread in all EnKF-initialized forecasts, indicating increased forecast uncertainty at this time. Most notably, during the storm maximum at both locations during the period ending 0000 UTC 12 December, no single forecast recognizes and no group trends toward the observed precipitation, which was upward of 15 mm at both locations. While the EnKF analysis mean most closely matches the observed at both locations, the most recent forecasts initialized at 1200 UTC 11 December tended to perform poorly.
To provide some insight into the considerable underforecast from the most recently initialized ensembles, Fig. 10 shows the ensemble probability of radar reflectivity exceeding 15 dBZ at 1900 UTC 11 December, a point during the precipitation maximum period. For brevity, we show only results from BC update, the EnKF analysis, and forecasts initialized 0000 and 1200 UTC 11 December from both GEFS and EnKF members. Here, it is partially explained why the EnKF analysis more closely matches the observations in Fig. 9: the EnKF analysis had a cluster of high probabilities located inside the observed (black contour) (Fig. 10d) and therefore was confident in the correct band location. The 1200 UTC EnKF member–initialized forecast also had high confidence, but it was shifted south relative to the observed, and the relatively narrow width of the band at this time likely resulted in a miss, explaining the underforecast. However, the 1200 UTC GEFS-initialized forecast (Fig. 10c) did have higher probabilities near the observed, yet also suffered from an underforecast. Part of this can be explained by band structure: individual forecasts initialized from GEFS (not shown) produced a double-banded storm structure, while forecasts initialized from EnKF analysis members tended to produce the single-band structure that was observed (Fig. 1). As an additional explanation for the precipitation underforecast, Fig. 11 shows the 90th percentile accumulated precipitation during the 6-h period ending 0000 UTC 12 December, which presents some of the largest values of precipitation these ensemble forecasts expect. As shown in Fig. 11c, even the most extreme values of precipitation from the 1200 UTC GEFS-initialized forecast did not reach the values observed on the Tug Hill Plateau. In fact, the EnKF forecast initialized at this time (Fig. 11f) did, but it was likely position error (Fig. 10f) that led to its poor performance.
As shown in Figs. 10d and 11d, the EnKF analysis appeared to more closely reproduce the intensity and location of precipitation that was observed (Figs. 9a and 9b). However, the 1200 UTC EnKF member–initialized forecast, as discussed above, performed poorly despite its relatively short forecast lead time. To further investigate this, the LES band is identified in simulated and observed radar imagery. This is accomplished by applying an edge detection algorithm to identify strong reflectivity gradients and classifying closed contours of these gradients as LES band objects. An example of an identified object is given by the black contour in Fig. 1. Once an object is identified, attributes such as centroid, length, area, and orientation can be calculated. Table 3 presents a comparison of performance between the EnKF and BC update ensembles using some of these metrics. Figure 12a shows simplified LES band objects, created as lines that approximately reproduce the length, central location, and orientation of the band in each member of the EnKF ensemble at the same time as Figs. 10 and 11.
While it was well forecast that an LES band would be present, the central location and orientation at this time had considerable forecast uncertainty. This is shown in Fig. 12b, in which the LES band at 1900 UTC 11 December is identified in the time-lagged ensemble initialized previously from EnKF ensemble mean analyses. Lines are colored by initialization time. Figure 12b shows that the final location, orientation, and even inland penetration of the band were largely dependent on the initial conditions. More importantly, only the latest forecasts initialized after 1200 UTC (dark red) reproduce the skewed and more northerly band that was observed. To understand why, the band analysis is repeated as in Fig. 12b but instead objects are colored by average low-level (approximately 900 hPa) wind direction in the vicinity of the band (Fig. 12c). Here, it is illustrated that at this forecast hour, differences in wind direction were very likely responsible for the position and orientation error. Bands oriented more parallel to the lake axis were embedded in a westerly flow, while a more west-southwesterly flow produced LES that was centered farther to the north and skewed toward the northeast. Together, Figs. 12b and 12c suggest that the information used to improve the EnKF analysis was made available only recently before the period, indicating the sensitive dependence on initial conditions and potentially limited periods of predictability.
5. Summary and conclusions
A regional ensemble analysis and forecast system was developed to help understand the sources of forecast uncertainty during an LES event that occurred in December 2013. Ensembles using perturbed initial and boundary conditions and varying physical parameterizations (representing regional-scale, synoptic-scale, and subgrid-scale model error) were constructed and evaluated based on their ability to reproduce the LES environment and precipitation as well as their ensemble spread characteristics. The large-scale environment during this event was a source of very strong forcing, typical of eastern Great Lakes banded LES events, and it was found that initialization from a global model ensemble, which represented the synoptic-scale flow-dependent uncertainties, provided better ensemble dispersion in the short- to medium-range (1–3 day) forecast while giving comparable error performance. Varying physical parameterizations among ensemble members added to forecast uncertainty, with the largest impact on precipitation amounts rather than location.
The regional EnKF DA analysis improved both environmental error characteristics and the short-term precipitation forecast as compared to in situ observations. This sensitivity to initial conditions led to some exploration of predictability aspects of LES. Using deterministic forecasts launched from EnKF ensemble mean analyses provided a time-lagged ensemble that showed precipitation sensitivity as a function of initial condition error, and it was found that LES has modes of variability that may have limited predictability time scales. Another investigation using ensemble forecasts launched during the period showed that LES band position, orientation, and intensity biases can be linked to errors in initial conditions, sometimes leading to poor forecast performance even at short lead times. Object-based classification of LES bands provided an additional means of objectively comparing band position, length, and orientation, and likewise illustrated sensitivity to forecast lead time and wind direction.
Forecasting lake-effect snow is notoriously difficult because of the sharp gradients of precipitation that result. Additionally, as shown in this study, small changes in initial conditions and environmental variables (e.g., lower-tropospheric wind) can lead to largely different precipitation forecasts, especially at point locations. While reducing initial condition error could improve LES forecasts, a singular deterministic forecast does not provide information related to this forecast uncertainty. For this case, we show that uncertainty may have been fairly large, owing to the mesoscale nature of the event and the impact of large-scale uncertainty.
Overall, an ensemble DA–forecast system shows promise in improving short-term precipitation forecasts and could provide useful forecast information during potential high-impact LES events. As a forecast tool, we have shown that probabilistic guidance should include the effects of the large-scale features while still permitting the convective-scale features that are inherent in LES structures, but other sources of uncertainty (e.g., model error in QPF, lake surface boundary conditions) could still be incorporated. As a research tool, it is able to provide some insight into LES processes, and the short-term variability within the storm should be further investigated, especially with other cases and LES band types. Future work will seek to use ensemble sensitivity techniques and data assimilation to better understand and quantify the scales and sources of error and model biases. New data sources, such as radar, should be included to examine the implications for short-term forecasts and to evaluate how best to improve and extend the forecast skill of this regional ensemble analysis and prediction system.
The authors thank Fuqing Zhang, George Young, and Daniel Eipper for their helpful advice and discussion, and greatly appreciate the valuable comments from three anonymous reviewers. The authors are grateful to the Penn State Department of Meteorology and Atmospheric Science and the Institute for CyberScience for the computational resources used in this study, and to the NSF-sponsored OWLeS campaign for observational datasets.
This article is included in the Ontario Winter Lake-effect Systems (OWLeS) Special Collection.