The scope of this study is to assess a combination of well-known techniques for bias reduction and spatial interpolation in an attempt to improve wind speed prediction for storms on a gridded domain. This is accomplished by implementing Kalman filter (KF) for bias reduction and universal kriging (UK) for spatial interpolation as postprocessing steps for the Weather Research and Forecasting (WRF) Model. It is shown that for surface wind speed, a linear KF is adequate for eliminating systematic model errors with the available storm history. KF-estimated wind speed biases at station locations are then interpolated across the model domain using UK. The combined KF–UK approach improves the wind speed forecast median bias by 55% and RMSE by 15% (bulk statistics), while benefits obtained at station-specific locations can reach maximum improvements of 72% for RMSE and 100% for bias. Contingency statistics that inform on model performance over four categories of wind speed magnitude reveal that calm/moderate winds are successfully corrected but strong/gale winds cannot be adequately corrected by the combination of KF and UK, which is a disadvantage for improving prediction of severe storm conditions.
Despite constant improvement, numerical prediction of atmospheric processes, and especially extreme storm conditions, is affected by imperfect initial and boundary conditions, surface properties (topography, vegetation, soil moisture, and soil type), subgrid-scale phenomena, and planetary boundary layer representation (Pleim 2007; Mass et al. 2008; Hu et al. 2010; Wyszogrodzki et al. 2013; Frediani et al. 2016). Extreme storms are storms characterized by intense precipitation and pressure gradients and strong winds such as nor’easters, tropical storms, blizzards with strong sustained winds, thunderstorms as well as frontal systems that bring similar types of weather patterns. Accurate prediction of such storms is needed especially for the northeastern United States because of the impacts of strong wind speeds on the infrastructure and the environment, such as power outages caused by falling trees/branches and/or pole failure, among others.
There has been a long history of using statistical postprocessing techniques to improve Numerical Weather Prediction (NWP) such as modified Taylor–Kriging, kriging and vector autoregressive models, ensemble modeling, block-regression (Hamill 1999; Galanis et al. 2006; Louka et al. 2008; Roux et al. 2009; Cheng et al. 2013; Tse et al. 2014; Olaofe 2017, Liu et al. 2010; Fan et al. 2015; Scheuerer and Moller 2015; Zamo et al. 2016), mean bias removal (Hacker and Rife 2007; Wilczak et al. 2006), model output statistics (MOS) (Wilks and Hamill 2007; Glahn et al. 2009; Muller 2011), and Kalman filter (KF) (McCollor and Stull 2008; Rincon et al. 2010; Delle Monache et al. 2006, 2008, 2011). All of the mentioned studies deal with corrections of NWP models but not specifically on extreme storms, which is the main goal of this work.
The main objective is to assess the temporal and spatial error characteristics of the Weather Research and Forecasting (WRF) Model wind speed predictions for 107 storms that have impacted the northeastern United States, using a combination of KF for systematic error removal and universal kriging (UK) for spatial interpolation. KF has proven to be an effective tool for bias reduction of wind speed predictions (Galanis et al. 2006; Louka et al. 2008; Delle Monache et al. 2011). Bias is minimized with recursively updated weights generated from predictions and recent observations (Galanis et al. 2009). KF uses a short series of background information that is particularly relevant to the bias of predicted wind speed. Kriging is selected for spatial interpolation of estimated bias over other deterministic methods (trend surface analysis, inverse distance weighting, local polynomial, and thin plate spline) due to its successful application for daily mean wind speed in several studies (Luo et al. 2008; Joyner et al. 2015; Sliz-Szkliniarz and Vogt 2011; Zlatev et al. 2010). UK is specifically chosen due to the success in spatiotemporal models (Cellura et al. 2008; Qian et al. 2014). The application of combined KF and UK identifies advantages and limitations of the suggested approach for wind speed prediction under storm conditions.
The new aspect that this study brings to the scientific literature is the implementation of a KF–UK combined technique for surface wind speed prediction specifically for extreme storms that have impacted the northeastern United States from 2003 to 2014. This study addresses two wind speed prediction challenges: temporal model bias correction and adequacy of spatial data coverage given the limitation of observations in the modeling domain. Working with more than 100 storms during the 12-yr period is another unique aspect of this study in that it addresses the prediction of extreme storms and provides guidance on modeled wind speed error correction. For the remainder of the paper we will refer to the KF application as “KF bias corrected wind speed” and to the UK application as “kriged wind speed.”
Section 2 presents the observations and numerical weather prediction model simulations, section 3 includes the description of the methodology and statistical metrics employed for the evaluation of the results, section 4 includes discussion of the findings, and section 5 presents the main conclusions from this study.
2. Data and model simulations
Observations of wind speed at 10 m are obtained from the Meteorological Assimilation Data Ingest System (MADIS) (NOAA 2017; De Pondeca et al. 2011). MADIS ingests data from NOAA data sources and non-NOAA providers that undergo quality check (level 1 quality checks are considered the least sophisticated and level 3 are the most sophisticated; only level 2 or 3 data are used in this study with level 2 checking for internal, temporal and statistical spatial consistency while level 3 checks for spatial consistency). MADIS station availability over the WRF Model domain is shown in Fig. 1a. The use of MADIS data for WRF wind speed bias reduction has been seen in recent studies by Roux et al. (2009), Lorenzana et al. (2015), and Heath et al. (2016).
To maximize availability for the use of each filter, station data are analyzed for gaps of 5 h. If the gap is larger than 5 h from 0000 to 2300 UTC, the station is not used for the Kalman filter application or cross validation. If the gap in wind speed is less than 5 h the data are filled in with a linear interpolation between the two known endpoints, as recommended by the Environmental Protection Agency (EPA) (Atkinson and Lee 1992). Furthermore, Fig. 1a includes stations located over the ocean, which are not included in this study due to overall limited oceanic station coverage for the analyzed storms.
b. WRF simulations
Wind speed predictions are obtained from the WRF Model. WRF is a state-of-the-science mesoscale modeling system suitable for use in a wide range of atmospheric applications and scales, ranging from meters to thousands of kilometers (Skamarock et al. 2008; Michalakes et al. 2005). The WRF Model is applied to the northeastern United States to simulate past extreme weather events, with a total of 107 storms associated with high wind speeds, intense precipitation, and/or snowfall as well as tropical storms that affected the area during the period 2003–2014. Figure 1b displays the inner domain of the WRF predictions. Details regarding the WRF configuration are summarized in Yang et al. (2017) and also described herein. The model is configured with three horizontal grids at 18-, 6-, and 2-km grid spacing and 27 vertical levels up to 50 hPa. The National Centers for Environmental Prediction (NCEP) Global Forecast System (1° × 1°, 6-hourly intervals) analyses (NOAA/NCEP 2007) are used to initialize WRF. WRF utilizes the Thompson scheme for cloud microphysics (Thompson et al. 2008); Grell 3D scheme for convective parameterization (Grell and Dévényi 2002); Goddard for shortwave radiation (Chou and Suarez 1994); Rapid Radiative Transfer Model (RRTM) for longwave radiation (Mlawer et al. 1997); Noah for land surface scheme (Tewari et al. 2004); Yonsei scheme for the planetary boundary layer (PBL; Hong et al. 2006). The surface wind speed results for the inner domain (2 km × 2 km grid spacing) are used in this work.
Storm selection is based on the effects that those storms caused to the electric distribution grid infrastructure of utilities in the northeastern United States (from 20 outages to >15 000 outages). Storms are categorized using the probability distribution function of each atmospheric parameter (wind, temperature, pressure) and the ones selected exhibit strong winds, precipitation, and intense pressure gradients (thunderstorms, blizzards, nor’easters, and tropical storms) (Frediani et al. 2016). The duration of all storm simulations is 61 hours. The storm start and end time is decided in such manner that the simulation encapsulates the storm peak in the middle of the forecast timeline. The first 6 hours of the simulation are discarded as model spinup time.
a. Systematic error removal (KF)
KFs have been widely utilized in the literature as statistical post processing techniques to numerical atmospheric and oceanic prediction models for optimization and local adaptation of results (Crochet 2004; Galanis and Anadranistakis 2002; Galanis et al. 2006, 2009, 2011, 2017; Kalman 1960; Kalman and Bucy 1961; Kalnay 2002; Louka et al. 2008; Pelland et al. 2011). Having as main advantages the limited required resources in CPU time and memory as well as the dynamic adjustment to possible alternations of the prediction accuracy, KF provides a useful tool for the reduction of systematic biases in direct model outputs by combining recursive observations with recent forecasts.
The general Kalman filter algorithm simulates the evolution of forecasted time series (the state vector of the filter), which is associated with a known/observed second series . The evolution of the state vector in time is described by the system equation of the filter:
and the link between the observation and the state is given by the observation equation:
The matrices and are called the system and observation operators. The vectors and are independent Gaussian stochastic processes describing the white/nonsystematic noise-errors in the system and observation evolution. Therefore, the state and the measurement noise are assumed to have zero mean with covariances and , respectively. In particular, the matrix is diagonal since the observation error is assumed to be uncorrelated.
The Kalman gain coefficient is denoted i, which represents the relationship of state element uncertainty and measurement uncertainty. In the present work, the above general Kalman algorithm is utilized for the estimation of the bias of wind speed prediction as follows: given the initial numerical model output () at time ti and the corresponding observation value (obsi), the bias of the model is estimated as a polynomial function [Eq. (5)], which corresponds to Eq. (2) of the general Kalman algorithm:
The potentially nonlinear relation between bias and numerical model results is imposed by the nonlinear behavior of surface wind speed time series (Galanis et al. 2006). The coefficients (aj,i), j = 0, 1, …, n − 1, which are weighting the contribution of the linear () and the nonlinear terms () of the model output to the bias estimation in Eq. (5), form the state vector of the filter:
Additionally, the observation operator becomes , while the system matrix is the identity one resulting to a system equation of the form:
declaring that there are no a priory constrains for the evolution in time of the weighted coefficients of Eq. (5). This does not necessarily imply any steady state conditions for .
Because of the model’s need for self-evaluation in order to optimally predict the bias, initial values of parameters are specified for the first 24 h. This period is solely used to train the filter with the first 24 h of data. For the remainder of the available model–observation pair history, the KF continues training itself and applying the corrections to the corresponding forecast. The initial value of the state vector is consider equal to zero, assuming that the initial error of the forecast model is nonsystematic. This neutral value is imposed by the fact that no previous bias information is available and ensures that no risky preassumptions will be adopted. After very few integration steps the filter recognizes and estimates the bias.
The filter runs for a station’s time series of measured bias generated from observation–model pairs. The use of these extended time periods, apart from increasing the credibility of the statistical analysis, ensures also that the final performance of the proposed filters will not be seriously affected by the choice of initial conditions (Galanis et al. 2006). Since bias correction is additive, the KF bias estimate is given bounds in order to avoid negative forecast values similar to Delle Monache et al. (2006). The filter is able to account for the evolution of the bias at different times of the day (Delle Monache et al. 2006). Temporal updating of the Kalman algorithm occurs every time step for the entire station–WRF pair history for each location. Arrays of 0000–2300 UTC information will be described as “sets.” Please note that diurnal errors are not removed by the KF implementation.
The relationship between bias and model output in Eq. (5) is tested and results become significantly less stable when a second-order term is added. The duration and amount of instability increases further with a third-order term. Instability in second- and third-order bias functions is a result of the use of storm data for KF training. Similar higher-order applications of the KF to a history of model–observation pairs were conducted in Galanis et al. (2006) and Louka et al. (2008). Both of these studies ran the KF over a year’s worth of data with longer model forecast periods that were not specifically storms. Both of these studies utilize a third-order bias function for the application of sequential wind speed histories that follow the calendar year, as opposed to episodic-type of data of storm wind speeds that is used in this study.
b. Spatial interpolation (UK)
UK is applied to KF bias estimates and not directly to wind speed. The basic objective of kriging is to find the best linear unbiased estimate (BLUE) of field b(x) with n different observations in a field. Kriging interpolation at location x can be calculated using BLUE (Ryu et al. 2002):
where λi, i = 1, …, n, are weights obtained from statistical properties that the estimator must possess: unbiasedness and minimum variance that satisfy Eq. (9). The weights in Eq. (8) are determined according to either an exponential or spherical semivariogram model. Different forms of kriging require different first and second moment assumptions based on Eq. (8). The second term in Eq. (8) is included to demonstrate the difference in first moment property assumptions between kriging methodologies represented by the mean field . Universal kriging allows for trends in according to Eq. (10):
The UK mean field is determined by a regression calculated with p chosen functions [Eq. (10)] (Lophaven et al. 2002; Coakley et al. 2008). Note that p of the terms have been used to create the regression functions in the equation above, leaving np terms to describe deviations from the global trends (Cellura et al. 2008; Mardia et al. 1998). Coefficients βi, i = 1, …, p, are regression parameters corresponding to the regression function [Eq. (10)]. When first- and second-order regression as well as exponential and spherical semivariogram models were investigated, small changes were detected and no one combination was the best for both hourly and maximum surface wind speed (not shown). Thus, the results presented here are from the combination of regression–semivariogram that worked best for each application to reduce bias and error (for all hourly wind speed applications, results are shown using first-order regression and the spherical semivariogram model and for the maximum wind speed, second-order regression, and the exponential semivariogram model).
Because of the rather large area covered by a single grid point (2 km × 2 km), the UK algorithm cannot differentiate between very close stations. If two stations are less than 1 km apart, their time series for that storm are averaged. Note that the gridded kriged output matches the WRF resolution of 2 km. The UK algorithm used in this study is from DACE MATLAB Kriging Toolbox (version 2.0, 1 August 2002 Hans Bruun Nielsen Technical University of Denmark) and application to atmospheric data can be seen in (Coakley et al. 2008). By using bias estimates as inputs to Eq. (8), the UK algorithm analyzes the global anisotropic regression and local anisotropic spatial correlation behavior of the bias estimates hourly. The effects of adding spatial interpolation to temporal bias estimates is a topic of discussion in the sections that follow.
c. Description of the KF–UK combination
The steps to combine KF and UK are illustrated in Fig. 2. KF is trained at each station location starting with the first 24 h of data [T1; (W − O) → B in Fig. 2]. The KF is used to produce bias estimates to the corresponding station location (denoted B in Fig. 2) and improve the forecast of the next period 25–48 h (T2) on an hourly basis. Consecutively, the UK is applied to interpolate the bias estimates to the model domain at T2 and produce the expected bias, denoted E, in Fig. 2 for every hour of the forecast. WRF predicted wind speed W is added to the expected bias E to produce an adjusted wind speed forecast referred to as kriged wind R at all grid cells.
An example of the KF application (without UK) is shown for coastal Massachusetts at Boston Logan International Airport (KBOS station). Improvement of the filtered wind speed predictions is achieved after the KF is trained with the first 24 h. By the third set of 24-h data (72nd hour) the filter overcomes the systematic error from the initial conditions and removes the bias with improved consistency (Fig. 3).
UK performance is evaluated with a leave-one-out cross validation (LOOCV). The procedure involves removing each station from the hourly UK input independently (B in Fig. 2) and performing the kriging calculation to find the expected bias at that location (E in Fig. 2). The cross validation expected bias is directly compared to the corresponding estimated KF bias (B in Fig. 2) in order to determine how effectively UK reproduced the inputs. To accommodate the large amount of data, LOOCV is applied to 10% of the total storm sample size (eleven storms are randomly selected out of all the storms in the database). Figure 4 shows that UK interpolated biases are highly correlated to the original KF bias estimates at each station (linear correlation coefficient = 0.9) but UK introduces an underestimation of the KF bias, which is expected due to smoothing caused by spatial interpolation (UKbias = 0.67 × KFbias − 0.18; Fig. 4).
d. Statistical metrics
The effectiveness of the post processing methodology is quantified with traditional statistical metrics [bias and root-mean-square error (RMSE)] and contingency statistics, comparing the kriged wind speed (final product of the combined methodology; R in Fig. 2) with observed wind speed. RMSE is decomposed into bias and centered RMSE (CRMSE) to illustrate the systematic (bias) and random (CRMSE) component of the model error (Murphy 1988; Taylor 2001; Delle Monache et al. 2011). In this application, CRMSE does not change significantly (not shown) as the methodology primarily addresses systematic error correction and bias contributes more to the total error (RMSE). Mean bias and RMSE of WRF, KF bias-corrected, and kriged wind speeds are calculated according to Eqs. (11) and (12) for each station and storm. Obs is the observed wind speed and W the modeled wind speed. For evaluation of the entire population of all hourly pairs at all locations, median bias is measured:
Model performance for wind speed is also evaluated with contingency statistics (Wilks 2006; Lascaux et al. 2015) that provide relative accuracy metrics for discretized values of wind speed. Four intervals of wind speed magnitude following NOAA’s and WMO’s classification (Table 1) provide a scale analysis for light to very strong wind speeds. We calculate probability of detection (POD) that measures the proportion of observed events that have been correctly predicted by the model and the percent of correct detections (PC) according to Eqs. (13) and (14) and a 4 × 4 contingency table for each model output (WRF, KF, UK):
where N is the total number of events; a is the number of times the observed wind speed at interval 1 was predicted by the model at the same interval (hit; same for f, k, p); b is the number of times that the observed wind speed at interval 2 was predicted by the model as interval 1 (missed; similar for c, d, g, h, l); e is the number of times that the observed wind speed at interval 1 was predicted by the model as interval 2 (false alarm; similar for e, i, j, m, n, o).
4. Discussion of systematic error improvements for predicted wind speed
Hourly wind speed distributions are represented by density histograms (Fig. 5) for observations and model outputs (WRF, KF, UK). WRF overprediction can be seen in the upper tail of the distribution that is higher than the observed distribution. This results in less forecasted low wind speeds (less than 2.5 m s−1) as well as over representation of wind speed above 3 m s−1 compared to observations. KF increases the density of low wind speeds and better represents the location of the peak. The shift in the peak also results in improvement of wind speeds up to 9.5 m s−1. UK produces a narrower distribution since the spatial interpolation process results in a smoother field, with fewer extreme high and low speed values. Kriging by itself cannot improve the prediction; it depends on the WRF forecast error structure.
Wind speed mean bias (Fig. 6a) and RMSE (Fig. 6b) calculated at each station for each storm indicates reduction of systematic bias by a tighter bias distribution around zero for KF and UK compared to the original WRF simulation. KF and UK share very similar median, 25th percentile, and 75th percentile of the bias (0.42, −0.1, 0.98 for KF; 0.34, −0.3, 1.09 for UK) while WRF exhibits almost double the median and 75th percentile (0.87, −0.3, 2.22) (Fig. 6a). The bias outliers, denoted by open circles, reach −15 m s−1 and 10 m s−1 with WRF showing even larger positive bias than 10 m s−1. The RMSE distribution shows less improvement where all elements of the distribution (median, 25th, and 75th) are very close between the three models (KF, UK, WRF) with a slight decrease of the 75th percentile for KF and UK (Fig. 6b). Also, the maximum of the RMSE distribution is improved by KF and UK with the outliers remaining rather high (reaching 25 m s−1).
To better illustrate the effect of each approach, the relationship between observed and modeled wind speed forecasts are shown for WRF (Fig. 7a), KF (Fig. 7b), and UK (Fig. 7c). KF increases the slope of the linear regression from 0.50 to 0.76 and UK to 0.62. Some but not all of the improved variability of KF is able to be kriged to the whole domain due to the smoothing caused by spatial interpolation. These results are further illustrated by spatial maps of bias and RMSE for each station over the model domain (bias and RMSE are calculated for each station from all available storm data; Fig. 8). The application of KF successfully reduces biases, which also reduces RMSE, for the majority of the stations. When kriging the KF bias estimates, much of the bias removal is preserved at most stations showing that the suggested method is useful for hourly wind speed predictions.
Accurate surface wind speed time series and maximum predictions are equally important with accurate surface wind speed time series prediction (maximum wind speed is defined as the maximum value for each station and storm). The distribution of maximum wind speed bias reveals that post processing slightly reduces the spread of the maximum wind speed biases (Fig. 9a). While KF has the tightest bias distribution, all elements of the distribution (median, 25th percentile, and 75th percentile) are positive, which is indicative of poor filter performance. UK has a tighter bias distribution than WRF and reduces the minimum, maximum, and 75th percentile of the bias but the median bias (0.61 m s−1) is almost the same with WRF (0.7 m s−1). A comparison of the distributions of maximum wind speed also illustrates that there is not much improvement with the application of KF and UK (Fig. 9b). Bias and RMSE reduction with KF is not as efficient as it is for the hourly wind speed although the spatial maps of mean bias and RMSE for maximum wind speed show that KF is doing a better job compared to WRF (Fig. 10). UK preserves most of the KF features that show negligible improvement of the maximum predicted wind speed error reduction. This is a limitation of the specific post processing methodology that requires additional or even separate methods that can handle the upper tail of the wind speed distribution.
To further evaluate the performance of the combined KF and UK approach with regards to storm surface wind speed, contingency statistics in Tables 2–6 are discussed in this section. Wind speed intervals are taken according to NOAA and WMO as follows: 0–3 m s−1 (calm), 3–10 m s−1 (moderate), 10–14 m s−1 (strong) and >14 m s−1 (gale/storm). The PC is 66.2% for KF and 66.3% for UK whereas WRF exhibits a 60.8%. Since PC can be influenced by the most common interval, POD for each of the four intervals provides more information on the model’s ability to correctly predict each wind speed magnitude range. POD for interval 1 (calm) is highest for KF and UK (63.1% and 62.7%). POD for moderate winds is higher for WRF (77.2%) compared to KF and UK (71.3% and 72.3%). Strong breeze and gale winds are somewhat improved by KF, which is not retained by UK and gives values very close to the random case (a = b = ⋯ = p = N/16) where all POD would be equal to 25%. These results further support the previous findings that the KF–UK approach does not allow for improvements in the upper tail of the surface wind speed distribution.
Accuracy in storm wind speed prediction is important due to impacts caused by high wind speed storms in the infrastructure, environment and daily life. This study addresses two wind speed prediction challenges: temporal model error correction and adequacy of spatial data coverage given the limitation of available observations in the model domain. Working with more than 100 storms (from thunderstorms to hurricanes) during a 12-yr period (2003–14) is another unique aspect in that it addresses the prediction of extreme storms and provides guidance on modeled wind speed error correction. A combination of KF for bias correction and UK for spatial interpolation is selected based on the wide successful use of such techniques in atmospheric sciences and weather prediction. This methodological approach allows only improvement of the systematic error of surface wind speed prediction.
The results show that bias reduction of WRF wind speed is achieved by running a KF through each station’s storm history and spatially interpolating the estimated WRF bias to the NWP model domain using UK. The KF bias estimate provides a 50% reduction of the median bias at station locations, before spatial interpolation. A higher-order polynomial in the KF does not improve the correction and creates instabilities in the filter. In terms of bulk statistics, Kriged KF bias estimates (UK) guided by the KF algorithm, reduce the WRF median bias by 55%, while RMSE reduction is 15%. One expected challenge of UK in this study is the oceanic coverage in the south and east part of the model domain where station availability is scarce and these areas cannot be represented accurately in the model error correction approach. KF offers significant improvement in the overall wind speed systematic error, which is generally preserved by UK (percent correct detections are 66.3% vs 60.8% by WRF) but KF (or the KF–UK combination) cannot improve strong wind speed distributions. This is a limitation of the postprocessing methodology, which is important to address wind speed model error related to storms that exhibit strong winds.
The observations of wind speed from NOAA are readily available to the public at https://madis.ncep.noaa.gov/. The WRF wind speed simulations for the 107 storms used in this study are available to everyone from Dr. Astitha’s group upon request.