## Abstract

A current barrier to greater deployment of offshore wind turbines is the poor quality of numerical weather prediction model wind and turbulence forecasts over open ocean. The bulk of development for atmospheric boundary layer (ABL) parameterization schemes has focused on land, partly because of a scarcity of observations over ocean. The 100-m *FINO1* tower in the North Sea is one of the few sources worldwide of atmospheric profile observations from the sea surface to turbine hub height. These observations are crucial to developing a better understanding and modeling of physical processes in the marine ABL.

In this study the WRF single-column model (SCM) is coupled with an ensemble Kalman filter from the Data Assimilation Research Testbed (DART) to create 100-member ensembles at the *FINO1* location. The goal of this study is to determine the extent to which model parameter estimation can improve offshore wind forecasts. Combining two datasets that provide lateral forcing for the SCM and two methods for determining , the time-varying sea surface roughness length, four WRF-SCM/DART experiments are conducted during the October–December 2006 period. The two methods for determining are the default Fairall-adjusted Charnock formulation in WRF and use of the parameter estimation techniques to estimate in DART. Using DART to estimate is found to reduce 1-h forecast errors of wind speed over the Charnock–Fairall ensembles by 4%–22%. However, parameter estimation of does not simultaneously reduce turbulent flux forecast errors, indicating limitations of this approach and the need for new marine ABL parameterizations.

## 1. Introduction

One of the main challenges facing the offshore wind energy industry in North America is the generally poor ability of numerical weather prediction (NWP) models to predict wind speed and turbulence profiles over ocean (Olsen et al. 2015). Much of this has to do with atmospheric boundary layer (ABL) parameterization schemes in NWP models. Our understanding and modeling of the physics of the marine ABL simply lags behind that of the ABL over land because of the scarcity of high quality, long-term observations. In this study we propose to improve the representation of the marine ABL lower-level winds in the WRF Model (Skamarock et al. 2008).

The coupling of the atmosphere to the surface via fluxes of heat and momentum drives the spatial and temporal evolution of the ABL. Typically in NWP models these fluxes are estimated from the available mean quantities. A powerful set of statistical relationships was derived from Monin–Obukhov similarity theory (MOST; Obukhov 1946; Monin and Obukhov 1954; Lumley and Panofsky 1964; Wyngaard 1973) based on pioneering field campaigns held in the 1960s and 1970s, such as the 1968 Kansas experiment (Izumi 1971), which sought to improve our understanding of the turbulence characteristics in the atmospheric surface layer (i.e., the portion of the ABL closest to the ground).

Currently, MOST is still the basis of state-of-the-science NWP model surface and boundary layer formulations. The universality of its relationships is questionable when applied offshore, however, given that its formulation is based on (incomplete) land-based measurements. Indeed, several past studies have found evidence of deficiencies in MOST when applied to the marine ABL (e.g., Geernaert et al. 1986; Donelan et al. 1993; Rieder et al. 1994; Hare et al. 1997; Edson and Fairall 1998; Sjöblom and Smedman 2002, 2003; Smedman et al. 2003, 2009; Pichugina et al. 2012; Andreas et al. 2012).

Obukhov (1946) and Monin and Obukhov (1954) formulated their similarity hypothesis on the statistical nature of the turbulent flow based on the relationship between mechanical and thermal forcing in the surface layer. In MOST, the structure of the turbulent flow is determined by the height above the surface (), buoyancy (), friction velocity (), and surface vertical buoyancy flux (), where is the virtual potential temperature and is the acceleration due to gravity (Wyngaard 1973). Both and can be expressed as functions of the surface stress and heat flux, respectively:

where is the surface stress vector (i.e., the surface value of the momentum flux); , , and are the surface values of the sensible heat flux, temperature, and latent heat flux, respectively; is the air density; is the specific heat at constant pressure; and is the latent heat of vaporization of water. In the surface layer, fluxes are assumed constant with height.

A proper treatment of the marine ABL requires additional scaling parameters to account for the momentum production and dissipation generated by the highly variable ocean waves (Sullivan et al. 2008). Hanley and Belcher (2008) found that ocean wave momentum flux can affect the entire marine ABL wind profile. Following Edson and Fairall (1998), we can define a wave boundary layer (WBL), which is the portion of the surface layer in the marine ABL where the ocean wave contribution to the turbulence budget is present.

Observations of key quantities in the WBL are quite limited (Smedman et al. 2009) and, when available, are sparse in time and space (Pichugina et al. 2012). The state estimation problem, usually called data assimilation (DA) for the atmosphere, is the process by which prognostic variables in a model are estimated and updated using observations. This process can be extended to include parameter estimation [in this case via state augmentation; e.g., Aksoy et al. (2006); Golaz et al. (2007); Ruiz et al. (2013); Hacker and Angevine (2013); Bellsky et al. (2014)]. Thus, a DA system provides a theoretically optimal framework for combining observations and numerical simulations, filling the gaps in the observational datasets with model estimates while simultaneously optimizing model parameters.

As we describe in detail in section 2, joint state-parameter estimation allows new insights into model dynamics and errors. If implemented properly, very short-range predictions (minutes to hours) skillfully predict the future atmospheric state. The model, then, is representing the state of the system with the greatest fidelity achievable given its formulation and a set of observations. The model state can then be examined for internal balances, budgets, and error structures. By combining an atmospheric modeling system with observations of the surface layer in the marine ABL via a DA system, we can verify the positive impact of the parameter estimation techniques on low-level wind predictions, which are critical to reducing market barriers to offshore wind energy production.

The purpose of this study is to examine the impact on hub-height offshore wind 1-h forecasts that result from the estimation of a key parameter in the marine ABL using ensemble DA and a single-column model (SCM). Secondarily, we want to find out if estimation of this key parameter also improves the prediction of the momentum fluxes, which play an important role in ABL development. These findings will provide some indication of how much model error in the marine ABL is or is not explained via estimation of this parameter.

This paper is organized as follows. Section 2 describes joint state-parameter estimation. Section 3 provides information on the meteorological tower we use for observations. The experimental design is laid out in section 4. Results are presented in section 5, and section 6 contains a summary and concluding remarks.

## 2. Joint state-parameter estimation

### a. Background

Ensemble-based filters offer some benefits for joint state-parameter estimation, including implementation simplicity and access to probabilistic predictions and estimates. The statistical analysis equations are optimal for linear dynamics and Gaussian statistics [i.e., the Kalman filter (KF); Kalman (1960)], and provide the best linear unbiased estimate of the state of the system. Evensen (1994) first proposed the ensemble Kalman filter (EnKF), which is a Monte Carlo approximation to the KF. In this study, the type of EnKF that we use is the ensemble adjustment Kalman filter (EAKF; Anderson 2001), as implemented in the Data Assimilation Research Testbed (DART; Anderson et al. 2009). The statistical analysis equations solved by the EAKF are given here for completeness:

In Eqs. (3)–(5), is the analysis (posterior) state, is the forecast (prior) state, is the observations, is the linear observation operator, is the Kalman gain matrix, is the forecast error covariance matrix, is the analysis error covariance matrix, is the (diagonal) observation error variance matrix, and is the identity matrix.

The covariance propagation is completed by an ensemble prediction beginning from the analysis, with perturbations according to , where the prior covariance is estimated from the ensemble sample of forecasts. Forward operators project model estimates in observation space. Here, those consist of interpolations from the model grids to the observation locations for any variable that the NWP model carries as either a predictive or diagnostic variable.

Skillful short-range predictions lead to small corrections (increments) by the observations to the prior ensemble sample estimate. Those increments contain useful information about the model itself. This information results from the fact that in an EnKF the forecast state is temporally propagated with , where is the model dynamics (state propagator) and is the model error propagator. Although the KF assumes and are linear, an EnKF allows nonlinear and , which is generally the case. The function can be parameterized (e.g., Danforth and Kalnay 2008), but it is almost always assumed *Q* = 0 because its form is not known. Through parameter estimation, we aim to better estimate and understand the sources of within an ABL parameterization over an ocean surface, with the ultimate goal of improving the hub-height wind.

Over many updates, distributions of increments can be assembled. We then define *systematic increments* as the expectation of that distribution, , which may evolve with a diurnal periodicity in the ABL. Because the expectation of increments with a perfect model would be zero, we interpret nonzero systematic increments as arising from model inadequacy. By showing that the model predicts the observations accurately (it is constrained by the state estimation to skillfully predict the observations), it can be concluded that the observation error is properly accounted for. Thus, the increments in unobserved model components, which must be correlated with the observed components, are most likely to be consequences of errors in the model (Delle Monache et al. 2006, 2008).

Assimilation of near-surface observations within an SCM and an ensemble filter has already been explored (Hacker and Snyder 2005; Hacker and Rostkier-Edelstein 2007; Hacker et al. 2007; Rostkier-Edelstein and Hacker 2010). Hacker and Angevine (2013) quantified model errors with short-range predictions and increments from the joint state-parameter estimation with the SCM version of the WRF NWP model, DART. Even though the results in that study were obtained for areas over land, we believe that the technical concept also applies to offshore simulations. The results of Hacker and Angevine (2013) suggested the complexity of model inadequacy; a model change can easily lead to improvement in one model variable and skill reduction in another.

Model error can take either a parametric or functional form. Each parameterization in an NWP model can generally be written as a series expansion of the parameterized term in the model equations (Nielsen-Gammon et al. 2010). Parametric errors exist because coefficients on terms in that series expansion are rarely known perfectly. More often, they are derived from a limited number of field experiments or are heuristically tuned within the NWP model. But the functional form of the series expansion can be incorrect; we refer to this as structural error, and its form is typically not known a priori.

State augmentation offers a straightforward and theoretically optimal way to quantify parametric errors. Parameters can augment the state vector , and as long as correlations between a function of the model state and an observation are distinguishable, we can estimate with an ensemble filter such as those available in DART. Because we have no dynamical model for the parameters, they evolve as persistence. That is, the analysis distribution is equal to the prior distribution at the next assimilation time. Challenges with parameter estimation include (possibly physical) constraints on the bounds of values that parameters can take, the unknown shape of their probability density functions, and unknown temporal propagators. Also, parameter distributions can result from many error sources in the model; they can take on values that compensate for, and therefore mask, other errors in the model formulation. Parameter estimates may not be “correct” even if they provide superior analyses and forecasts (Ruiz and Pulido 2015). Nevertheless, parameter estimation is useful when it results in a more accurate prediction of the observations. Data assimilation increments are then smaller, and residual errors may indicate structural errors in a model (e.g., Hacker and Angevine 2013) or nonlinearity.

### b. Sensitivity tests

We performed sensitivity tests with the WRF-SCM on various parameters to see which would impact the model wind speed in the marine ABL. We tested a multiplicative constant in the equation for the convective velocity scale and a damping coefficient that controls the height at which blending occurs between different diagnostics of ABL height in neutral/convective and stable conditions in the Mellor–Yamada–Nakanishi–Niino (MYNN; Nakanishi and Niino 2006) ABL scheme. Over a wide range of values, neither of these parameters had a substantial impact on hub-height wind speeds in the marine ABL. We then identified the time-varying sea surface roughness length as a parameter that has a substantial impact on wind speeds in the marine ABL. For brevity, we do not show the sensitivity test results, though we note that during a 1-week simulation with the WRF-SCM, the average RMSE of wind speed averaged over the depth of the tower (described in section 3a) decreased as we increased the specified value of by orders of magnitude, from 9.8 m s^{−1} with = 0.0001 m to an average RMSE of 4.7 m s^{−1} with = 0.1 m.

### c. Parameter estimation of

The MYNN surface layer scheme in the WRF Model uses the Charnock (1955) formulation to calculate every time step at every grid point:

where is the Charnock parameter, an empirical constant. In the MYNN scheme’s implementation of the Charnock formulation, is not strictly a constant but increases linearly with 10-m wind speed from a value of at speeds ≤10 m s^{−1} to a value of for speeds ≥18 m s^{−1}, following Fairall et al. (2003). Edson et al. (2013) further refined the Charnock formulation with additional dependence of on wind speed, and Jiménez and Dudhia (2014) have shown that, especially at higher wind speeds and over shallow (~30 m) ocean waters, the Charnock formulation underestimates the surface roughness by one to two orders of magnitude, and that when this underestimation is properly accounted for, there is much better agreement between modeled and observed wind speeds than there is using the Fairall et al. (2003) or Edson et al. (2013) Charnock formulations. This finding helps to guide the range of values over which should be allowed to vary in our parameter estimation experiments, which in the DART filter we limit to *z*_{0} = [1.0 × 10^{−5}, 1.0] m. If any values of fall outside that range during the parameter estimation process, the initial range of values is then contracted and shifted as necessary so that all values stay within the prescribed range.

In this study, for each experiment we initialized in each ensemble member individually using a random number generator. As the specified range for spans five decades, the random number generator was used to specify the real-valued exponent from a uniform distribution instead of specifying the value directly (i.e., ). Of course, when estimating a parameter that may vary over orders of magnitude, any values toward the upper end of that range dominate the ensemble mean and spread calculations, which can complicate the interpretation of the results. We apply spatially varying adaptive state-space inflation (additive) to both and all the state variables (Anderson 2009). Additionally, between assimilation times, the value of was not allowed to change in WRF.

## 3. *FINO1* platform

### a. Tower description

We utilized meteorological observations from the *Forschungsplattformen in Nord- und Ostsee 1* (*FINO1*; http://www.fino-offshore.de/en) tower. *FINO1* is one of three instrumented, continuously operating measurement towers installed in the German bight. *FINO1* and *FINO3* are located in the North Sea while *FINO2* is in the Baltic Sea. *FINO1* is an approximately 80-m tower mounted on a platform 20 m above the lowest astronomical tide (LAT), giving measurements of atmospheric conditions to approximately 100 m above LAT. The *FINO1* platform was constructed in autumn 2003, and is located in the North Sea approximately 50 km northwest of Emden, Germany, in a water depth of approximately 30 m (Fig. 1). *FINO1* was built partly to quantify the wind characteristics before the Alpha Ventus offshore wind farm was built nearby. Construction of the Alpha Ventus wind farm started with the installation of the substation in July 2008. This analysis uses data from 2006, when the winds measured at the site represent the undisturbed flow.

A variety of data are collected at different heights on the tower, as listed in Table 1. There is only a single-cup anemometer and wind vane at each level on the tower, and booms are a similar length to the tower face width. The wind speed measurements are potentially impacted by flow speeding up or slowing down around the tower structure. The cup anemometers are mounted on booms that extend in the direction 127°–135°; when winds are from approximately 310° ± 30°, wind speed measurements should be considered potentially impacted (or “shadowed”) by wakes from the tower structure. For the wind and stability climatology analysis in section 3b, no filtering was done based on wind direction. For the period of October–December 2006, our analysis found that wind components could be derived for at least one height on the tower from nonmissing and nonshadowed wind speed and direction observations over 86% of all 10-min observations, with the vast majority of continuous periods without any valid wind component observations being less than 1 h in duration (not shown). For the DA experiments we excluded the shadowed wind observations.

The *FINO1* tower has three sonic anemometers, located at 40, 60, and 80 m above LAT, to measure the 3D velocity components at a frequency of 10 Hz. Although measurements from the sonic anemometers were available throughout 2006, the 60-m sensor was not operational during the last half of the year. Leveraging the previous work of Sanz Rodrigo (2011), turbulent fluxes were obtained by averaging the velocity covariances over a 1-h period to reduce random flux errors. Some additional details about the flux measurements and calculation of the covariances are included in appendix A.

### b. Climatology of wind and atmospheric stability conditions

Atmospheric stability can change depending on the wind speed and thermodynamic profile of the ABL. The bulk Richardson number quantifies atmospheric stability, using the ratio of the change in buoyancy to the change in wind speed with height:

where is the virtual potential temperature change with height from the *FINO1* platform base to 100 m above LAT, is the mean virtual temperature, and is the wind speed change with height from the *FINO1* platform base to 100 m above LAT.^{1} The virtual temperature is a function of height, pressure, dry-air temperature, and water vapor content. The wind speed is the average wind speed measured by the cup anemometers. In this case, data are averaged over 10 min to give a representative state of the ABL in that interval.

Atmospheric stability at the *FINO1* location shows a slight relationship to wind speed (Fig. 2). In Fig. 2 data have been categorized according to ranges of stability that are typically used for the ABL:

strongly stable, > 0.25;

stable, 0.05 < < 0.25;

neutral, −0.05 < < 0.05;

unstable, −10 < < −0.05; and

strongly unstable, < −10.

Neutral conditions are most often seen in February, March, April, and December when wind speeds exceed 5 m s^{−1}. During the winter and early spring months there often may not be a large difference in temperature between the air and the sea surface, making it easier to achieve equilibrium conditions.

A goal of this study was to identify when conditions were neutral or slightly convective, which is when we can expect MOST to be most appropriate. This is why we chose a conservative threshold for the strongly stable conditions above, to safely avoid periods where those conditions predominated. The time series of binned Richardson numbers is plotted in Fig. 3, from which it can be seen that there are several periods when the marine ABL stability was neutral for 24 h or more. These occurred on 6–7 February, 28 March, and 5, 13–15, and 30 December 2006. Figure 3 also shows that stable conditions predominated from April through July, while the months where mainly neutral and slightly convective conditions were observed were January, February, October, November, and December.

Wind speed profiles, grouped by stability class, are similar to what would be expected from theory (not shown). In unstable conditions, there is little variation of wind speed with height, while stable conditions show a steeper, positive gradient to the wind speed.

Based on these results, and the desirability for the simulations to be conducted over continuous periods, the selected periods at *FINO1* were from October to December 2006.

## 4. Experiment design

The WRF-SCM provides a platform for more rapid testing than would be available with the WRF 3D model. In cases where the marine ABL is weakly forced by horizontal mesoscale gradients, the SCM is also broadly applicable as a representative model of the local conditions in horizontally homogeneous terrain. Because the *FINO1* tower is located approximately 50 km offshore, we can also assume that the effects from the land–sea interface are limited. Simulations with the WRF-SCM are relevant to the marine ABL because the subgrid scale can be isolated. To ensure results remain relevant to 3D dynamics, the 3D WRF is used to provide vertical velocity tendencies, geostrophic wind components, and advective tendencies. Thermal profiles and clouds can advect through the SCM domain, and realistic treatment of those may be necessary to achieve the most accurate energy balances at the surface (Wulfmeyer and Janjić 2005; LaCasse et al. 2008). But whether this is the case is unknown for micro and mesoscale interaction, especially when performing DA (e.g., Rostkier-Edelstein and Hacker 2010).

### a. WRF 3D runs for SCM forcing

Version 3.5.1 of the Advanced Research version of WRF (WRF-ARW) was set up over northern Europe, centered over the *FINO1* platform, to provide lateral forcing for the SCM–DART experiments from 1 October to 31 December 2006. One 3D run did not use any DA (“Baseline”) while the other (“FDDA”) used NCAR’s community four-dimensional data assimilation (FDDA) system, which has been extensively tested over the years and is known to provide a good fit with the available observations (Stauffer and Seaman 1990, 1994; Stauffer et al. 1991; Jonassen et al. 2012). The assimilated observations were from NCEP Automated Data Processing data streams stored in NCAR Research Data Archive datasets 353.4 and 461.0 (NCAR 1980, 2004). The observation types that were assimilated in those experiments include radiosondes, surface stations, buoys, ships, and aircraft reports. Figure 4 shows the location of the *FINO1* platform (54.014861°N, 6.587639°E) and the 3D WRF computational domains. Additional details about the 3D WRF Model setup are contained in appendix B.

### b. SCM–DART ensemble forcing description

Previously, common strategies for generating initialization and lateral forcing data for SCM experiments relied upon a large climatology of input data to randomly draw from (e.g., Hacker and Rostkier-Edelstein 2007). The new EOF approach that we describe below is attractive because it requires that only a single 3D WRF run be made for the same period of the SCM experiment to provide the forcing, rather than several years of (in this case) October–December simulations.

Lateral forcing conditions for the 100-member SCM–DART ensembles were produced with mean forcing from the WRF 3D runs and ensemble perturbations derived by empirical orthogonal functions (EOFs) of each relevant variable from 3D WRF output over each monthly period. Unique EOFs for each hour of the day were found for each monthly simulation period. These perturbations’ magnitude roughly matches the RMSE of the 3D WRF at the location of the *FINO1* tower for the monthly time period selected in order to assure statistical consistency of the lateral information driving the SCM (i.e., that the perturbation magnitudes were, on average, comparable to the error magnitudes in the 3D WRF simulation that month). The appropriate size of the perturbation magnitudes was determined prior to running SCM–DART. The perturbations were a linear combination of the first several EOFs, whose coefficients were random numbers constrained by an autoregressive [AR(1)] process, to ensure temporal consistency. Perturbations were only made to the atmospheric variables; all 100 SCM–DART members used identical, unperturbed SST and subsurface forcing conditions. These subsurface forcing files were also generated from the WRF 3D simulation. From these subsurface, SST, and atmospheric forcing files, an ensemble of “wrfinput” files were generated for the SCM–DART simulations. When running the SCM, however, we turned off the advection of wind, temperature, and moisture so as not to wipe out the effects of the assimilation. However, we kept the vertical advection on, which meant that, effectively, the only relevant perturbations in this study are to the geostrophic wind field.

Additionally, we made some code improvements to the WRF-SCM for use over ocean grid points, as described in appendix C.

### c. SCM–DART experiments

With the goal of reducing wind speed error by joint state-parameter estimation and understanding the sources of these errors, the following experiments were conducted with the coupled WRF-SCM–DART system for month-long periods covering October–December 2006, using WRF version 3.5.1 and modifications to both WRF and DART that allow parameter estimation of the time-varying surface roughness length :

“Baseline”

**—**SCM–DART runs with forcing from 3D WRF without FDDA and with calculated by the default Charnock–Fairall formulation in WRF;“Baseline+ZNT_est”

**—**SCM–DART runs with forcing from 3D WRF without FDDA and with estimated hourly by DART;“FDDA”—SCM–DART runs with forcing from 3D WRF with FDDA and with calculated by the default Charnock–Fairall formulation in WRF; and

“FDDA+ZNT_est”—SCM–DART runs with forcing from 3D WRF with FDDA and with estimated hourly by DART.

Our hypothesis was that both the higher quality boundary conditions and the estimation with DA of would improve 1-h wind speed forecasts at *FINO1*.

For all experiments we used 1-h DA cycling with the EAKF implemented in DART. Each SCM–DART ensemble had 100 members. The WRF prognostic variables updated by DART were all three wind components, as well as the geopotential height, potential temperature, pressure, and water vapor mixing ratio. We assimilated the 10-min observations of the *u* and *υ* wind components, temperature, and dewpoint temperature from multiple levels on the *FINO1* tower, valid at the top of every hour ±10 min. Observations outside of that 20-min window surrounding the top of the hour were discarded.

We conducted several month-long sensitivity tests to tune the observation error variances for each observation type. The goal of this tuning was to identify the observation error variances that led to RMSE values that roughly equaled the total spread (the sum of the ensemble spread and the observation error). When the RMSE and total spread equal each other, the ensemble is statistically consistent. We set the observation errors to be 1.0 m s^{−1} for wind components, 0.25°C for air temperature, and 0.5°C for dewpoint temperature (square those values to get the observation error variances). These observation error variances were used in all experiments.

For nearly all geophysical models using ensemble Kalman filters for DA, covariance localization is required in both the horizontal and vertical. This localization process sets a cutoff or maximum distance beyond which an observation is not allowed to influence the state variables (Houtekamer and Mitchell 1998; Anderson 2012). Here, we use the Gaspari–Cohn fifth-order piecewise rational correlation function of space [Eq. 4.10 in Gaspari and Cohn (1999)] for localization, which is the default localization function in DART. Vertical covariance localization in the EAKF algorithm was set to 4 km after several sensitivity tests. Horizontal covariance localization, meanwhile, of course has no impact on DA in SCM experiments. Additionally, DART’s standard spatially varying state space inflation was applied to the prior state in order to increase the ensemble spread at every assimilation time. The namelist settings guiding the inflation were chosen according to prior experience with DART and were not tuned in this study.

Examples of a monthly time series of SCM–DART ensemble mean 1-h forecasts of wind speed and temperature at *FINO1* compared against observations and the 3D WRF simulation that drove the SCM–DART ensemble are shown in Fig. 5 for the FDDA+ZNT_est experiment. For the wind speed plot in Fig. 5a, the 1-h ensemble mean forecasts (red) are on average of similar accuracy as the 3D WRF analysis wind speed at *FINO1* (black), and the model predictions track the observations reasonably well. For the temperature plot in Fig. 5b, the ensemble mean forecasts closely match the observations and have smaller errors than the 3D WRF analysis. We would expect the SCM–DART ensemble mean 1-h forecasts to match the observations at *FINO1* more closely than the 3D WRF analysis matched them because the SCM–DART runs assimilated *FINO1* data, while the 3D WRF assimilated a different set of observations that did not include *FINO1* data. That said, the only forcing to the SCM was provided by geostrophic winds. Thus, for winds, the perturbed forcing fields imposed an additional tendency away from the observations. The SCM–DART ensemble mean for temperature hewed close to the observations because there was no perturbed forcing for temperature.

### d. Verification metrics

For several variables the primary verification metrics we consider across all experiments are the standard RMSE and the Pearson correlation coefficient. The RMSE can also be broken down into components, the bias error, and the centered RMSE (CRMSE) (Murphy 1988; Taylor 2001):

The bias error is the systematic component of the RMSE, and CRMSE is the random portion of the RMSE. We calculated the RMSE and its components against the same set of *FINO1* observations that were assimilated by DART.

## 5. Results

An example time series of the evolution of for the December 2006 simulations for all four SCM–DART experiments is displayed in Fig. 6, where we can see that results for the Baseline and FDDA experiments are similar to each other, generally in the 10^{−5}–10^{−3} m range. The DART-estimated experiments had values of that were consistently two to three orders of magnitude larger than those with Charnock–Fairall-estimated . These large differences in between the two estimation methods do impact wind speeds, as discussed in the following paragraphs.

Performance metrics for the four SCM–DART experiments described above (Baseline, Baseline+ZNT_est, FDDA, and FDDA+ZNT_est) are shown in Figs. 7–9 for wind speed, the *u*-wind component, and the *υ*-wind component, respectively. For the month of October 2006, the results reflect the initial hypothesis, that is, that both the higher quality boundary conditions and the estimation with DA of a key parameter, the surface roughness , improved the wind estimates at the *FINO1* tower. The best results were obtained when both approaches were combined, and the largest improvements resulted from the estimation. These improvements resulted primarily in a lower bias with very little change in the CRMSE. In all cases the correlation between the priors and observations was preserved, which is a desirable feature, while also generating an overall substantial reduction of the RMSE. However, for November and December, the impact of the forcing provided with the WRF 3D runs with FDDA did not consistently result in improvements based on the shown metrics owing to the lack of observations near *FINO1*, while the estimation always resulted in an improvement. For wind speed (Fig. 7), the simulations with DART-estimated had RMSE values ranging from 4% to 22% better than in the simulations of Charnock-Fairall-estimated , depending on the month (i.e., comparing Baseline+ZNT_est to Baseline, and comparing FDDA+ZNT_est to FDDA). Similar results were obtained for the temperature and dewpoint temperature (not shown).

Figure 10 shows, for the month of October 2006 only, an analysis of the same experiments presented in Figs. 7–9, but by looking at the vertical profile of the same metrics at *FINO1* (RMSE, blue; bias, black; and CRMSE, green), with the addition of total spread (red), which is the sum of the ensemble spread and the observation error. When the total spread is roughly equivalent to the RMSE of the ensemble mean, it indicates that the probabilistic prediction system is able to quantify the prediction uncertainty. Aside from indicating a more optimal DA system, predicting this uncertainty is a key factor in supporting cost-effective decision-making. Results are shown for the zonal (*u*) component for all four experiments during October 2006. Similar results were obtained for the meridional (*υ*) component, wind speed, temperature, and dewpoint temperature, and also for the months of November and December (not shown). Also shown in Fig. 10 is the number of available observations across the different vertical levels (cyan) over the entire month and the number of assimilated observations (magenta).

The first aspect to notice in Fig. 10 is the limited vertical variability of the metrics, which makes the analysis shown in Figs. 7–9 of averages across the entire tower representative of the whole column. Also, no wind component data were available at 61 m because of the 61-m wind vane being nonoperational during the last half of 2006. When was estimated by the SCM–DART system (top-right and bottom-right panels in Fig. 10), the RMSE and bias errors were lower, and the RMSE and total spread were closer together, than when was calculated by the Fairall et al. (2003)-adjusted Charnock formulation in WRF (top-left and bottom-left panels in Fig. 10). The lowest RMSE of all is shown in the bottom-right panel of Fig. 10, which corresponds to the run where the WRF 3D run with FDDA provided the forcing, and was estimated by DART. An overall reduction of the centered RMSE was also observed in the two experiments where was estimated by DART, although the result was not as large as the improvements shown for the bias. The larger reduction in bias error than CRMSE indicates that using joint state-parameter estimation to estimate with DART is primarily reducing a substantial portion of the systematic model error for wind forecasts over the lowest 100 m of the marine ABL at the *FINO1* location.

Based on the overall reduction in RMSE in our experiments where DART estimated , we thought it important to dig deeper and examine the marine ABL turbulent stresses in these experiments. The MYNN ABL scheme has a namelist option for users to record the turbulence budget variables at every output time, and we derived and calculated the turbulent stresses and from these budget variables in the following manner. The shear production term, , is related to the stresses by

where and are the vertical shear of the horizontal wind components. Also, because the eddy viscosity is used to compute the turbulent stresses, the stress vector and shear vector ( and ) are aligned. With this condition,

From Eq. (9) we can compute and obtain the turbulent stresses:

We compared these modeled stresses to observed stresses from sonic anemometers on the *FINO1* tower located at heights of 40 and 80 m (the 60-m sonic anemometer was not operational during October–December 2006). Figures 11 and 12 show the same statistics as Figs. 7–9, but for the turbulent stress at 40 m and at 40 m (the statistics for the stresses at 80 m were similar and thus not shown), respectively, with a time series of the stresses at both 40 and 80 m plotted in Figs. 13 and 14.

In Fig. 11 the RMSE for is quite low for both experiments that use the Charnock–Fairall formulation to calculate (Baseline and FDDA), with the bias over each month being nearly zero. For both sets of forcings, the RMSE, CRMSE, and bias were larger for the experiments where DART estimated than those that used the Charnock–Fairall formulation. Similar results are seen in Fig. 12 for , except with larger-magnitude errors and small correlation coefficients compared to . The lack of correlation between the observed and predicted results can be seen clearly in the time series in Fig. 14. It is not clear why the correlations for were so much larger than for . We speculate there is more variability in the *υ* component than in the *u* component, so any quantities dependent on *υ* are inherently less predictable. Because the errors in the turbulent stress increased with DART estimating , while wind speed errors decreased in the same experiments, and because the modeled fluxes were not well correlated with observations, it appears that estimating is compensating for errors that come from a source other than the fluxes.

Overall, these results confirm one of the hypotheses of this project: 1-h forecasts of low-level winds in the marine ABL are improved by using our joint state-parameter estimation procedure. These improvements will positively impact the offshore wind energy industry. That is, estimating uncertain parameters of NWP boundary layer schemes over water for neutral-to-unstable conditions within a DA framework, even with a single-column model, leads to noticeable improvements in short-range wind predictions.

While the hub-height wind predictions improved with joint state-parameter estimation, the turbulent stress predictions did not. This result means there are still structural sources of error that are unaccounted for by our method of estimating . Further research is required to uncover and illuminate these sources of errors in parameterizations of the marine ABL.

## 6. Summary and conclusions

In this study we have improved NWP analyses and forecasts of low-level winds in the marine boundary layer as estimated and predicted with an SCM and ensemble DA system. This has been accomplished using the WRF single-column model (WRF-SCM), state estimation algorithms from DART, and observations of key quantities of the lower marine ABL at the *FINO1* tower in the North Sea, including temperature and winds at multiple levels above the sea surface.

We modified the WRF-SCM to run properly, not just over land points but over water points as well; developed a new lateral forcing and perturbation strategy for SCM ensemble experiments; and created 100-member WRF-SCM ensembles using the ensemble adjustment Kalman filter from the DART toolkit. For each month of October, November, and December 2006, we performed four ensemble experiments. These four experiments used two different 3D WRF datasets to provide the lateral forcing for the SCM ensemble (first, a baseline run without any DA and, second, a run with FDDA), as well as two different techniques to determine the sea surface roughness length, (first, a default configuration using the Charnock–Fairall formulation in the MYNN ABL scheme in WRF every model time step and, second, joint state-parameter estimation algorithms in DART to estimate every assimilation cycle).

We found that regardless of the forcing data, the wind speed RMSE was lower in every month by 4%–22% when was estimated with DART than when was calculated using the default Charnock–Fairall relation in WRF. The same was true for nearly all other sensible variables (e.g., temperature and wind) and months as well. Some of that reduction in RMSE came from the random component (CRMSE), but the systematic component of the RMSE (bias error) was reduced even more than the CRMSE. However, for the turbulent fluxes, the RMSE increased substantially in the experiments where was estimated by DART compared to the experiments where was calculated by the Charnock–Fairall relation. Thus, while using joint state-parameter estimation to calculate new estimates for conferred clear benefits on reducing hub-height wind speed error, this parameter estimation is compensating for errors that do not come from the turbulent fluxes. Future study is required to determine the source(s) of these errors.

The lack of improvement in the prediction of the turbulent fluxes indicates that there is potentially a structural problem with the parameterization of surface effects in the marine ABL. Parameterization of wave effects using the simple Charnock–Fairall relation, where the assumption is that the wind and waves are in equilibrium, may be problematic because there is no way to account for the effects of swell. Thus, new parameterizations accounting for the swell or other wind–wave nonequilibrium effects are likely necessary to improve offshore wind predictions further.

The experiments with the WRF-SCM–DART system have led to large improvements with respect to a standard WRF configuration, which is currently used by the wind energy industry. The WRF-SCM appears to be a tool particularly suitable for offshore wind energy short-range (1 h) forecasting applications given its accuracy, the ability to quantify uncertainty, and the minimal computational resource requirements. For resource assessment applications, where one needs to predict turbulence levels in addition to the wind, it is unclear whether parameter estimation would be the most appropriate approach, however. In situations where the impact of an upwind wind farm may be of interest in a downwind location, a 3D modeling approach is suitable. Work to apply similar joint state-parameter estimation techniques to a 3D WRF–DART ensemble and examine its suitability is ongoing.

## Acknowledgments

This project was funded by the U.S. Department of Energy under Contract DE-EE0005374. We would like to acknowledge high-performance computing support on Yellowstone (ark:/85065/d7wd3xhc) provided by NCAR’s Computational and Information Systems Laboratory, which is sponsored by the National Science Foundation. Additionally, several of the analysis and plotting scripts were coded using the NCAR Command Language (NCL; doi:10.5065/D6WD3XH5). *FINO1* data were provided by the German Bundesamt für Seeschifffahrt und Hydrographie under agreements with the National Renewable Energy Laboratory (NREL) and NCAR. We thank the Danish Meteorological Institute (DMI) for providing the high-resolution SST analyses for the region around northern Europe. The authors also thank Sue Ellen Haupt (NCAR), Terri Marshburn (NREL), and two anonymous reviewers for helpful comments that improved the manuscript.

### APPENDIX A

*FINO1* Flux Measurements

*FINO1*

Using the planar fit method, measurements from the sonic anemometers were rotated to the mean streamline coordinate system (Wilczak et al. 2001). Because the mean streamline plane and the sea surface must be parallel to each other, the mounted tilt angles can be inferred and corrected for. During an evaluation period in the first half of 2006, tilt angles from 1.0° to 1.8° were found for the three sonics (Sanz Rodrigo 2011). Using the eddy-correlation technique over a time scale of 5 min, variances and covariances of detrended velocity and sonic temperature perturbations were computed.

### APPENDIX B

#### 3D WRF Model Setup

For the 3D WRF Model described in section 4a that provided the forcing conditions for the SCM ensemble, there were three one-way nested domains (Fig. 4) with horizontal grid spacing of 30, 10, and 3.3 km, respectively, with 132 × 132 × 41 grid points for each domain. There were 41 vertical levels, with eight levels in the first 500 m above MSL at approximately 15, 45, 75, 100, 130, 195, 315, and 490 m.

The physical parameterizations chosen follow work presented by Draxl et al. (2014). This configuration was developed for North Sea applications by the Department of Wind Energy at the Technical University of Denmark (DTU). The physics options include the Thompson microphysics scheme (Thompson et al. 2008), the Kain–Fritsch cumulus scheme (Kain 2004) for domains 1 and 2 only (with explicit convection on domain 3), sixth-order numerical diffusion, and positive-definite advection of moisture and scalars. The Rapid Radiative Transfer Model (Mlawer et al. 1997) and Dudhia schemes (Dudhia 1989) are used for longwave and shortwave radiation calculations, respectively. The 2.5-level Nakanishi and Niino (2006) ABL and its companion surface-layer scheme (MYNN) were selected.

The model was initialized with global data from the NOAA Climate Forecast System Reanalysis (CFSR) project (Saha et al. 2010). The CFSR data are 6-hourly global reanalyses at 0.5° grid spacing, valid from 1979 to 2010. The 6-hourly data are also used for lateral boundary conditions along the outermost domain. In addition, daily SST data from the NCEP Optimum Interpolation Sea Surface Temperature (OISST) project (https://www.ncdc.noaa.gov/oisst) were used. Where available over the North Sea and North Atlantic, SST analyses from the Danish Meteorological Institute (DMI) were used. The DMI SST product (http://www.myocean.edu) is at 1/30° grid spacing, and was blended with the 1/12° NCEP OISST data near the DMI SST dataset’s outer boundary (Fig. B1). Land-use categories come from MODIS.

### APPENDIX C

#### WRF-SCM Improvements

The WRF-SCM had originally been designed for use over land, so we made several updates to the community-distributed code in order to use it over water. WRF-SCM users can now specify a set of eta levels in the WRF namelist, rather than allowing the model to generate its own set of eta levels based on a specified number of vertical levels (in this project we used 10 levels from 30 to 210 m with 20-m spacing, and 50 more levels above 210 m). The WRF-SCM now sets the land-use category according to the value read in from the 3D WRF output files that drive the SCM simulations, instead of assuming a land point. Additionally, the WRF-SCM now uses SST for subsurface initialization if the SCM point is over water. These improvements and modifications will be made available to the community in an upcoming release of WRF.

## REFERENCES

*The Structure of Atmospheric Turbulence*. John Wiley and Sons, 229 pp.

*Proc. EWEA Annual Conf. and Exhibition 2015*, Paris, France, European Wind Energy Association. [Available online at http://proceedings.ewea.org/annual2015/conference/programme/info2.php?id2=630.]

*Proc. EWEA Offshore 2011 Conf. and Exhibition*, Amsterdam, Netherlands, European Wind Energy Association. [Available online at http://www.ewea.org/offshore2011/index.php?id=152.]

*Workshop on Micrometeorology*, D. A. Haugen, Ed., Amer. Meteor. Soc., 101–149.

## Footnotes

^{a}

The National Center for Atmospheric Research is sponsored by the National Science Foundation.

^{1}

Ideally, , where and are the zonal and meridional wind components, would be used instead of . However, the *FINO1* wind vanes were less reliably operational than the cup anemometers, so this version is used instead.