The Use of Uncertainty Quantification for the Empirical Modeling of Wind Turbine Icing

A novel uncertainty quantification method is used to evaluate the impact of uncertainties of parameters within the icing model in the modeling chain for icing-related wind power production loss forecasts. As a first step, uncertain parameters in the icing model were identified from the literature and personal communications. These parameters are themedian volume diameter of the hydrometeors, the sticking efficiency for snow and graupel, the Nusselt number, the shedding factor, and the wind erosion factor. The sensitivity of these parameters on icing-related wind power production losses is examined. An icing model ensemble representing the estimated parameter uncertainties is designed using so-called deterministic sampling and is run for two periods over a total of 29 weeks. Deterministic sampling allows an exact representation of the uncertainty and, in future applications, further calibration of these parameters. Also, the number of required ensemble members is reduced drastically relative to the commonly used random-sampling method, thus enabling faster delivery and a more flexible system. The results from random and deterministic sampling are compared and agree very well, confirming the usefulness of deterministic sampling. The ensemblemean of the nine-member icing model ensemble generated with deterministic sampling is shown to improve the forecast skill relative to one single forecast for the winter periods. In addition, the ensemble spread provides valuable information as compared with a single forecast in terms of forecasting uncertainty. However, addressing uncertainties in the icing model alone underestimates the forecast uncertainty, thus stressing the need for a fully probabilistic approach in the modeling chain for wind power forecasts in a cold climate.


Introduction
Icing on wind turbine blades is a serious problem in cold climates. In addition to the safety risk of the ice falling from the wind turbines, ice accretion has a significant impact on the power produced and can lead to severe production losses during the winter months because of changes in the aerodynamic balance, generation of vibration, and increased load on the wind turbine (Kraj and Bibeau 2010). Despite these problems, the large number of locations with great wind resources has resulted in an increasing amount of wind energy sites in certain regions with cold climates. By the end of 2015, the total installed capacity of wind energy in these regions was 127 GW of the total 432 GW globally (GWEC 2017) and is expected to increase at a yearly rate of 12 GW until 2020 (Bredesen et al. 2017). Shortrange forecasts of icing and related production losses are therefore important in order to plan for the next-day energy production and for site safety.
Modeling icing and related production losses has been done in previous studies (e.g., Davis et al. 2016b;Nygaard et al. 2013;Molinder et al. 2018), showing that these forecasts are uncertain. Initial and boundary conditions as well as shortcomings in the different components of the Denotes content that is immediately available upon publication as open access. modeling chain contribute to this uncertainty ( Fig. 1). Also, kilometer-scale phenomena, such as convective clouds can be misplaced by some tens of kilometers in numerical weather prediction (NWP) forecasts, generating an error in spatial representation resulting in an additional uncertainty, when the forecast is interpreted at a specific wind turbine location. To fully represent the uncertainties in the modeling of icing and icing-related production losses all uncertainties should be accounted for. As a step toward this, the aim of this study is to describe the uncertainties in the icing model formulations and to include them in the usage of the model.
The use of ensemble forecasting is a common approach for estimations of forecast uncertainty for NWP models (Leith 1974). Ensemble forecasting has several advantages: First, the ensemble mean has a lower forecast error than a deterministic forecast because the averaging of the ensemble members filters out less predictable parts of the forecast (Warner 2011), and second, the spread of the ensemble members can be used to estimate forecast uncertainty or making it possible to estimate the likelihood for a specific event. Short-range ensemble forecasting has previously been investigated for wind energy purposes, mainly focusing on the wind speed forecast (e.g., Pinson and Kariniotakis 2010;Traiteur et al. 2013).
For wind turbine icing forecasts, earlier studies have mainly focused on the uncertainties in the initial/ boundary conditions and model formulations of the NWP model (e.g., Molinder et al. 2018;Davis et al. 2014; Thorsson et al. 2015). In Molinder et al. (2018), an NWP ensemble and a neighborhood method accounting for uncertainties in the representation of the wind turbine locations was used confirming the benefits of an ensemble for the examined period. Here we focus on the uncertainties in the icing model and examine which effect they have on the icing-related production loss forecasts. We also investigate the benefit of using an icing model ensemble in terms of increased forecast skill and uncertainty estimations. To our knowledge, this is the first time that a systematic assessment is done on how combined icing model uncertainties affect the icingrelated production loss forecasts.
Several studies have pointed out uncertain parts of the icing model; for example, Nygaard et al. (2013) studied different values of the sticking efficiency for snow. In Davis et al. (2014) different median volume diameters of the cloud droplets were tested showing a large effect on the ice load and Makkonen (2000) pointed out the heat transfer coefficient in the calculation of accretion efficiency as uncertain. Based on a literature study we assess here the five most uncertain parameters in the icing model. Two parameters concern the ice growth (median volume of droplets and sticking efficiency for snow and graupel), two parameters affect the ice loss (a wind erosion factor and an ice shedding factor), and one of the parameters involves both the ice growth and the ice loss (the Nusselt number, affecting accretion efficiency and sublimation). The parameters and their estimated uncertainty are described in more detail in section 4.
To evaluate the impact of the uncertainties on the resulting forecast, uncertainty quantification was used. Uncertainty quantification can be made by generating icing model ensembles with deterministic or random sampling (Hessling 2013). The fundamental difference between the two approaches is that deterministic sampling accurately describes the prescribed uncertainties with deterministically chosen samples, while random sampling estimates those uncertainties with a large random sample. Consequentially, both methods come with errors, that is, deterministic sampling with a systematic error and random sampling with a random sampling error. The latter error can create challenges, when the ingoing model parameters are to be calibrated or improved in small increments. A further more practical advantage is that deterministic sampling reduces the number of ensemble members and thus the computational cost. For a few uncertain parameters, fewer than 10 ensemble members are usually enough to accurately represent the forecast uncertainty distribution. This contrasts with random sampling, where thousands of ensemble members are required to obtain a reproducible result (Hessling 2013). Because of its benefits, the deterministic sampling method has increasingly been used in different applications such as for nuclear power in Rahman et al. (2018). For the application in wind power in cold climates, even though the icing model is relatively computationally cheap, we found it valuable to decrease the number of ensemble members if the same results can be achieved. For the uncertainty quantification of the entire modeling chain (Fig. 1), the computational and data handling costs will increase, since for each NWP ensemble member an icing model ensemble needs to be run owing to the nonlinearity of the involved models. Thus, for 275 NWP ensemble members as in Molinder et al. (2018), the total number of icing model runs would exceed 2 million with random sampling, but only a few thousand with deterministic sampling. For reduced computational and data handling costs as well as faster delivery, this reduction can be crucial. The low number of ensemble members in deterministic sampling thus provides an optimized approach. However, note that the same results can be obtained using random sampling, if the computational cost is not prohibitive. The deterministic sampling method is further described in section 5. The modeling chain will be presented in section 2, followed by the forecast periods and observations, which will be described in section 3. In section 4, the uncertain parameters in the icing model are presented, and the theory of uncertainty quantification can be found in section 5. In the result section (section 6), the model sensitivity to perturbations in the uncertain parameters is shown together with the model performance in terms of forecast error and spread. A summary and discussion are given in section 7.

a. The numerical weather prediction model
The HIRLAM-ALADIN Research on Mesoscale Operational NWP in Euromed-Applications of Research to Operations at Mesoscale (HARMONIE-AROME) model is used for the NWP. It is a nonhydrostatic, convectionpermitting model based on the ALADIN-HIRLAM shared system with the HARMONIE-AROME cycle cy40h1.1 (Bengtsson et al. 2017). The model was run with a model domain covering Sweden with a horizontal resolution of 2.5 km and 65 vertical levels. The model domain is presented in Fig. 2. The control member from the global ensemble system (ENS) at ECMWF with a horizontal grid spacing of 30 km was used as lateral boundary conditions. We employ 6-hourly 3D-variational data assimilation for conventional observations and radiances from the satellites AMSU-A and AMSU-B. For the surface analysis, an optimal interpolation of conventional observations is applied. Details for the typical operational setup of the model can be found in Müller et al. (2017). A 3-week spinup period was used to spin up the variational bias correction of the satellite observations. NWP output parameters at the nearest grid point to each wind park used in the verification and described below serve as input to the icing model. As described in Molinder et al. (2018), the applied topography in the NWP model is not as detailed as the real terrain often causing lower mountaintops in the model than in reality. Because of this we include some height adjustments to the NWP output parameters before they were used in the icing model according to where P i , h m , Dh, and h nacelle denote the vertically interpolated parameter, the model terrain height, the difference between the model terrain height and the real terrain height, and the height of turbine nacelle, respectively. Basically, P i is the average between the forecast made at nacelle height above the actual terrain height and the model terrain height, where we employ linear interpolation between model levels. The choice of interpolation method is based on tests where, for example, adiabatic lifting for the temperature data resulted in less-satisfying results (not shown here). Note that for all wind turbines used in this study, the real terrain was higher than the model terrain. The vertical interpolation was applied to all meteorological input parameters to the icing model.

b. Icing model
The icing model utilizes meteorological parameters forecast by the NWP model as input and is based on the equation for ice accretion first described in Kuroiwa (1965) and later in Makkonen (2000). It is constructed to calculate the ice accretion on a rotating cylinder 0.03 m in diameter according to International Organization for Standardization (ISO) standards (ISO 2001) for icing measurements: where M is the mass of ice; t is the time; the sum contains the ice accumulation due to the different water species iterated with the subscript i, namely, rain, cloud water, snow, graupel, and cloud ice; a 1 , a 2 , and a 3 are efficiency coefficients described below; W is the water content of the different species; y is wind speed; A is the crosssectional area of the cylinder on which the ice accumulation is calculated; and L is the ice loss term, which is an addition to the original ice accretion model. The cylinder diameter increases with increased ice load in the model. L includes functions for melting, sublimation, wind erosion, and ice shedding. The right-hand side in the equation is nonlinear since the efficiencies depend on the wind speed, cross-sectional area and the water content. This will become relevant for the uncertainty quantification. Ice accretion due to cloud ice, snow, graupel, and rainwater included in the model is also an addition to the original ice accretion model. Generally, ice accretion due to snow, graupel, and cloud ice is limited to temperatures above freezing (e.g., Nygaard et al. 2013). However, since HARMONIE-AROME to some extent generates snow, graupel, and cloud ice too fast at temperatures below 08C at the expense of cloud water (Bengtsson et al. 2017; B. J. K. Engdahl 2019, personal communication), ice accretion due to frozen hydrometeors is allowed also at temperatures below freezing if liquid water is present. The efficiency coefficients can adopt values between 0 and 1 and are defined in the following sections.
1) a 1 : COLLISION EFFICIENCY According to Dobesch et al. (2005) the collision efficiency a 1 can be defined as the ratio between the number of particles, for example, droplets, that collide with the surface and the total number of particles on the windward side of the object. The value of a 1 depends on the wind speed/velocity of the particles, particle size, size/shape of the object and air density. Instead of the actual droplet size for each droplet, the median volume diameter (MVD) is used in the model (Finstad et al. 1988). MVD is not an output from the NWP model and is therefore derived using the water content of the hydrometeors and the concentration of particles. In the case of cloud water the droplet concentration is fixed at 300 cm 23 ; in the case of rain, cloud ice, snow, or graupel, the number of particles is based on output from the NWP model. The equations for calculating particle number concentrations for cloud ice, rain, snow, and graupel have been taken from the AROME microphysics scheme (Seity et al. 2011). MVD is calculated according to a scheme for cloud water by Thompson et al. (2008).
2) a 2 : STICKING EFFICIENCY The sticking efficiency a 2 is the ratio of the number of particles that stick to the surface and the number of particles that collide with the surface. The value of a 2 depends on the amount of water that strikes the surface, with a maximum value of 1. For cloud water and rain a 2 is unity, but for snow, graupel, and cloud ice it is lower. For dry snow it is nearly zero since the snow bounces off the surface. Admirat (2008) found that the sticking efficiency for wet snow depends mainly on the wind speed according to a 2 5 1/U 1 , and thus gets closer to unity for low wind speeds. Here we use 1/U 0.75 as a baseline following Nygaard et al. (2013). However, as discussed later in section 4, this ratio is uncertain. It is assumed that if no cloud water or rainwater is present then the snow, graupel, and cloud ice are dry and will always bounce off the surface.

3) a 3 : ACCRETION EFFICIENCY
The accretion efficiency a 3 depends on the heat balance of the icing process. Briefly described, if the water that sticks to the surface freezes immediately, the accretion efficiency is unity. If the heat balance is such that it takes some time for the particle to freeze, some liquid water will fall off the surface, and thus the accretion efficiency will be smaller than unity. Depending on whether the water freezes immediately, the density of the ice accreted on the surface will vary (Dobesch et al. 2005). The heat balance calculations include the heat transfer coefficient, which is described in Makkonen (2000) as uncertain. The so-called Nusselt number is a part of the calculation of the heat transfer coefficient. A lower Nusselt number decreases the accretion efficiency. The Nusselt number and the related uncertainty in the icing model will be discussed further in section 4. For simplicity, here the accretion efficiency a 3 is calculated in the same manner for the liquid and solid water components.
The melting of ice is calculated using an energy balance equation including an ice shedding factor accounting for ice falling off during the melting. The shedding factor is a constant increasing the ice loss due to the melting by a factor of 8 and is based on an estimation by B. E. Nygaard (2017, personal communication) for ice shedding on power lines in a nonpublished study. The sublimation is calculated using the relative humidity and wind speed and utilizes Eq. (19) in Mazin et al. (2001). For wind speeds greater than 3 m s 21 and subzero temperatures, a wind erosion factor is included in the ice loss calculations as an hourly rate linearly depending on the wind speed with a factor of 10 g m 21 (m s 21 ) 21 . This means that a wind speed of 6 m s 21 results in 60 g m 21 ice loss. The contribution of the different water species to the ice accretion are fed separately into the model using their forecast concentrations from the NWP model according to the sum in Eq. (2).

c. Production loss model
The production loss model yields a production loss in percent, which is multiplied with the potential production estimated from the forecast wind to get the final production forecast. The model is described in Molinder et al. (2018). It is based on an empirical production loss matrices depending on the current ice load, icing intensity, and wind speed. The matrices were derived for one wind park from two months of data during 2010, where reliable icing measurements were available, and are used here for all wind parks.

Period and verification data
The study is based on two winter periods, here referred to as periods 1 and 2. The NWP-model was run for period 1: 20 December 2013-28 February 2014 and for period 2: 1 September 2014-12 January 2015, initializing a 42-h-long forecast at 0600 UTC, with hourly output. The forecast values from 118 to 41 h are used as the ''next day'' forecast in the evaluation of the forecast production losses.
Observations are available from four wind parks in Sweden. For contractual reasons, the wind park locations cannot be specified, but they lie between latitudes 608 and 688N at a height between 250 and 1000 m above sea level. All sites are located on a hill with lower or similar terrain to the surroundings. The number of wind turbines varies at each park, but in the verification the production of the running wind turbines was averaged to a mean single turbine in each wind park. None of the wind turbines have ice protection systems. Temperature and wind speed measured at turbine nacelle, and production data from each wind turbine were used in the verification. At some of the sites also ice measurements were conducted using the IceMonitor from Combitech AB (Combitech 2016). The ice load measurements are, however, very uncertain and were therefore not included in the verification. Instead the forecast production loss is compared against the observed production loss in order to assess the quality of the icing forecasts. The observed production losses are calculated from the ratio between observed and potential production given the observed wind speed from the nacelle. It should therefore represent the icing-related production loss alone, since the local wake and blockage effects are included in the observed wind speed. Production loss has previously been used for detection of icing in Davis et al. (2016a). However, some additional uncertainty arises in the verification, since the forecast potential production and production loss in percent assume no wake and blockage effects that can give large positive bias in the derived production losses. To limit this effect, the unbiased forecast error, or the standard deviation of the difference between the forecast and observation, is used in the verification of the production loss forecasts [STD 5 (RMSE 2 2 bias 2 ) 1/2 ].
The forecast wind speed and the observed wind speed at nacelle height were occasionally found to differ considerably during icing periods. The bias was 21.5 and 20.4 m s 21 and the wind speed RMSE was 2.3 and 1.9 m s 21 for periods 1 and 2, respectively. Note that the wind speed observations of insufficiently heated instruments are uncertain during icing events because of possibly iced instruments. Table 1 shows the wind speed bias for the four sites without icing (observed production loss 5 0) and during icing (observed production loss $ 0.5 MW), for periods 1 and 2 combined. Sites B and C show a large difference in forecast bias during icing events. Because of this uncertainty in the actual bias and RMSE of the forecast wind speed, no bias correction was performed here. Since the observed and forecast wind speed are used to calculate the observed and forecast potential production, the occasionally large difference between observed and forecast wind speed complicates the verification of the production forecast. The focus in the verification is therefore on the production loss forecast and not the resulting production forecast.

Uncertainties in the ice model
On the basis of a literature study and knowledge about the icing model, five parameters were assessed to be perturbed in the ice model ensemble: the ice shedding factor (IFP), the wind erosion factor (WE), the MVD, the Nusselt number Nu, and the sticking efficiency a 2 for snow or graupel (here called b). Two of them affect the ice growth (MVD and b), two of them affect the ice loss (WE and IFP), and one of them influences both the ice growth and the ice loss (Nu). The parameters and their estimated uncertainty distribution in terms of mean and standard deviation are described in more detail below and as shown in Table 2. A more detailed explanation of how the values of the uncertainties are estimated can be found in section 5a. Note that the icing model ensemble can easily be extended to include more uncertain parameters, if their respective uncertainties can be estimated.
During melting, the IFP describes the amount of ice that will fall off the wind turbine blade. It is included in the calculations of melting as a constant that is multiplied to the melting rate. Because of incomplete knowledge, we made a rough estimate for the uncertainty of this parameter. The mean is 8, and the standard deviation is set to 3.5.
The WE accounts for ice falling of the blades as a result of strong winds, and is given by 10 g m 21 (m s 21 ) 21 . Davis et al. (2016b) use a representative value for wind erosion that is based on statistical studies, but they emphasize that this value is uncertain. The uncertainty is not known for this parameter, but the standard deviation was estimated to be 4.4 g m 21 (m s 21 ) 21 to allow for almost 100% variation from the mean value.
MVD is used in the calculations of the collision efficiency a 1 , the accretion efficiency a 3 , and the ice density. MVD is calculated using the particle number concentration Nd for liquid or solid water content. It has been shown to have a large effect on the amount of ice that is built up but not on the start and end of an icing period (Davis et al. 2014). For the cloud droplets, there have been measurement campaigns to quantify the variation of Nd in Scandinavia (e.g., Ahmad et al. 2013) that show variations of 50-600 cm 23 , with an average of 300 cm 23 . From this, a rough estimation of uncertainty for Nd and therefore for MVD could be done. However, for the other hydrometeors, especially snow and graupel, the uncertainty is more difficult to estimate in our case. In our model, the Nd for snow and graupel is determined from the mass concentration calculated from the mixing ratio for the hydrometeors, which is an output from the NWP model (Seity et al. 2011). From this an MVD is estimated that can yield unrealistic values, since the snowflake density is not included in the calculations. Therefore, it was decided to perturb the MVD by multiplying a constant 1 with a standard deviation allowing a variation of 100% from the mean value. Note that perturbing MVD does not change the water content W in our icing model.
The Nusselt number Nu is calculated as NuC 3 Redd 0.85 , where Redd is the Reynolds number (a function of the air density, the wind speed, and the diameter of the ice) and NuC is a constant used in the calculations of the accretion efficiency a 3 ; Nu is a part of the calculation of the heat transfer coefficient. It is also used in the calculations of sublimation, or the phase change of ice to gas. The heat transfer coefficient is discussed as an uncertain part of icing modeling in Makkonen (2000). In Wang (2008) it is shown that Nu varies with different angle of attack and liquid water content. It is assumed in the standard icing model that the angle of attack is 0 since the model is based on a vertical rotating cylinder. Assuming a wind turbine blade, however, the angle of attack could vary since the ice could accumulate in various shapes. In Makkonen (2013) it is shown that a part of the formula for calculating the Nusselt number, NuC, varies from 0.032 and 0.007 with angle of attack. Wang (2008) suggests even more variation of the Nusselt number, with also a higher NuC than 0.032. Here the mean is set to 0.03 and the standard deviation of the constant is set to 0.015.
The sticking efficiency for snow and graupel, or b, differs from that for rain and cloud droplets in the icing model. In the original setup for the icing model, it is calculated with b 5 1/U C , where C is 0.75. This parameter has been discussed in, for example, Nygaard et al. (2013) as a highly uncertain part of the icing model when including snow and graupel. In ISO (2001) it is described in the following way: Estimation of the sticking efficiency b of wet snowflakes is presently quite inaccurate. The relation b 5 1/U should be seen only as a first approximation until more sophisticated methods to estimate b have been developed. This is mostly because there can be a large variation of how sticky snowflakes are, depending on the amount of liquid water content and the shape. In Nygaard et al. (2013), values for C of 1 and 0.5 were considered to be reasonable. Here we use C 5 0.75 because it is the average of the two discussed in Nygaard et al. (2013), with a standard deviation of 0.22.

a. Overview
The aim here was to use uncertainty quantification to generate a wind turbine icing model ensemble based on the assumed probability density functions, or uncertainty distributions, of the uncertain parameters in the icing model. The uncertainty quantification of the icing model then propagates uncertainty of model parameters to the overall model result (Fig. 3) and the forecast uncertainty for the power production resulting from the icing model itself is estimated.
The icing model ensemble can be generated using random sampling (RS; Monte Carlo) or deterministic sampling (DS). The sampling generator is in the latter case substituted with a deterministic rule. The essential difference is that DS gives an accurate representation of the uncertain parameter PDF, with a few optimally sampled ensemble members, while RS gives an approximate representation of PDF derived from a sufficiently large number of ensemble members. It deserves to be emphasized that there is no contradiction in representing statistical information with deterministic samples. On the contrary, any distribution of a statistical population is a deterministic representation.
An additional benefit of using DS instead of RS is the few ensemble members required, while RS may be perceived as a simpler method and thus preferable for some purposes. DS was used here to be able to later extend the uncertainty quantification to include other parts of the modeling chain without requiring higher computational and data management costs. The perceived complexity of using DS is also limited to finding a sampling rule, not in its application to a model. The sampling rule presented in section 5b can thus be recycled in other applications without requiring the more complicated part of finding a DS method. We emphasize that RS would give similar results as DS. This is more discussed in section 6b.
The DS method was chosen to represent the mean, standard deviation, and kurtosis of the uncertainty distribution of uncertain parameters described in section 4. Higher statistical moments were ignored, both because they are hard to estimate and because we assume that the effect of including them would be negligible.
Some of the uncertain parameters are bounded (e.g., negative values are not allowed). Thus the uncertainty distribution was truncated symmetrically to avoid skewness. With these limitations, the kurtosis, the fourth moment of the distribution, can differ from that of the normal distribution [kurtosis 5 M (4) 5 3]. The uncertainty distributions, the PDFs of the parameters with modified standard deviation and kurtosis, were calculated as follows: 1) A mean and standard deviation of the uncertain parameter were assumed. 2) A sample of one million random elements was drawn from a Gaussian distribution, with M (4) 5 3, with the assumed mean and standard deviation. 3) Not-allowed samples were removed from the million samples, and a new standard deviation and kurtosis was calculated.
The uncertainty distributions of the different parameters get different kurtosis using this method, between M (4) 5 2.12 and M (4) 5 2.37 for the five parameters. This was not taken into account here. Instead, the highest kurtosis from the five parameters was used as an upper bound of the uncertainty. Based on the new kurtosis, the values and weight of each ensemble member can be determined according to the theory from section 5b. Furthermore, we assume zero covariance between the different uncertain parameters. Using this chosen DS method results here in an ensemble of eight perturbed members and one control member. The number of members results from the use of Hadamard matrices further described in section 5b. The control member run uses the mean for each parameter from Table 2. To determine the size of the perturbation for each ensemble member, a multiplication factor A is calculated according to Eq. (7). A weight W for the perturbed members and a weight W 0 for the control member given in Eq. (7) are used to calculate the statistical moments from the ensemble. The ensemble members are perturbed at each time step. The sign of the perturbation of each parameter in each ensemble member is presented in Fig. 5, which is described in more detail below. The size of the perturbation is then A times the standard deviation. For this study we derive, based on Eq. (7) presented and described below, the following values for the DS approach: A ' 3/2, W ' 1/18, and W 0 ' 9/18. This type of DS provides the first and second statistical moments of the resulting probabilistic icing forecast, assuming moderate model nonlinearities. An example of a forecast with the ensemble mean production loss together with a probabilistic forecast in terms of 62 standard deviations, representing 95% of the uncertainty distribution assuming Gaussian distribution, can be seen in Fig. 4. Note that in the following sections the ensemble standard deviation will be referred to as ensemble spread since this is a commonly used expression in the meteorological community.

b. Deterministic sampling theory
The aim of the proposed version of DS is to create an ensemble S [n3m] of size m, representing the first (mean) and second statistical moments (cov) of n parameters u exactly: where the matrix D denotes a nonunique square root of the covariance matrix cov(u) and subscripts [a3b] indicate matrix size. As a first step, the covariance matrix is eigenvalue decomposed to provide an orthogonal basis U for the ensemble perturbations with its respective standard deviations S: where UU T 5 U T U 5 I and S i,j6 ¼i 5 0. Here I is the identity matrix. Different representations of the square root of the covariance matrix can be found by inserting a so-called excitation matrix V: where VV T 5 I and V Á 1 [m31] /m 5 0. The ambiguity of the DS ensemble S is here carried by the excitation matrix V [n3m] (Hessling 2013). This means that Eq. (5) holds for any excitation matrix satisfying VV T 5 I, and a specification of V is thus not unique. The insertion of UU T 5 I is included to let ensemble S resemble the matrix V as much as possible, apart from scaling with S and displacement with mean(u). The excitation matrix V [n3m] will be a generic ensemble defined by the current sampling rule and can be constructed in various ways. The only requirements are the asymmetric orthogonality and zero mean constraints described above. Its minimal size is m $ n 1 1, since 1 degree of freedom is lost to the zero mean constraint. By choosing a proper matrix V some control is gained beyond the exact representation of first and second moments. An excitation matrix V can be found from any Hadamard matrix (Pregibon 1981), all the rows of which are orthogonal but have one row/column with nonzero mean. Excluding that row and normalizing, a proper excitation matrix V HAD results. The DS ensembles have specific sizes. As an example relevant for this article, for five uncertain parameters the Hadamard matrix of size 5 3 8 is required. Five of seven zero-mean rows in the 8 3 8 Hadamard matrix are then chosen to represent perturbations of the five parameters (Fig. 5). Depending on the output quantity of interest but also model complexity, it may be important to also control higher statistical moments such as skewness and kurtosis. Limiting the ensemble to represent moments of symmetric variations, the generalized skewness described by all third moments should vanish. For matrices V consisting of only elements 61 as the Hadamard matrices, zero skewness in this case results from the zero mean constraint. Since the Hadamard ensemble has minimal kurtosis and thus has an extreme representation of ignored moments, it may be important to expand the ensemble to represent a more correct kurtosis. In practice, the kurtosis FIG. 4. Production loss observations (red) and forecast in megawatts including ensemble mean (black) and plus/minus up to 2 standard deviations as based on ensemble spread (gray areas).
is almost never accurately known. A kurtosis M (4) may often be imposed to any ensemble V [n3m] with uniform weights W 5 (1/m) [m31] , using the three-parameter approach of ensemble rescaling V / AV, weight renormalization W /WW, and adding a zero perturbed sample 0 [n31] with weight W 0 . That is, with respect to marginal moments of order e up to 4, solve for A,W, and W 0 , where 8 > < > : since V de HAD W 5 1 [n31] , and where fV de [n 3 m] g ij 5 V e ij , i 5 1, 2, . . . , n, j 5 1, 2, . . . , m. The dot in the matrix exponent denotes the Hadamard product; that is, the exponent is applied element by element. For instance, for m 5 5 and M (4) 5 3 (as for the kurtosis of the normal distribution), A 5 (3) 1/2 ,W 5 1/3 /WW 5 1/15, and W 0 5 2/3. Basically, A is the value that enlarges the parameter perturbation,WW is the weight for each ensemble member above and in the following referred to as W, and W 0 is the weight for the nonperturbed member, or the control member.

Results
The validation of the icing model itself is difficult because of unreliable icing observations. Instead, the production loss observations, production observations combined with an estimated potential production, are used as a proxy to validate the quality of the icing forecast. However, an example of forecast ice load using the ensemble mean from the deterministic sampling ensemble compared to the control member can be seen in Fig. 6. For this period four icing events occurred: three where the ensemble mean shows less and one with higher amount of icing compared to the control member. Around 27 December, the ensemble mean still has some ice left after an icing period, while the ice is already removed in the control member. The effect on the production loss of these differences depends on the current wind speed.

a. Sensitivity analysis
The sensitivity of the icing and production model to changes in the five parameters was studied using the variation of the ensemble spread. Note that the two winter periods are not assumed to represent the climatological normal, and thus the sensitivity shown here is not seen as the overall sensitivity.
By comparing the forecast spread when perturbing each of the five parameters separately, an estimation of the model sensitivity to perturbations of each parameter was obtained. In Table 3 the average ensemble spread when perturbing each of the five parameters can be seen. The Nu-perturbation sensitivity is also divided into sublimation and accretion efficiency, since this parameter affects both parts of the model. The result in Table 3 includes all forecast times from 18 to 41 h for which production losses were forecast.
The largest response on the average production loss is seen when perturbing b, Nu, and MVD. The perturbations of Nu have a larger effect on the sublimation than on the accretion efficiency. Regarding the WE and IFP perturbations, the model seems to be less sensitive to changes in these parameters on average. However, since IFP is used only during melting and there were relatively few periods with temperatures above 08, it is not FIG. 5. The excitation matrix of the Hadamard type used for the five uncertain parameters P 1-5 ; 1-8 are the eight perturbed ensemble members, 11 and 21 denote the direction of the perturbation, or 61 standard deviation times the factor A in Eq. (7). expected to affect the results much for these particular periods. Note that the high sensitivity of the icing model to perturbations in b might be a specific characteristic of this model, since ice accretion due to the frozen hydrometeors is allowed at temperatures below 08C for reasons discussed in section 2b. The response for the combination of perturbations (''All'' in Table 3) is almost the same as when the squared mean spread for the separate sensitivities is added, suggesting a linear regime. Although the perturbations of MVD, Nu, and b are combined in the nonlinear term of our icing model [Eq. (2)], it seems that the perturbed parameters act mostly independently on the icing. Thus, our assumption of neglecting cross covariance between parameter uncertainties seems to be justified.
Even though the perturbations of WE and IFP yield smaller changes to the ice loss individually than the Nusublimation part, their contribution can be occasionally more beneficial in the fully combined perturbation than using only the Nu sublimation. This can be seen in Fig. 7 in which the three individual effects on the resulting ensemble mean production loss forecast for a one-week period at one site are shown (Figs. 7, left and top-right panels) together with the combined effect (Fig. 7, bottomright panel) in comparison with the control member. In this case we see the best agreement with observations when including all ice loss perturbations. In the second half of the period, one can clearly recognize the difference between only Nu perturbations (Fig. 7, bottom left panel) and the combined perturbations (Fig. 7, bottomright panel), both around 12 January and at the end of the period, although the WE and IFP perturbations show no impact on their own (Fig. 7, top panels). For IFP and WE, it was tested whether the model sensitivity could be increased with larger perturbations. Even if a larger response on the production loss forecast could be achieved, this led occasionally to unrealistically large ice losses.
In summary, we can state that the employed icing model is most sensitive to perturbations in b, Nu, and MVD for the examined period. However, IFP and WE provide also nonnegligible impact, especially in combined perturbations. Nevertheless, the sensitivity analysis shows closely linear behavior, as the variances from Table 3 can be added to yield the variance of the combined perturbations, thus justifying the application of the uncertainty quantification method.

b. Deterministic and random sampling
The resulting ensemble mean and spread of the production loss using the deterministic sampling ensemble is compared with a random sampling ensemble using the same estimated uncertainty distributions for the uncertain parameters. This is done to illustrate the similar results from both methods. Ten thousand random sampling points are used in the random ensemble forecast. Figure 8 illustrates the two methods for one site during period 1. On average the ensemble mean production loss is slightly higher for the random sampling than for the deterministic sampling: 0.52 and 0.49 MW, respectively. For the ensemble spread the difference is almost nonexistent, with an average of 0.27 MW for both methods. Although deterministic sampling was designed to represent only the mean and standard deviation of the parameter uncertainties, it provides almost the same forecasting result in terms of ensemble mean and spread as random sampling. This means that, with only a very small effect on the results, deterministic sampling can be used to reduce the number of ensemble members for the ice model ensemble, from thousands to only nine ensemble members.
In the following, we further analyze the performance of the deterministic sampling.

c. Ensemble mean and spread
The ensemble mean can be used as a single forecast and then be compared with one deterministic forecast, here the control member, using common verification methods. An example of production loss forecast using the control member and the ensemble mean is shown in Fig. 9 (top panel). Visually, the ensemble mean as expected improves the forecast compared to the control member in this case. This is a result of the nonlinearity of the model, since in a fully linear model the impact of the perturbations would average out to zero, leaving the unperturbed forecast as the mean result.
The forecast skill in terms of RMSE for the ensemble mean and control member can be compared. The unbiased forecast error [STD 5 (RMSE 2 2 bias 2 ) 1/2 ] is used in the verification so as to exclude eventual bias errors made by the ensemble, since those errors could be corrected for given sufficiently long time series. The production loss forecast STD for the two winter periods is shown in Table 4. The improved skill using the ensemble mean forecast is seen for both winter periods and for all sites. The differences between the control member and the ensemble mean are significant in three of the four sites on a 95% confidence level. The statistical significance of the difference between the control member error and the ensemble mean error for all data was tested with a t test, assuming normal distribution and taking into account a 40-h autocorrelation of the forecast errors. On average over all sites, using the ensemble mean forecast instead of the control member reduces the STD by 12% and 21% for the two periods respectively. Both numbers are statistically significant on a 95% confidence level. The reduction of the forecast error in period 1 using the ensemble mean was found to be of similar magnitude for the icing model ensemble and as when using a single icing model with an NWP ensemble as input (as in Molinder et al. 2018).The icing model ensemble also resulted in a similar average spread for the production loss forecasts as with the NWP ensemble for all sites in period 1, stressing the importance of considering the uncertainties in the icing model.
Note that a large part of the increased forecast skill is achieved because the ensemble mean filters out less predictable features, that is, smoothing the forecast, which is one of the benefits using an ensemble forecast. The bottom panel in Fig. 9 shows the ensemble spread for the same period as the forecast shown in the top panel of Fig. 9. The spread of the ensemble should in an ideal ensemble forecast represent the forecast uncertainty and provide probabilistic estimations that can be used for risk assessments and decisions. Using the ensemble spread as an uncertainty estimation, the forecast would be more uncertain for 25 February than for 23 February, for example. The spread often increases with increasing production loss.
A common verification of the ensemble verification is the spread-skill relationship (e.g., Scherrer et al. 2004) where optimally the ensemble spread and skill of the ensemble mean forecast is equal. The spread-skill relationship as a function of the forecast length between 18 to 41 h using the production loss forecast is provided in Fig. 10. As expected, since not all uncertainties in the modeling chain are accounted for in the ensemble, the ensemble is underdispersive; that is, the ensemble spread is smaller than the skill of the ensemble mean or the ensemble forecast is overconfident. It has been shown in previous studies that combining ensembles accounting for different uncertain parts of the modeling chain increases the forecast spread and thus a perfect spread when accounting only for the ice model uncertainties would be unreasonable (Molinder et al. 2018). However, a receiver operating characteristic (ROC) diagram can show the benefit of the ensemble even though underdispersive. This kind of verification score is binary and requires a yes/no limit, for example, production losses or none. Since most sites experience ice during most of the two periods, ice/no ice would not be a reasonable limit to use here. Instead, the limit of production losses greater than 0.5 MW is used, which occurs around 20% of the time during these periods. Figure 11 displays the ROC diagram for all stations and all sites combined. It can be seen that the ensemble increases the hit rate without a similar increase of false alarm rate, meaning that the ensemble is meaningfully constructed. For a perfect ensemble, the area under the curve (AUC) score would be equal to 1. Here, the AUC score is 0.74 as compared with 0.63 if using only the control member, meaning that there is a benefit of using the ensemble information as a probabilistic forecast if 0.5 MW is used as limit. This score differs when choosing different limits, but the AUC is larger for the DS ensemble for all tested limits between 0.1 and 1.3 MW.

Summary and discussion
The uncertainties of an icing model and their impact on forecasts of icing-related production losses have been quantified. Uncertain parameters in the icing model were assessed based on literature and personal communication.
The identified parameters are the median volume diameter of the hydrometeors, the sticking efficiency for snow and graupel, the Nusselt number, the shedding factor, and the wind erosion factor. To assess the impact of these uncertainties, we employed a method of uncertainty quantification, namely, deterministic sampling. To our knowledge, this is the first time that a systematic assessment is done on how combined icing model uncertainties affect the icing forecast. For a complete uncertainty quantification in the modeling of icing-related production losses, all uncertain parts of the modeling chain have to be accounted for. This will be addressed in a future study. With deterministic sampling, a nine-member icing model ensemble can be generated with similar results as using the more commonly known method of random sampling requiring thousands of ensemble members. Deterministic sampling can thus be used to lower the computational and data management costs. Even though most icing models generally have low computational costs, the authors believe that for some purposes it FIG. 8. (top) Ensemble mean forecast of production loss (MW) using deterministic and random sampling for period 1. (bottom) The respective ensemble spread. Random sampling used 10 000 samples. Both panels show the ensemble mean using deterministic sampling (black) and random sampling (green).  can be valuable to lower the number of ensemble members without loss in quality. For a full uncertainty modeling of the entire modeling chain from Fig. 1, each ensemble member of the NWP ensemble approach requires its own icing model ensemble, leading to a multiplication of the ensemble sizes. This would cause considerable costs, if computational time is limited and fast delivery required. The deterministic sampling method presented here can also be used for other modeling purposes, such as for more complex models with higher computational costs or for parameter calibration, where an exact representation of the parameter uncertainty is beneficial.
The main conclusions of this study can be summarized as follows: 1) The icing and production loss model is more sensitive to perturbations of the median volume diameter of the hydrometeors, the sticking efficiency for snow and graupel, and the Nusselt number on average, while perturbing the wind erosion and shedding factors only have an effect on the results occasionally. The low sensitivity of the wind erosion factor in the icing model is also discussed in Davis et al. (2014). 2) The resulting ensemble mean forecasts improve the forecast skill relative to a deterministic forecast with the same settings, implying that the generation of the ensemble was successful and that the identified uncertain parameters are important for the icing model. 3) The ensemble forecast can be used to estimate forecast uncertainty and as a probabilistic forecast for decision-making but displays too-low spread values, thus signaling an overconfident forecast. We emphasize that these results are valid for the employed icing model and can be different if different icing models are used.
The low spread is expected since not all uncertain parts of the modeling chain are accounted for in this ensemble forecast. Some tests were performed (not shown here) where the deterministic sampling icing model ensemble was combined with an initial condition NWP ensemble or a neighborhood method (Molinder et al. 2018), suggesting that such a combination can further improve both the forecast skill and spread.
By using observations of icing or icing-related production losses, the ice model parameter uncertainty distributions can be tuned such that the icing model better represents the observed uncertainty (Kennedy and O'Hagan 2001). This can possibly further improve the uncertainty quantification method. FIG. 11. ROC diagram of the production loss forecasts for the threshold of 0.5 MW. The ensemble forecast is given with the black dots and line, and the blue line and marker show the control member. All sites and both periods were included. The gray area marks the area of no skill.