1. Introduction
Most data assimilation methods now used to produce the best possible initial condition for numerical weather prediction (NWP) models assume that the short-term model forecast is free of systematic errors, which are collectively referred to as bias. However, as noted by Mass et al. (2008), virtually all NWP models possess substantial systematic errors. Bias in the model makes data assimilation less effective because it violates the basic assumption that errors are random with zero mean. A biased model also leads to a biased analysis even when the observations are unbiased (Dee and Da Silva 1998). Since observations also tend to carry their own biases (e.g., Tenenbaum 1996; Wang et al. 2002; Dee and Uppala 2009), correcting the model’s bias through data assimilation is difficult even with a fair number of observations. Therefore, reducing the model’s bias is of paramount importance for robust data assimilation.
In practical data assimilation applications, observation bias becomes entangled with model bias even though the two are unrelated in nature. A good example is the variational bias correction for satellite radiance data (Derber and Wu 1998; Dee 2005), in which the model’s own state is used as an unbiased reference to uncover bias in observations. Bias in the satellite radiances, for instance, can be introduced through many processes including instrument characterization; sensor calibration; and collection, processing, and archiving of the data. If uncorrected, the observation bias degrades the value of the observation and has the potential to damage the quality of the analysis and forecast. Because the model state serves as the reference, however, the estimated observation bias is strongly affected by the model bias. For this reason, the bias correction against the model state is sometimes restricted to areas near radiosonde observations (e.g., Harris and Kelly 2001) so as to reduce the contamination due to model bias. As addressed by Kobayashi et al. (2009), it is the presence of large biases in the assimilating model that causes spurious shifts and other artifacts in the analysis, even if all assimilated observations are unbiased and correctly represented by the assimilation system.
Systematic model errors are attributed to diverse sources. Besides the bias in the initial condition, the numerical model is imperfect in its governing equations, numerics, surface forcing, and parameterizations of unresolved physical processes. Despite the extensive efforts reported in the literature to understand and reduce model bias, dealing with it remains an enduring challenge for model development. This is probably due to the large variety of possible sources of modeling errors and the nonlinear interactions among them. On the other hand, there could also be some tractable sources of model bias. If such sources exist, eradicating them makes it far less complicated to track down the remaining sources. Therefore, understanding, identifying, and eliminating obvious sources of model bias are vital in dealing with the imperfect model.
While optimizing a version of the Weather Research and Forecasting (WRF) model to be used for the Arctic System Reanalysis (ASR; Bromwich et al. 2010), we noticed inexplicable model biases in the forecast relative to verifying global analyses; the biases are persistent over the period and horizontally uniform. The biases in temperature and geopotential height appear in the upper part of the model domain and increase with altitude. We tracked down the biases and found that the initial conditions possess the same biases even against the very global analysis that was used to generate them. Since creating the initial condition from other models’ data is viewed merely as a set of interpolation procedures, the initial bias was quite surprising and led us to investigate the source of the disparity. Tracing further back, we learned that the geopotential bias is due to a particular discretization of the hydrostatic equation used in the model. The hydrostatic equation is used in the model to diagnose the specific volume (for nonhydrostatic runs) or geopotential (for hydrostatic runs). The accuracy of the diagnosed variables depends on how the integration of the hydrostatic equation is carried out. In this study, we will show how the discrete hydrostatic equation leads to a substantial model bias. The temperature bias has a different origin. The model uses potential temperature as a prognostic variable. Accordingly, the potential temperature is interpolated from global analyses to the grids of the limited-area model to provide the initial condition, and it is this interpolation that leads to a marked bias in the temperature. Given that model errors at high altitude have long been a problem for global models (e.g., Trenberth and Stepaniak 2002; Kobayashi et al. 2009) and for limited-area models (e.g., Wee and Kuo 2004), the two biases may previously have been regarded as part of those errors.
As will be described later, the above-mentioned sources are hardly viewed as problematic since the procedure and assumption that lead to the systematic errors are seemingly legitimate and reasonable. Section 2 offers a brief description of the numerical model and data used in this study. The biases in geopotential height and in temperature will be described along with suggested remedies in sections 3 and 4, respectively. The summary and conclusions will be presented in section 5.
2. Numerical model and data
The numerical model used in this study is the Advanced Research WRF (ARW; Skamarock et al. 2008; hereafter referred to as WRF model) version 3.2. The WRF is a fully compressible and nonhydrostatic model and is thus suitable for a broad spectrum of applications across a broad range of scales. The prognostic variables are column mass of dry air, three-dimensional components of wind, potential temperature, and geopotential. Diagnostic variables (e.g., temperature, pressure, and density) are derived from the prognostic variables. The model also permits hydrostatic runs. In this case, the geopotential is diagnosed through the hydrostatic equation. Although the WRF model allows forecasts to be made on a global scale (more information can be found online at http://www.ncar.ucar.edu/feature/articles/wrf_global.php), in this study it is used as a limited-area model. Accordingly, the WRF needs to obtain its initial and lateral boundary conditions from other models that cover an area wider than the WRF domain, usually a global analysis or forecast. This involves interpolating another model’s data to WRF grids, and the systematic model errors we found are pertinent to but not limited to this procedure. Hereafter the procedure is referred to as “initialization” for the sake of convenience, and it differs from the initialization techniques that adjust the initial condition of a numerical model in order to reduce initial imbalance.
The vertical coordinate used in WRF is a terrain-following hydrostatic-pressure coordinate η = (p − pt)/μ, where p is the hydrostatic pressure, ps and pt are the pressures at the model’s surface and top, respectively, and μ = ps − pt is proportional to the mass of air in a vertical column of the model domain. In this study 10 hPa is used for the pressure at the top of the model. The results presented in this study are sensitive to vertical resolution; hence, we endeavor to use a typical example. As the baseline, we pick a set of η values that has 28 full levels (hereafter L28). It is the one provided with the WRF source code as an example of η specification for tutorial purposes. To demonstrate the dependency of model bias on resolution, we also use 55 full levels (hereafter L55), where each layer of L28 is halved in depth. The depth of each layer in η for both L28 and L55 is shown in Fig. 1.
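As an illustration of the coordinate definition above, the following minimal Python sketch converts η to hydrostatic pressure and builds a refined level set by halving each layer. The evenly spaced η values and the 1000-hPa surface pressure are placeholders assumed for illustration; they are not the L28 values distributed with the WRF source code, and the function names are hypothetical.

```python
import numpy as np

P_TOP = 1000.0      # Pa; model top of 10 hPa, as stated in the text
P_SFC = 100000.0    # Pa; assumed surface pressure (illustration only)

def pressure_on_eta(eta, p_sfc=P_SFC, p_top=P_TOP):
    """Invert eta = (p - p_top) / mu, with mu = p_sfc - p_top."""
    mu = p_sfc - p_top
    return np.asarray(eta, dtype=float) * mu + p_top

def halve_layers(eta_full):
    """Insert a new full level midway (in eta) between each adjacent pair."""
    eta_full = np.asarray(eta_full, dtype=float)
    mid = 0.5 * (eta_full[:-1] + eta_full[1:])
    refined = np.empty(eta_full.size + mid.size)
    refined[0::2] = eta_full   # original levels keep their values
    refined[1::2] = mid        # new levels split each layer in two
    return refined

# Hypothetical 28 full levels from eta = 1 (surface) to eta = 0 (model top).
eta_l28 = np.linspace(1.0, 0.0, 28)
eta_l55 = halve_layers(eta_l28)

print(len(eta_l28), len(eta_l55))        # 28 55
print(pressure_on_eta([1.0, 0.0]))       # surface and top pressure in Pa
```

Halving the 27 layers bounded by 28 full levels gives 54 layers, hence the 55 full levels of the refined set.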

Vertical distribution of η depth of model layers along (a) the level index and (b) the approximate pressure. The η depth is the difference in η between two adjacent full η levels. Dashed lines are for the model with 28 vertical layers (L28) and solid lines are the model with 55 layers (L55).
Citation: Monthly Weather Review 140, 12; 10.1175/MWR-D-12-00045.1

To provide the initial condition for the WRF, we used the European Centre for Medium-Range Weather Forecasts (ECMWF) Re-Analysis Interim (ERA-Interim; Dee et al. 2011) for one month: December 2006. The data have a spectral T255 horizontal resolution, which corresponds to approximately 79-km spacing on a reduced Gaussian grid. The vertical resolution is 60 model layers on a hybrid sigma-pressure vertical coordinate with the model top located at 0.1 hPa. The ERA-Interim data also provided lateral boundary conditions for the WRF, updated at 6-h intervals. The model domain used here is a coarse-resolution replica of the ASR domain, centered at the North Pole with 30-km horizontal grid spacing. We also used mean soundings that were obtained by averaging the ERA-Interim for 10 Decembers during the years 2000–09 and over the ASR domain (hereafter referred to as climatology). The ERA-Interim data used in this study are representative of external datasets (XDS) from which the initial and lateral boundary conditions of WRF are to be obtained. Other details of the model setting that are irrelevant to this study are omitted.
3. Negative bias in geopotential height













Structure of the vertical grid of the WRF model in which solid (dashed) lines represent full (half) η levels. Geopotential is defined on full levels, while potential temperature and specific volume are defined on half levels. Pressure is available on both full and half levels.


This again implies that (6) is superior to (3) in estimating the thickness. In summary, although (2) and (5) are equivalent to each other and pose no problem mathematically, the discrete forms (3) and (6) differ in their accuracy. To demonstrate the difference between (3) and (6), a climatology of temperature (described in section 2) is interpolated to a vertical grid of the WRF model. We then integrate (3) and (6) upward starting from the lowest level to calculate the height of full model levels from the interpolated temperature on half model levels. Figure 3 shows z3 − z6, where z3 and z6 are the geopotential heights computed using (3) and (6), respectively. The difference between the geopotential heights shows that (3) leads to a significant negative bias that increases with height. Comparing the results for L28 (dashed) and L55 (solid), we observe a strong dependency on the vertical resolution.
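The contrast between the two discretizations can be illustrated with a minimal Python sketch. Because (3) and (6) are not reproduced here, the sketch assumes stand-ins for them: a layer thickness of the form R T δp/p̄, with p̄ the arithmetic-mean pressure of the layer (as would follow from ∂Φ/∂η = −αμ with α = RT/p evaluated at the half level), and a thickness of the form R T δ(ln p). It further assumes dry air, an isothermal 250-K sounding, and hypothetical levels equally spaced in ln p between 1000 and 10 hPa, so the magnitudes are indicative only and do not reproduce Fig. 3.

```python
import numpy as np

R_D = 287.0   # J kg-1 K-1, gas constant for dry air
G = 9.81      # m s-2

def heights(p_full, t_half, form):
    """Integrate upward from the lowest full level; return full-level heights (m)."""
    p_lo, p_up = p_full[:-1], p_full[1:]
    if form == "linear":     # assumed stand-in for (3): R T dp / mean(p)
        dphi = R_D * t_half * (p_lo - p_up) / (0.5 * (p_lo + p_up))
    elif form == "log":      # assumed stand-in for (6): R T d(ln p)
        dphi = R_D * t_half * np.log(p_lo / p_up)
    else:
        raise ValueError(f"unknown form: {form}")
    return np.concatenate(([0.0], np.cumsum(dphi))) / G

# Hypothetical 28 full levels, equally spaced in ln(p) from 1000 hPa to 10 hPa.
p_full = 100000.0 * 0.01 ** (np.arange(28) / 27.0)
t_half = np.full(27, 250.0)   # K, isothermal sounding on half levels

dz = heights(p_full, t_half, "linear") - heights(p_full, t_half, "log")
print(dz[[5, 15, 27]])   # negative, growing in magnitude with height
```

Under these assumptions the linear form underestimates every layer thickness, since δp/p̄ is always smaller than δ(ln p), and the deficit accumulates upward.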

Difference between geopotential heights (m) computed using two discrete hydrostatic equations: on model levels of L28 (dashed) and L55 (solid). The vertical profile of z3 − z6 is shown, where z3 and z6 are the geopotential heights computed using (3) and (6), respectively. The hydrostatic equations are integrated upward starting from the lowest level to the model top. A mean sounding of temperature (monthly mean and domain averaged) is interpolated to the model levels and then used as the input for the hydrostatic integrations.









Permillage error (m km−1) of ΔΦ3 against ΔΦ8, defined as 1000(ΔΦ3 − ΔΦ8)/ΔΦ3, as a function of δp/p. Here, ΔΦ3 and ΔΦ8 are thicknesses of a model layer in geopotential computed with (3) and (8), respectively. See text for details.
Meanwhile, the η depth used in this study is shallow near the surface and at the model top, and deeper in the middle troposphere (Fig. 1). This differs from typical practice in the past. For instance, in the era of the fifth-generation Pennsylvania State University–National Center for Atmospheric Research (PSU–NCAR) Mesoscale Model (MM5; Dudhia 1993), fine vertical resolution was usually used only for the lower part of the model so as to better resolve the planetary boundary layer, and a relatively coarse resolution was used elsewhere (e.g., Wee and Kuo 2004). As more attention is paid to the benefit of adding more layers to the upper part of the model atmosphere (e.g., Kimball and Dougherty 2006), the general trend at present is to use denser layers at high altitudes. However, there is as yet no theoretical explanation that supports this trend. While earlier studies (e.g., Zhang and Wang 2003; Kimball and Dougherty 2006) accentuate forecast sensitivity to the vertical resolution, (11) implies that poor resolution has an adverse effect in that it causes a significant model bias. A test with an MM5-type η specification showed an enormous bias in the geopotential height (not shown).
We now consider the problem in practical WRF initialization and forecast. In creating initial conditions for WRF, (3) is used to diagnose geopotential height. Unless other information is added to enhance the initial condition (e.g., data assimilation, or any physical or dynamical initialization technique that improves the balance among the model state variables), the diagnosed geopotential height is supposed to be identical to the geopotential height in the XDS. However, when the two are compared by interpolating the WRF initial condition back to the grids of the XDS or vice versa, a systematic difference appears. Figure 5a shows the bias of the WRF initial condition against the ERA-Interim analysis for L28, averaged over the domain of the WRF model and over the month-long period. The initial conditions are generated at 12-h intervals, at every 0000 and 1200 UTC. Being solely determined by the dry hydrostatic pressure on the model level as shown in (11), the bias is horizontally uniform if the undulation of η surfaces due to the model’s topography is neglected. This is why the area- and time-averaged bias of z3 shown in Fig. 5a (dashed line) is about the same as that shown in Fig. 3, for which a single climatological sounding is considered. For the same reason, the bias is by and large insensitive to the geographical location of the model domain. When (8) is used instead of (3), the bias is almost completely removed (solid line in Fig. 5a).

Comparison of bias in geopotential height (m) for the (a) initial conditions and (b) 1-day forecasts. The model levels used here are from L28. The bias is against ERA-Interim analysis with (solid) and without (dashed) the bias correction. Profiles are monthly mean and domain averaged.
The bias correction made in the initial condition by using (8) does not by itself eradicate the bias in the forecast. The WRF model allows the forecast to be made in either hydrostatic or nonhydrostatic mode at run time. In both cases, the model uses (3) to relate Φ and α while making the forecast. For the hydrostatic run, (3) is used to diagnose Φ from α in the forecast, via other prognostic variables. Conversely, for the nonhydrostatic run, (3) is used to diagnose α from the predicted Φ. As long as (3) is used in the model in one way or another, the geopotential height with respect to the pressure is biased in the same way. Even after the initial bias is removed, the bias comes back quickly in the forecast and soon reaches the same level as in the case where the correction was not made (dashed line in Fig. 5b). When (8) is used instead of (3), the 24-h forecasts show practically no bias in the predicted geopotential height (solid line in Fig. 5b). The WRF model has a tendency to produce a slight negative bias in the lower part of the model domain and a slight positive bias at stratospheric altitudes. However, their magnitudes are negligible compared with the initial bias due to the use of (3).
The biases assessed for L55 are shown in Fig. 6. A comparison of the results without the bias correction (dashed lines in Figs. 5 and 6) shows that doubling the vertical resolution reduces the bias to one-fourth of that for L28 in both the initial condition and the forecast [note the difference in scale of the model bias (x axis) between Figs. 5 and 6]. This confirms the resolution dependency of the bias presented in (11). However, increasing the resolution alone does not eliminate the bias, and a considerable bias remains in L55. In contrast, the correction (i.e., use of the revised discrete hydrostatic equation) removes this remaining bias almost completely. The small residual bias in the corrected L55 forecast reflects the total forecast bias with combined contributions from all other dynamical and physical sources as well as uncertainties in the verifying analysis. It is far smaller than the bias in the case where the correction is not made. Therefore, as long as such an overwhelming bias exists, it would be practically impossible to identify, understand, and correct other subtle sources of bias. In other words, rectifying one source of bias provides a solid foundation to study other sources as well.
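A reduction to one-fourth is what a second-order truncation error would produce when the layer depth is halved. The minimal Python sketch below checks this for an idealized case. It assumes that the original and revised discretizations can be represented by layer thicknesses of the form R T δp/p̄ (p̄ being the arithmetic-mean pressure of the layer) and R T δ(ln p), respectively, and it uses isothermal dry air on hypothetical levels rather than the L28 and L55 sets; all function names are illustrative.

```python
import numpy as np

R_D, G = 287.0, 9.81   # dry-air gas constant (J kg-1 K-1), gravity (m s-2)

def top_height_bias(p_full, temp=250.0):
    """Accumulated height difference (m) at the model top, linear minus log form."""
    p_lo, p_up = p_full[:-1], p_full[1:]
    linear = (p_lo - p_up) / (0.5 * (p_lo + p_up))   # assumed original form
    log = np.log(p_lo / p_up)                        # assumed revised form
    return R_D * temp * np.sum(linear - log) / G

def halve_layers(p_full):
    """Split each layer at its mid-pressure, i.e. at the midpoint in eta."""
    mid = 0.5 * (p_full[:-1] + p_full[1:])
    out = np.empty(p_full.size + mid.size)
    out[0::2], out[1::2] = p_full, mid
    return out

# Hypothetical coarse grid: 28 full levels from 1000 hPa to 10 hPa.
p_coarse = 100000.0 * 0.01 ** (np.arange(28) / 27.0)
p_fine = halve_layers(p_coarse)                      # 55 full levels

ratio = top_height_bias(p_coarse) / top_height_bias(p_fine)
print(round(ratio, 2))   # close to 4
```

In this idealized setting the relative error per layer scales with (δp/p̄)², so halving each layer quarters the accumulated bias without removing it.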

As in Fig. 5, but for model levels of L55.
In an idealized simulation of a hurricane, Kimball and Dougherty (2006) showed for the MM5 model that the storm’s intensity is sensitive to the distribution of vertical levels. They emphasized the importance of well-resolved stratospheric outflow in storm development and stressed the strong need for well-established guidelines for vertical resolution in the model. Their study suggests that an adequate number of model levels has to be used for the (otherwise assumed less important) stratospheric altitudes to achieve a realistic simulation of the hurricane. Our study indicates that increased vertical resolution also partly reduces the height bias of the WRF model. The result shown in Fig. 5 further suggests that the use of the revised discrete hydrostatic equation in (8) is very effective in reducing the model bias even at a lower vertical resolution.
4. Warm bias in temperature









Geometry associated with vertical interpolation of potential temperature. Solid lines represent levels of global analysis, while the dashed line in the middle denotes the level of the WRF model to which potential temperature is to be interpolated. Subscripts “u” and “l” denote upper and lower levels of the layer of the global analysis, respectively. The relative distances in pressure from the upper and lower levels to the WRF level are denoted as du and dl, respectively. Likewise, wu and wl stand for weightings given to the levels. Note that the weighting is inversely related to the distance in the linear interpolation.




This means that if δp/pu is negligibly small, the maximum error appears at the mid-pressure level of the layer. Otherwise, the location of the maximum shifts from the mid-pressure level toward the upper bound. Obviously, the error due to the interpolation vanishes when the WRF level coincides with one of the bounds since either wl = 0 or wu = 0 in (16).
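The dependence of the error on the position of the WRF level within a layer can be illustrated with a minimal Python sketch. It assumes interpolation that is linear in pressure, an isothermal layer, and a deliberately thick layer (δp/pu = 1) so that the shift of the maximum is visible; the function name and all numerical values are hypothetical and are not taken from the analysis used in this study.

```python
import numpy as np

KAPPA = 287.0 / 1004.0   # R/cp for dry air
P0 = 100000.0            # Pa, reference pressure

def theta_interp_error(p, p_l, p_u, t_l, t_u):
    """T recovered from pressure-linear theta interpolation minus pressure-linear T."""
    w_u = (p_l - p) / (p_l - p_u)    # weight given to the upper level
    w_l = 1.0 - w_u                  # weight given to the lower level
    theta = w_l * t_l * (P0 / p_l) ** KAPPA + w_u * t_u * (P0 / p_u) ** KAPPA
    t_from_theta = theta * (p / P0) ** KAPPA
    t_direct = w_l * t_l + w_u * t_u
    return t_from_theta - t_direct

p_l, p_u = 4000.0, 2000.0            # Pa; lower and upper bounds of the layer
p = np.linspace(p_u, p_l, 2001)      # candidate WRF levels inside the layer
err = theta_interp_error(p, p_l, p_u, 220.0, 220.0)

p_max = p[np.argmax(err)]
print(np.allclose(err[[0, -1]], 0.0))   # error vanishes at both bounds
print(p_max < 0.5 * (p_l + p_u))        # maximum lies on the upper-bound side
```

Within the layer the recovered temperature is warmer than the directly interpolated one, because p^(−κ) is a convex function of pressure and its linear interpolant therefore overestimates it.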
As the weights make it difficult to appreciate each term in (16), we consider the simple case in which the WRF level is located at the exact middle of a layer (i.e., wl = wu = 0.5). This can be accomplished by interpolating potential temperature from full η levels of the WRF model to half η levels and then recovering temperature at that location from the interpolated θ and p. To do so, the climatological temperature, used earlier to demonstrate the geopotential bias, is interpolated to full η levels of the WRF model and potential temperature is then computed there. The δp/pu for L28, shown in Fig. 8a, increases with height as pressure becomes lower. Figure 8b shows that the second term on the rhs of (16) evaluated for this case is much larger than the first term. Also shown in Fig. 8b is Tθ − T, where Tθ is the temperature recovered from the interpolated potential temperature and T is the temperature directly interpolated to half η levels. The Tθ − T is very close to ɛT. Figure 9 shows the dependency of the temperature bias on the resolution of the XDS. Here, the same climatological sounding of temperature is interpolated to levels of L55 (solid) and L28 (dashed) in order to mimic two different resolutions of the XDS. The result for L28 is the same as that shown in Fig. 8, but duplicated for ease of comparison. It is clear from the comparison that use of a higher-resolution XDS reduces the bias as shown in (16).

(a) Vertical profile of δp/pu on levels of L28. (b) Comparison of the first (solid) and second (dashed–dotted) terms on the rhs of (16). Heavy gray line represents the total temperature error that includes all error terms (except for the first term) on the rhs of (15), which is again equivalent to Tθ − T (see text for details). The mean sounding of temperature (monthly mean and domain averaged) interpolated to the full levels of L28 is used to evaluate the terms. The total error represents the maximum possible temperature bias for L28.

Dependency of the temperature bias on the resolution of global analysis. Here, the climatic sounding of temperature is interpolated to levels of L55 (solid) and L28 (dashed) in order to mimic two different resolutions of the global analysis. Results for L28 are as in Fig. 8, but duplicated here to highlight the resolution dependency. (a) As in Fig. 8a, but with the result for L55 included. (b) The maximum possible bias in temperature, Tθ − T.
Note that the temperature error shown here is close to the largest possible value because the interpolation is done to the half level of each layer. In actual WRF initializations, the error is smaller and its vertical structure is more intricate. Figure 10 shows the temperature biases in the initial conditions of L28 and L55 due to the θ interpolation, measured against the ERA-Interim analysis used for initialization and averaged over the model domain and over the month-long period. Unlike what is shown in Figs. 8 and 9, the bias in the initial conditions does not increase monotonically with height. Instead, noticeable zigzags appear at both resolutions, caused by the varying distance between WRF model levels and the levels of the global analysis. That is to say, if a certain level of WRF is close to one of the levels of the global analysis, the bias is small. Conversely, if the WRF model level approaches the middle of a layer of the global analysis, the bias gets bigger. Another feature worth mentioning is that the biases of L28 and L55 are about the same in size regardless of the WRF resolution. The reason is that δp/pu in (17) is a parameter of the global analysis. This again indicates that the bias is independent of the resolution of the WRF model and can be reduced by using a higher-resolution XDS. The zigzags in the bias suggest that the interpolation of the potential temperature introduces artificial oscillations in the initialized temperature. Unlike the bias in geopotential height, the temperature bias arises only in the initialization. The bias in the initial condition remains the same in the forecast (not shown). The bias is caused by interpolating potential temperature, even though potential temperature is a conserved variable in adiabatic flow and is thus preferred over temperature in many model formulations.
The bias is remedied simply by interpolating temperature and pressure individually, and then by computing potential temperature from the interpolated variables. By doing so, the warm bias in the initial condition is completely removed and the potential temperature does not possess any systematic errors.
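The remedy can be sketched in a few lines of Python. The sketch assumes interpolation that is linear in pressure and takes the pressure on the target (WRF) levels as given; the level pressures, temperatures, and function names are hypothetical and serve only to contrast the two procedures, not to reproduce the WRF preprocessing code.

```python
import numpy as np

KAPPA = 287.0 / 1004.0   # R/cp for dry air
P0 = 100000.0            # Pa, reference pressure

def theta_direct(p_target, p_src, t_src):
    """Interpolate potential temperature itself (the procedure that causes the bias)."""
    order = np.argsort(p_src)                      # np.interp needs ascending x
    theta_src = t_src * (P0 / p_src) ** KAPPA
    return np.interp(p_target, p_src[order], theta_src[order])

def theta_via_temperature(p_target, p_src, t_src):
    """Interpolate T, then compute theta from T and p on the target levels."""
    order = np.argsort(p_src)
    t_target = np.interp(p_target, p_src[order], t_src[order])
    return t_target * (P0 / p_target) ** KAPPA, t_target

p_src = np.array([5000.0, 3000.0, 2000.0, 1000.0])   # Pa, hypothetical analysis levels
t_src = np.array([215.0, 220.0, 225.0, 230.0])       # K, hypothetical temperatures
p_wrf = np.array([4000.0, 2500.0, 1500.0])           # Pa, hypothetical WRF levels

theta_fixed, t_interp = theta_via_temperature(p_wrf, p_src, t_src)
t_biased = theta_direct(p_wrf, p_src, t_src) * (p_wrf / P0) ** KAPPA
t_fixed = theta_fixed * (p_wrf / P0) ** KAPPA

print(t_biased - t_interp)               # positive at every level: warm bias
print(np.allclose(t_fixed, t_interp))    # True: no bias beyond the T interpolation
```

With the second procedure, the temperature implied by the initialized potential temperature equals the interpolated temperature by construction, so only the ordinary interpolation error of temperature remains.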

Profiles of the temperature bias (K) against the ERA-Interim analysis in initial conditions of L28 (dashed) and L55 (solid). Monthly mean and domain-averaged statistics are shown for which potential temperature is interpolated from the global analysis to the WRF grids in the initialization. Use of the temperature in the initialization does not produce any bias in the temperature (other than interpolation error) and thus is not shown here.
Citation: Monthly Weather Review 140, 12; 10.1175/MWR-D-12-00045.1

5. Summary and conclusions
In the course of the work that motivated this study, we noticed considerable systematic differences between a WRF forecast and the verifying global analysis. However, we were not sure that the biases presented in this study were real until they were also found in the validation of initial conditions. Since generating a WRF initial condition from another model's data is regarded as a set of interpolation procedures, the apparent biases in the initial condition against the very input data used for the initialization were surprising enough to lead us to look into the disparity. Even after their existence was acknowledged, it was not easy to relate the perceived biases to the actual error sources, since the procedure and assumption behind the biases [i.e., the discrete hydrostatic equation (3) and the direct interpolation of potential temperature to WRF grids] seem proper and are rarely suspected of being erroneous. Eventually, we identified and corrected two sizeable biases in the WRF model: a negative bias in the geopotential and a warm bias in the temperature, both appearing in the initial condition and in the forecast. The biases increase with height and are thus most evident in the upper part of the model domain.
The geopotential bias is caused by the discrete hydrostatic equation used in WRF, which disregards the innate vertical structure of the specific volume (or inverse density) that is unequivocally convex when viewed in the model (or pressure) coordinate. Because the equation is used not only to generate the initial and lateral boundary conditions but also in the model integration as part of the governing equations, the bias reappears quickly in the forecast even after it has been removed from the initial condition. Hydrostatic and nonhydrostatic runs show very similar biases. The magnitude of the bias is analytically found to be proportional to the square of the thickness (in height) of the model layers. For the vertical levels used in this study, the negative bias is found to be about 450 m for L28 and 120 m for L55 at the model's top. The bias is fixed by using an alternative form of the discrete hydrostatic equation. The validation against the ERA-Interim analysis illustrates that the correction almost completely removes the bias in the initial condition and in the forecast regardless of the vertical resolution used for the WRF model.
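The sign and the resolution dependence of the geopotential bias can be reproduced with a small idealized sketch. It does not implement WRF's equation (3); as a stand-in, it assumes an isothermal atmosphere, layers equally spaced in pressure, and a one-point rule that evaluates the specific volume at the mid-pressure of each layer. All names and values are hypothetical.

```python
import numpy as np

R_D = 287.0   # gas constant for dry air (J kg^-1 K^-1)
G = 9.81      # gravitational acceleration (m s^-2)

def height_midpoint(p_levels, temp):
    """Height of the top level from a one-point (mid-pressure) rule:
    g * dz = alpha_mid * dp summed over layers (an assumed stand-in)."""
    p_mid = 0.5 * (p_levels[:-1] + p_levels[1:])
    alpha_mid = R_D * temp / p_mid           # specific volume at mid-pressure
    dp = p_levels[:-1] - p_levels[1:]        # positive layer pressure depth
    return np.sum(alpha_mid * dp) / G

def height_exact(p_bottom, p_top, temp):
    """Analytic hydrostatic height for an isothermal atmosphere."""
    return R_D * temp / G * np.log(p_bottom / p_top)

def top_height_error(n_layers, p_bottom=1000e2, p_top=50e2, temp=250.0):
    """Discrete minus analytic height at the top level (m)."""
    p_levels = np.linspace(p_bottom, p_top, n_layers + 1)
    return (height_midpoint(p_levels, temp)
            - height_exact(p_bottom, p_top, temp))

err_coarse = top_height_error(28)
err_fine = top_height_error(56)
print(f"28 layers: {err_coarse:8.2f} m")
print(f"56 layers: {err_fine:8.2f} m")
print(f"ratio:     {err_coarse / err_fine:8.2f}")
```

Under these assumptions the discrete height is always too low, because a convex specific volume sampled at mid-pressure underestimates the layer mean. Doubling the number of layers reduces the error by roughly a factor of 4, in line with the quadratic dependence on layer thickness stated above.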
The WRF model uses the potential temperature as a prognostic variable. Accordingly, the potential temperature is interpolated from the grids of XDS to the WRF grids to provide the initial condition. This leads to a marked bias in the temperature due to the pressure weighting (i.e., the Exner function). Interestingly, an analytical derivation shows that the magnitude of the temperature bias depends on the vertical resolution of the XDS rather than on that of the WRF model itself. This means that use of a higher-resolution XDS for the initialization reduces the bias. The size of the bias at each half level of WRF also depends on the distance from the WRF level to the two nearest levels of the XDS. The bias grows as the WRF level approaches the middle of the layer between the two levels, and it vanishes when the WRF level coincides with one of the levels. This distance dependency introduces artificial oscillations in the model temperature. By interpolating temperature in the initialization and then computing potential temperature on the WRF grids, the temperature bias is removed.
Our approach to explaining and correcting the model bias is analytical. While the validation conducted in this study, which compares corrected and uncorrected model states with verifying global analyses, offers a demonstration, our findings are not restricted by the uncertainties associated with errors in the verification data, which are more or less unavoidable. Previous studies do not provide a consensus as to whether a bias correction can improve forecast skill in measures other than the bias. Some showed a positive impact of bias correction on random components of forecast error (e.g., Johansson and Saha 1989; Yang and Anderson 2000; Danforth et al. 2007), but others did not (e.g., DelSole and Hou 1999; Saha 1992; DelSole et al. 2008). In our test, the bias correction developed in this study showed no more than a minor improvement in reducing random errors; however, we have no doubt that the bias corrections will reduce the disparity between the forecast and observations and will eventually lead to improved analyses and forecasts in subsequent data assimilation cycles. The bias corrections might be especially beneficial for assimilating height-based observations (e.g., radio occultation data) as well as temperature measurements and satellite radiances pertinent to temperature.
Taken together, we find that the two biases stem from a common root, which is the convexity in the vertical structure of the relevant atmospheric parameters. The vertical structures of specific volume (and of pressure and density as well) in the atmosphere are strictly convex, and thus numerical integration along the vertical coordinate (i.e., the discrete hydrostatic equation) is susceptible to bias. While temperature is not convex in its structure, the associated Exner function makes potential temperature convex. Consequently, vertical interpolation of the potential temperature leads to a bias in the temperature. Larson et al. (2001) also showed that the convexity in the Kessler autoconversion formula leads to biases in certain microphysical and thermodynamic quantities. Therefore, we speculate that convexity-related bias is present in various components of the model and has not received enough attention.
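The common root can be stated compactly with a standard inequality for convex functions (the Hermite–Hadamard inequality). The symbols below are generic and are not taken from the paper's equations: f is any convex function of the vertical coordinate x, and a and b are the bounding levels of one layer.

```latex
f\!\left(\frac{a+b}{2}\right)
\;\le\; \frac{1}{b-a}\int_a^b f(x)\,dx
\;\le\; \frac{f(a)+f(b)}{2},
\qquad
f\bigl((1-\lambda)a+\lambda b\bigr) \;\le\; (1-\lambda)\,f(a)+\lambda\,f(b),
\quad 0\le\lambda\le 1 .
```

Read with f as the specific volume, the left inequality says that a single mid-layer sample underestimates the layer mean, which is consistent with a negative geopotential bias. Read with f as the potential temperature, the inequality on the right says that linear interpolation between two levels overestimates the value in between, which is consistent with a warm bias that vanishes at the levels themselves.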
WRF is freely available to the public as a community model and has numerous users worldwide in addition to the researchers who participated in its development and validation. Although the model has been used and reviewed by so many users, the biases remained unknown from the first official release of WRF in 2004 until they were revealed in this study. Therefore, we conjecture that this type of bias could be ubiquitous in NWP systems, and is more likely in models with less public involvement. Under the conditions given in this study, the geopotential bias presented here far exceeds the gross 1-day forecast bias combining all other dynamical and physical sources. Hence, it is not unreasonable to suppose that, even in other numerical models, this class of modeling errors could dominate other sources that remain unresolved, uncharacterized, or simply unknown, at least where short-term weather prediction and analysis are concerned. Rather than regarding the findings of this study as mere technical issues specific to the WRF model, we would like to stress the need to foster intense scrutiny of hidden sources of model error in the broader modeling community.
Acknowledgments
This material is based upon work supported by the National Science Foundation (NSF) Division of Arctic Sciences under Cooperative Agreement ATM-0301213/CSA ARC-0733058, by the NSF Division of Atmospheric and Geospace Sciences under Cooperative Agreement AGS-0918398/CSA AGS-0939962, and by the National Aeronautics and Space Administration under Award NNX08AN57G issued through the Earth Science Division, Science Mission Directorate. One of the authors is supported by the Global R&D Center Program (GRDC) of the National Research Foundation of Korea (NRF) sponsored by the Korea Ministry of Education, Science and Technology (MEST). We thank the ECMWF and the Data Support Section of NCAR for providing the ERA-Interim data.
REFERENCES
Adcroft, A., R. Hallberg, and M. Harrison, 2008: A finite volume discretization of the pressure gradient force using analytic integration. Ocean Modell., 22, 106–113.
Bjerknes, V., and J. W. Sandström, 1910: Dynamic meteorology and hydrography. Part I: Statics. Publ. 88, Carnegie Institute, Washington, DC, 146 pp.
Bromwich, D. H., Y.-H. Kuo, M. Serreze, J. Walsh, L.-S. Bai, M. Barlage, K. Hines, and A. Slater, 2010: Arctic System Reanalysis: Call for community involvement. Eos, Trans. Amer. Geophys. Union, 91, 13–14.
COESA, 1976: U.S. Standard Atmosphere, 1976. NOAA, 227 pp.
Danforth, C. M., E. Kalnay, and T. Miyoshi, 2007: Estimating and correcting global weather model error. Mon. Wea. Rev., 135, 281–299.
Dee, D. P., 2005: Bias and data assimilation. Quart. J. Roy. Meteor. Soc., 131, 3323–3343.
Dee, D. P., and A. M. Da Silva, 1998: Data assimilation in the presence of forecast bias. Quart. J. Roy. Meteor. Soc., 124, 269–295.
Dee, D. P., and S. Uppala, 2009: Variational bias correction of satellite radiance data in the ERA-Interim reanalysis. Quart. J. Roy. Meteor. Soc., 135, 1830–1841.
Dee, D. P., and Coauthors, 2011: The ERA-Interim reanalysis: Configuration and performance of the data assimilation system. Quart. J. Roy. Meteor. Soc., 137, 553–597.
DelSole, T., and A. Y. Hou, 1999: Empirical correction of a dynamical model. Part I: Fundamental issues. Mon. Wea. Rev., 127, 2533–2545.
DelSole, T., M. Zhao, P. A. Dirmeyer, and B. P. Kirtman, 2008: Empirical correction of a coupled land–atmosphere model. Mon. Wea. Rev., 136, 4063–4076.
Derber, J. C., and W.-S. Wu, 1998: The use of TOVS cloud-cleared radiances in the NCEP SSI analysis system. Mon. Wea. Rev., 126, 2287–2299.
Dudhia, J., 1993: A nonhydrostatic version of the Penn State/NCAR Mesoscale Model: Validation tests and simulations of an Atlantic cyclone and cold front. Mon. Wea. Rev., 121, 1493–1513.
Duthie, W. D., 1946: Numerical integration of the hydrostatic equation. J. Meteor., 3, 89–94.
Harris, B. A., and G. Kelly, 2001: A satellite radiance-bias correction scheme for data assimilation. Quart. J. Roy. Meteor. Soc., 127, 1453–1468.
Johansson, A., and S. Saha, 1989: Simulation of systematic error effects and their reduction in a simple model of the atmosphere. Mon. Wea. Rev., 117, 1658–1675.
Kimball, S. K., and F. C. Dougherty, 2006: The sensitivity of idealized hurricane structure and development to the distribution of vertical levels in MM5. Mon. Wea. Rev., 134, 1987–2008.
Kobayashi, S., M. Matricardi, D. P. Dee, and S. Uppala, 2009: Toward a consistent reanalysis of the upper stratosphere based on radiance measurements from SSU and AMSU-A. Quart. J. Roy. Meteor. Soc., 135, 2086–2099.
Larson, V. E., R. Wood, P. R. Field, J.-C. Golaz, T. H. Vonder Haar, and W. R. Cotton, 2001: Systematic biases in the microphysics and thermodynamics of numerical models that ignore subgrid-scale variability. J. Atmos. Sci., 58, 1117–1128.
Mass, C. F., J. Baars, G. Wedam, E. Grimit, and R. Steed, 2008: Removal of systematic model bias on a model grid. Wea. Forecasting, 23, 438–459.
Saha, S., 1992: Response of NMC MRF Model to systematic-error correction within integration. Mon. Wea. Rev., 120, 345–360.
Skamarock, W. C., and Coauthors, 2008: A description of the Advanced Research WRF version 3. NCAR Tech. Note NCAR/TN-475+STR, 125 pp. [Available online at http://www.mmm.ucar.edu/wrf/users/docs/arw_v3.pdf.]
Tenenbaum, J., 1996: Jet stream winds: Comparison of aircraft observations with analyses. Wea. Forecasting, 11, 188–197.
Trenberth, K. E., and D. P. Stepaniak, 2002: A pathological problem with NCEP reanalyses in the stratosphere. J. Climate, 15, 690–695.
Wang, J., H. L. Cole, D. J. Carlson, E. R. Miller, K. Beierle, A. Paukkunen, and T. K. Laine, 2002: Corrections of humidity measurement errors from the Vaisala RS80 radiosonde—Application to TOGA COARE data. J. Atmos. Oceanic Technol., 19, 981–1002.
Wee, T.-K., and Y.-H. Kuo, 2004: Impact of a digital filter as a weak constraint in MM5 4DVAR: An observing system simulation experiment. Mon. Wea. Rev., 132, 543–559.
Yang, X.-Q., and J. L. Anderson, 2000: Correction of systematic errors in coupled GCM forecasts. J. Climate, 13, 2072–2085.
Zhang, D.-L., and X. Wang, 2003: Dependence of hurricane intensity and structure on vertical resolution and time-step size. Adv. Atmos. Sci., 20, 711–725.