## Abstract

In land data assimilation, bias in the observation-minus-forecast (*O* − *F*) residuals is typically removed from the observations prior to assimilation by rescaling the observations to have the same long-term mean (and higher-order moments) as the corresponding model forecasts. Such observation rescaling approaches require a long record of observed and forecast estimates and an assumption that the *O* − *F* residuals are stationary. A two-stage observation bias and state estimation filter is presented here, as an alternative to observation rescaling that does not require a long data record or assume stationary *O* − *F* residuals. The two-stage filter removes dynamic (nonstationary) estimates of the seasonal-scale mean *O* − *F* difference from the assimilated observations, allowing the assimilation to correct the model for subseasonal-scale errors without adverse effects from observation biases. The two-stage filter is demonstrated by assimilating geostationary skin temperature *T*_{skin} observations into the Catchment land surface model. Global maps of the estimated *O* − *F* biases are presented, and the two-stage filter is evaluated for one year over the Americas. The two-stage filter effectively removed the *T*_{skin }*O* − *F* mean differences, for example, the Geostationary Operational Environmental Satellite (GOES)-West *O* − *F* mean difference at 2100 UTC was reduced from 5.1 K for a bias-blind assimilation to 0.3 K. Compared to independent in situ and remotely sensed *T*_{skin} observations, the two-stage assimilation reduced the unbiased root-mean-square difference (ubRMSD) of the modeled *T*_{skin} by 10% of the open-loop values.

## 1. Introduction

Within the context of data assimilation, “bias” refers to errors in modeled or observed variables that persist over time and/or space. Standard bias-blind data assimilation methods are based on the assumption that neither the forecast model nor the observations are biased, and these methods will produce suboptimal output in the presence of bias (Dee and Da Silva 1998). Unfortunately, the forecast models and observation datasets used in Earth system applications, including for the land surface, typically are biased (Dee and Todling 2000; Reichle et al. 2004). Observation biases can arise from errors in the observing instrument and its calibration, the observation operator, or the retrieval model, as well as representativity errors between the observed state variables and their modeled counterparts. Likewise, forecast biases can arise from errors in the forecast model structure, parameters, initial conditions, and forcing.

Ideally, the cause of observation and forecast biases should be diagnosed and treated at the source. However, where this is not possible, these biases can also be addressed in data assimilation by applying an observation bias correction prior to assimilation (e.g., Harris and Kelly 2001) or by using a bias-aware assimilation system explicitly designed to correct either observation biases (e.g., Auligné et al. 2007; Fertig et al. 2009) or forecast biases (e.g., Dee and Todling 2000; Keppenne et al. 2005). Bias correction methods require that the bias be observable (Dee and Da Silva 1998), and the ocean and atmosphere examples cited above measure the biases against confident estimates of the true mean state, typically obtained with reference to point-based observations (e.g., ocean buoys and radiosondes). However, the land surface is much more heterogeneous than the ocean and atmosphere, and point-based in situ observations are in general not representative of the coarse-resolution states estimated by remote sensors and land surface models (Crow et al. 2012). Consequently, for large domains, the true mean land surface states are unknown, since there are large systematic differences between the mean (and variance) of different observed and modeled land surface datasets, none of which can in general be identified as having statistics representative of the true state (Reichle et al. 2004).

Since observation and forecast biases cannot be observed for land surface states, it is standard practice to remove the systematic differences between the observed and forecast estimates from land data assimilation, usually by rescaling the observations to be consistent with the long-term mean (and variance, and sometimes higher-order moments) of the forecasts (e.g., Reichle and Koster 2004; Drusch et al. 2005; Scipal et al. 2008; Crow et al. 2011). This prevents these systematic differences from adversely impacting the model state, while satisfying the minimum criterion for optimal bias-blind data assimilation that there be no difference between the mean values of the observed and forecast estimates. The assimilation can then correct the model for random errors that develop during each forecast, where “random errors” are errors persisting over time scales much shorter than the assumed bias time scale. Data assimilation with observation rescaling has been shown to yield land surface estimates that are superior to modeled or observed estimates alone (Slater and Clark 2006; Reichle et al. 2007; Ghent et al. 2010; Crow et al. 2011; Draper et al. 2012; De Lannoy et al. 2012; de Rosnay et al. 2013). This rescaling approach is often referred to as “observation bias correction,” although strictly speaking, it is not the observation bias (defined against the true mean state) that is corrected, but the lumped observation bias minus forecast bias.

The long data record of observed and forecast state estimates required for determining observation rescaling coefficients has slowed the implementation of land data assimilation in large-scale applications, particularly within atmospheric systems, which are frequently updated and yet prohibitively expensive to replay over long periods. Consequently, Dharssi et al. (2011) and de Rosnay et al. (2013) identify the difficulty in obtaining observation rescaling coefficients as one cause of the limited impact of assimilating remotely sensed soil moisture observations into atmospheric models. The long data record requirement also prevents the assimilation of new remotely sensed datasets and necessitates costly reprocessing of the rescaling parameters after significant updates to assimilated datasets.

Consequently, this manuscript presents a method for removing the observation-minus-forecast (*O* − *F*) bias from the observations in a land data assimilation system, without access to a long data record. This is achieved by using a two-stage observation bias and state update estimation filter. The term “bias” is subjective, with regards to the temporal and spatial scales over which it applies. In seeking a bias correction method that does not require a long data record, the bias is necessarily defined over shorter time scales, and the presented two-stage filter dynamically estimates nonstationary *O* − *F* biases that evolve at seasonal time scales.

There are typically large systematic differences between remotely sensed and modeled skin temperature *T*_{skin} (Ghent et al. 2010; Wang et al. 2014), and if not adequately addressed, these differences will result in a suboptimal assimilation, potentially leading to degraded flux forecasts (e.g., Reichle et al. 2010). Hence, the two-stage observation bias and state estimation scheme has been demonstrated here by assimilating geostationary *T*_{skin} observations into the Catchment land surface model.

The remainder of this manuscript is outlined as follows. In section 2, the two-stage observation bias and state estimation scheme is developed and contrasted to observation rescaling approaches. The two-stage filter is then demonstrated with an example assimilation of remotely sensed skin temperature observations into a land surface model. The *T*_{skin} assimilation experiments are outlined in section 3, before the results are presented in section 4. Finally, section 5 presents a summary and conclusions.

## 2. The state and bias filter equations

The two-stage observation bias and state estimation approach introduced here is based on the online two-stage forecast bias and state estimation approach of Dee and Da Silva (1998), which has been successfully implemented in atmosphere (Dee and Todling 2000), ocean (Chepurin et al. 2005; Keppenne et al. 2005), and land (Bosilovich et al. 2007; De Lannoy et al. 2007; Reichle et al. 2010) data assimilation. Following Friedland (1969), Dee and Da Silva (1998) decouple the forecast bias estimation from the state update and use a separate Kalman filter to estimate the forecast bias. The bias-blind state update innovations (i.e., the *O* − *F* residuals) are used to measure the forecast bias for the bias update, based on the assumption that the observations are unbiased, and persistence is used to predict the forecast bias. Pauwels et al. (2013) recently extended the theory of the two-stage forecast bias and state estimation filter to simultaneously estimate separate observation and forecast biases. In their approach, demonstrated with synthetic experiments, the bias-blind state update innovation measures the observation bias plus the forecast bias, which is partitioned into the separate bias terms by calibration. However, observations of the true mean state are ultimately required to partition the sum of the biases (Dee and Da Silva 1998).

In contrast to Dee and Da Silva (1998) and Pauwels et al. (2013), we derive the two-stage filter as if to estimate the observation biases measured using the bias-blind state update innovations based on the assumption that the forecasts are unbiased. However, in the intended land data assimilation applications, the forecasts are almost certainly biased, so that the estimated “observation bias” really represents the lumped observation bias minus forecast bias. The estimated bias is then used to adjust the observations to have the same mean value as the forecast estimates, which is consistent with observation rescaling approaches.

Below, the bias-free ensemble Kalman filter (EnKF) equations are reviewed (section 2a) before the optimal solution for the two-stage observation bias and state estimation filter is derived (section 2b). Then, a parameterization of the Kalman gain for the bias update is introduced, to avoid specifying the unknown prior observation bias uncertainty (section 2c).

### a. The bias-free EnKF

The bias-free EnKF, as implemented by Reichle et al. (2014) for land data assimilation, consists of a model forecast step and a state update step. For the *i*th ensemble member, the state forecast and update at the *k*th assimilation time are

and

where **x** is the model state vector, is the forecast model, **q** represents the model error (or perturbation vector), is the Kalman gain matrix, **y**^{o} is the observation vector, is the observation operator, and *υ* is an applied (zero mean, normal) perturbation representative of the expected observation errors. For simplicity, we assume to be linear; however, the theory is unchanged if this assumption is relaxed. Throughout this manuscript, a superscripted state vector indicates an estimated value, with the minus and plus superscripts indicating the prior and posterior estimates, respectively. In contrast, the absence of a superscript for a state variable indicates the true state vector.

In a bias-free EnKF, the errors in **x**^{−} and **y**^{o} are assumed to have vanishing long-term means and to be uncorrelated with each other. Under these assumptions, **x**^{+} provides an unbiased estimate of **x**, and the optimal (minimum posterior state error variance) Kalman gain for the *k*th state update _{k} is given by

where ^{x−} is the prior model state error covariance matrix and ^{o} is the observation error covariance matrix. The ensemble spread is used to diagnose ^{x−}, while for land data assimilation ^{o} is typically assumed to be constant in time and have zero off-diagonal terms (e.g., Draper et al. 2012). Applying the above equations in the presence of (unknown) observation and/or forecast biases is suboptimal and is referred to as bias-blind data assimilation (Dee and Da Silva 1998).

### b. The two-stage observation bias and state estimation

For an observation-bias-aware assimilation, the observation vector is allowed to have a nonzero mean error persisting over some extended time period (a bias). The biased observations, written , can be partitioned into the bias term _{k} and the remaining zero-mean error component :

The observations are then bias corrected within the state update [Eq. (2)] to remove the bias from the observations, giving an unbiased estimate of **x**^{+}:

where is the Kalman gain for the state update based on the bias-corrected observation vector.

A separate, discrete Kalman filter is then used to estimate the observation bias. The observation bias is measured using the mean *O* − *F *. The bias is initialized at zero, and persistence is used as the bias prediction model, since the bias is assumed not to change significantly during individual assimilation cycles. The persistence model is recognized as an approximation, since a (potentially desirable) feature of the two-stage filter is the nonstationary nature of the bias estimates. The observation bias forecast and update equations for the *k*th assimilation time are then written:

and

where _{k} is the Kalman gain for the bias update. Equations (7) and (8) provide an unbiased estimate of the observation bias, regardless of the selection of _{k}. Appendix A shows that if the errors in the observations, the prior bias estimate, and the prior state estimate are not correlated with each other, and if provides an unbiased estimate of the observation bias, the optimal (minimum error covariance) posterior bias estimate is obtained with

Here ^{o} is unchanged from Eq. (4) and represents the random errors in the observations only, while is the random error covariance matrix for the prior observation bias estimate.

Substituting the best estimate of the bias [; Eq. (8)] into Eq. (6) then gives the state update equation with observation bias correction:

Up to this point, the presented derivation of the two-stage observation bias and state estimation equations has followed that of Pauwels et al. (2013), with their forecast bias set to zero. However, we now diverge from their approach. In appendix B, we show that if the optimal expression for is used [Eq. (9)], in Eq. (10) is the same as _{k} for the bias-free filter [Eq. (4)]. That is, the Kalman gain is unchanged by the inclusion of the two-stage observation bias estimate in the state update equation. This result parallels that of Dee and Todling (2000), who show that, for the online two-stage forecast bias and state estimation filter, the state update Kalman gain is unchanged by the inclusion of the forecast bias estimate in the state update equation.

To summarize the two-stage observation bias and state estimation filter equations presented above, Eqs. (1) and (10) are used for the state forecast and update, respectively, together with the state update Kalman gain of Eq. (4). Equations (7) and (8) are used for the observation bias forecast and update, respectively, together with the bias update Kalman gain of Eq. (9) [although Eq. (9) will be replaced by an empirical function in section 2c]. For illustrative purposes, substituting Eq. (8) into Eq. (10), then taking the ensemble average, gives

and

where is the Identity matrix. Comparing Eq. (12) to Eq. (8) for the bias update demonstrates how the two-stage filter partitions the into updates to the bias estimate and state estimate.

The presented two-stage observation bias and state estimation filter parallels the online two-stage forecast bias and state estimation of Dee and Da Silva (1998) but differs from the original two-stage estimation approach of Friedland (1969) in that the state update equation is optimized with the bias correction terms included [i.e., the Kalman gain is obtained by optimizing Eq. (10), rather than Eq. (2)]. The resulting two-stage filter is optimal if the various assumptions stated above hold. However, in practice, the filter is unlikely to be optimal, since, for example, the prior state errors and the prior observation bias errors have been assumed uncorrelated, yet both contain information (and errors) from past observations.

### c. Parameterization of the bias gain

The two-stage observation bias correction and state estimation approach outlined above requires the specification of the unknown error covariance matrix ^{b−} for the prior bias estimate to calculate the observation bias update Kalman gain in Eq. (9). Dee and Da Silva (1998) and Pauwels et al. (2013) assumed that the prior forecast bias error covariances were proportional to the prior forecast error covariances, and Pauwels et al. (2013) assumed that the prior observation bias error covariances were proportional to the forecast observation error covariances. We instead replace with an empirical function. This approach is made possible in this study because we do not require ^{b−} for the bias-aware state update Kalman gain, because of the equivalence of the bias-free and bias-aware Kalman gains noted in section 2b.

For the assimilation of a single observation type at a single location, _{k} becomes scalar. For the assimilation of the *j*th location and observation type, we approximate _{j,k} with a function designed to approach one as the time since the last assimilated observation increases:

where Δ*t*_{j,k} is the number of time steps since the most recent observation of type *j* was assimilated and *τ*_{j} is a user-defined parameter representing the *e*-folding time scale of the bias memory for observation type *j*. This function was chosen since it approximates the expected behavior of _{j,k} under two important scenarios. In the first scenario, no observations have been recently assimilated, relative to the assumed time scale of the bias, and there is little information with which to predict . Hence, *λ*_{j,k} is expected to be close to one, as would be predicted by Eq. (13) for large Δ_{j,k}/*τ*_{j}. In the second scenario, observations are being assimilated with some regularity, and random errors in will be dominated by random errors in the sequence used to update (since by definition the persistence model will not introduce significant errors into the bias estimate). However, the bias filter will gradually filter these random errors over time. Hence, if Δ*t*_{j,k} is assumed to generalize the recent availability of observations, Eq. (13) will approximate the increased certainty in (and subsequent reduction expected in _{j,k}) as observations are assimilated more frequently.

The empirical *λ*_{j,k} must adequately account for the first scenario described above, of no recent observations, since from Eq. (12) a large _{k} is necessary in this case to prevent the potentially large errors from being propagated into the model state vector. This situation can occur reasonably regularly, since there are often seasonal-scale gaps in land surface observation records, when atmospheric and/or land surface conditions prevent remote sensing of the land surface. Note the contrast to forecast bias correction, for which one can fall back on a conservative approach of underestimating the forecast bias when the bias estimate is highly uncertain (Dee and Todling 2000; Reichle et al. 2010), since the model state will still be updated toward the true state (defined by the observations in this case).

For the assimilation of multiple observation types and locations, *λ*_{j,k} can be extended in the obvious way to a matrix **Λ**_{k} by setting the *j*th diagonal element of **Λ**_{k} to *λ*_{j,k} and setting the off-diagonal terms to zero (i.e., disregarding potential spatial correlation, or cross correlation between observation types, in the bias updates). A potential weakness of the above parameterization of *λ*_{j,k} is that a estimate based on a single recent observation would be assigned high confidence. Consequently, observations are excluded from the state update when the bias estimate is based on less than two observations within the last *τ*_{j}/2 time steps (although these observations are still used to update ).

### d. Comparison to observation rescaling

The two-stage observation bias and state estimation method presented above treats the systematic differences between observations and forecasts quite differently compared to the observation rescaling methods currently used in many land data assimilation systems. Observation rescaling (Reichle and Koster 2004; Drusch et al. 2005; Scipal et al. 2008; Crow et al. 2011) is designed to remove the long-term systematic differences in the mean and variance (and possibly higher-order moments) of the observed and forecast state estimates, where “long term” is defined by the length of the data record used to calculate the rescaling parameters. These systematic differences are typically assumed to be stationary, and a static set of bias correction parameters is used. Consequently, a bias-free data assimilation with observation rescaling will then adjust the model states to reduce residual differences between the observations and model forecasts. Such differences include those occurring at subseasonal time scales, differences in the phase of the seasonal cycle, and also interannual differences in the seasonal cycle, if the data record used to estimate the rescaling coefficients was sufficiently long to sample the climatological interannual variability.

In contrast, the two-stage observation bias and state estimation method presented here is designed to remove only the systematic difference in the mean of the observed and forecast state estimates, and this mean difference is not restricted to being stationary. The filter dynamically estimates the bias in the *O* − *F* residuals based only on measurements up to the current assimilation cycle, with greater weight placed on more recent measurements. The filtered bias estimates are nonstationary and will evolve at a time scale determined by the *τ* parameter in Eq. (13). Specifying *τ* to represent seasonal time scales will result in the observations being adjusted to match the seasonal cycle of the forecast estimates. The assimilation will then adjust the model state vector to reduce differences between the observations and forecasts at subseasonal time scales, somewhat consistent with the observation rescaling approach. Although systematic differences in the variance of the observations and forecasts are not explicitly removed, as they are in observation rescaling, the component of variance due to seasonal, or longer, time scale dynamics will be addressed. Finally, while the filter estimates are based on a small number of measurements, by explicitly accounting for errors the filter can quickly converge to reasonably confident bias estimates.

For a given data assimilation experiment, the suitability of the two-stage filter depends on the distribution of the systematic differences between the observed and forecast estimates. For *T*_{skin}, there can be large differences between the mean values of different model forecast and observed estimates (Wang et al. 2014); however, *T*_{skin} variability is reasonably well constrained, due in part to the tight coupling between *T*_{skin} and the (comparatively well observed) low-level atmospheric temperature. Hence, using the two-stage observation bias and state estimation to adjust the seasonal cycle of the observed *T*_{skin} to match that of the forecast estimates is expected to effectively address the systematic differences between observed and forecast *T*_{skin} in an assimilation. However, for many other land surface variables, this approach may not be sufficient. Most notably, for near-surface soil moisture, there are large systematic differences between the variability of different datasets, including their subseasonal-scale variability [e.g., see Draper et al. (2013), their Fig. 2]. This is due in part to the absence of global datasets constraining the possible soil moisture range and the subsequent uncertainty in the parameters controlling the soil moisture response to atmospheric forcing (specifically controlling the total volume of pore space available for water storage in the soil column).

## 3. Skin temperature assimilation

The two-stage observation bias and state estimation scheme has been demonstrated by assimilating geostationary *T*_{skin} observations into the Catchment land surface model. Two separate assimilation experiments were performed. First, the *T*_{skin} data were assimilated over the Americas at 0.3125° × 0.25° longitude by latitude resolution, from 1 June 2012 to 31 May 2013. Second, to obtain example global maps of the *O* − *F **T*_{skin} biases, the *T*_{skin} data were assimilated globally, at a coarser resolution of 0.625° × 0.50°, from 1 May to 1 August 2012.

### a. Catchment land surface model

Catchment (Koster et al. 2000) is the land surface modeling component of the Goddard Earth Observing System, version 5 (GEOS-5; Rienecker et al. 2008). The Catchment model variable equivalent to remotely sensed *T*_{skin} is the surface temperature *T*_{surf}, defined as the average temperature of the canopy and soil surface, and representative of an arbitrarily thin layer separating the canopy and soil surface from the atmosphere. While the Catchment *T*_{surf} is prognostic, it has a very short memory over most land surface types because of its very low surface specific heat capacity (200 J K^{−1} m^{−2}, except for broadleaf evergreen vegetation). The assimilation experiments were performed offline (i.e., decoupled from the atmospheric model), using hourly meteorological forcing data from the National Aeronautics and Space Administration (NASA) Modern-Era Retrospective Analysis for Research and Applications (MERRA; Rienecker et al. 2011), Catchment model parameters from the routine GEOS-5, and a 15-min model time step. The initial land surface state was spun up from an archived GEOS-5 restart file on 1 January 2000 by integrating the model forward (without perturbations) to 1 January 2012, and the model ensemble was then spun up from 1 January 2012 to the start of the assimilation on 1 June 2012.

### b. Geostationary skin temperature data

The assimilated *T*_{skin} observations are retrieved from geostationary thermal infrared (TIR) brightness temperature observations at the NASA Langley Research Center (Scarino et al. 2013). The *T*_{skin} data are retrieved every 3 h and reported on the 0.3125° × 0.25° GEOS-5 model grid. The geostationary data have been produced in near–real time since 2011, from a constellation of satellites providing global (53°S–53°N, after quality control) coverage: Geostationary Operational Environmental Satellites (GOES)-East, GOES-West, the *Multifunctional Transport Satellite-2* (*MTSAT-2*), *Fengyun-2E* (*FY-2E*), and *Meteosat-9* (*Met-9*). However, for the assimilation experiment over the Americas domain, an updated dataset from GOES-East and GOES-West, produced with the latest retrieval model, has been used. Where observations are available from more than one geostationary satellite, only the observations from the closest satellite were assimilated. The observation quality control discards observations with a viewing zenith angle greater than 60°, a solar zenith angle between 83° and 90°, a gridcell cloud fraction above 20%, or if the land modeling system indicates precipitation or a snow-covered surface.

Figure 1 shows the coverage of the observation quality-controlled (GOES-West and GOES-East) *T*_{skin} observations assimilated in the Americas experiment, as a fraction of the total number of possible observation times (eight 3-hourly observation times per day). There are few observations available during colder periods, due mostly to increased cloudiness. Hence, the coverage is very low (<15% of the maximum possible coverage) at higher latitudes. The coverage is also low over the Amazon, again due to cloudiness. There is some diurnal variation in the coverage, with slightly more observations available during the daytime hours (10% more than nighttime). In section 4, evaluation statistics are only reported at locations where observations were assimilated for at least 7.5% of the possible observation times at each time of day (~30 observations).

### c. Assimilation system

The state update component of the two-stage filter is an EnKF (Reichle et al. 2014) that uses a spatially distributed analysis update, 12 ensemble members, and 3-hourly assimilation of the *T*_{skin} observations. The assimilation update vector consists of *T*_{surf} and the ground heat content (GHT1) associated with the near-surface (0–10 cm) soil temperature. The ensemble was generated as in the bias-blind *T*_{skin} assimilation experiments of Reichle et al. (2010), adapted for our inclusion of the top-layer ground heat content in the state update vector. GHT1 was included in response to a change in the definition of *T*_{surf} in the Catchment model. In the model version used by Reichle et al. (2010), *T*_{surf} represented a 5-cm layer depth, while in the current version, the *T*_{surf} specific heat capacity is much lower, so that *T*_{surf} represents an arbitrarily thin layer, and the temperature of the top soil layer is instead modeled by GHT1. The dominant forcing inputs and the model prognostic variables in the state update vector were perturbed as outlined in Table 1. The forcing perturbations were applied hourly, while the prognostic variables were perturbed at each model time step. Depending on the variable, normally distributed additive perturbations or log-normal distributed multiplicative perturbations were applied. Time series correlations were incorporated into the applied correlations using a first-order autoregressive model, and cross correlations between the perturbations to different fields were based on the expected balance between radiation, clouds, and air temperature. Finally, the observation error standard deviations for the *T*_{skin} retrievals were set at 1.3 and 2.1 K during the nighttime and daytime, respectively, which implies that the model forecasts and observations have roughly similar skill.

The Catchment model divides each model grid cell into multiple computational elements, and the assimilation was performed with a spatially distributed (3D) analysis update filter [with nonzero horizontal forecast and observation error correlations (Reichle and Koster 2003)] to ensure that information from the observations is spread to all computational elements within the surrounding grid cell. For both the observation errors and the (forcing and state vector) ensemble perturbations in Table 1, relatively short horizontal error correlation lengths with an *e*-folding distance of 0.17° were applied. Note that preliminary experiments with increased horizontal error correlation scales (between 0.5° and 1.0°) degraded the assimilation results, likely because the strong dependence on cloud cover limits the horizontal error correlations of estimated *T*_{skin}.

The observation bias update was performed independently at each model grid cell (i.e., using a 1D filter). Since there is a strong diurnal cycle in the *O* − *F* mean difference (as will be shown in section 4), the observation bias was modeled separately at each of the eight diurnal observation times, following Reichle et al. (2010).

### d. Evaluation of assimilation output

The results of the assimilation experiment over the Americas have been evaluated by comparison to independent observations of clear-sky *T*_{skin}, from the in situ Surface Radiation Network (SURFRAD) (Augustine et al. 2005), and from remotely sensed Moderate Resolution Imaging Spectroradiometer (MODIS) TIR observations. The six SURFRAD sites shown in Fig. 1 were used (Fort Peck was excluded since the geostationary satellite viewing zenith angle exceeds 60° there). For each of the validation sites, 3-hourly *T*_{skin} were calculated from the SURFRAD upwelling and downwelling TIR radiation observations using the Stefan–Boltzmann equation and broadband emissivity calculated using Wang et al. (2005) from MODIS/*Terra* Land Surface Temperature (LST)/Emissivity (E) Monthly L3 Global 0.05Deg climate modeling grid (CMG) (MOD11C3) narrowband emissivities. For MODIS *T*_{skin} observations, MODIS/*Aqua* LST/E Daily L3 Global 0.05Deg CMG (MYD11C1) and MODIS/*Terra* LST/E Daily L3 Global 0.05Deg CMG (MOD11C1) daily clear-sky *T*_{skin} data have been averaged up to the 0.3125° × 0.25° GEOS-5 model grid. Each is assumed to occur at the geostationary observation time closest to the median MODIS observation time over the domain for each satellite orbit direction.

The skill of the *T*_{skin} assimilation experiment in predicting each of the independent datasets has been compared to the skill of an open-loop (no data assimilation) ensemble, generated with the same modeling perturbations as used for the assimilation experiment. For both cases, instantaneous model *T*_{surf} is compared to the independent *T*_{skin} observations at times for which geostationary *T*_{skin} observations are available (i.e., for the assimilation experiment the posterior *T*_{surf} is evaluated). There are systematic differences between the mean values of the *T*_{skin} datasets used here, and these differences cannot be attributed to biases in any particular dataset. Hence, the evaluation statistic is the unbiased root-mean-square difference (ubRMSD), calculated at each model grid cell after removing the mean difference over the full time period (separately at each time of day) between the datasets.

## 4. Results

### a. Mean O − F differences

Without bias correction, there is a strong diurnal cycle in the mean difference between the observed and forecast *T*_{skin}. For example, Fig. 2 shows the diurnal cycle in the mean *O* − *F* averaged over the Americas over a 1 yr bias-blind assimilation of the GOES-East and GOES-West geostationary *T*_{skin} observations (using the same observation error covariances and forecast ensemble perturbations as for the bias-aware assimilation experiments). For both GOES-East and GOES-West, the *O* − *F* mean differences are more positive after solar noon. The GOES-West *O* − *F* mean differences are consistently positive and larger than those for GOES-East throughout the diurnal cycle, with a maximum value of 5.1 K at 2100 UTC, compared to values <2.0 K during the nighttime. In contrast, the GOES-East *O* − *F* mean differences are negative during the nighttime and positive during the daytime, but with magnitude consistently <1.0 K in both cases, except for the 2.8 K maximum at 1800 UTC. The *T*_{skin} data retrieved from the different geostationary satellite are reasonably well calibrated (Minnis et al. 2002), and the differences between the GOES-East and GOES-West *O* − *F* mean differences in Fig. 2 are almost certainly not related to the sensors themselves, but to the contrasting land covers observed by each. The small spatially averaged *O* − *F* mean differences for GOES-East are due to cancellation between regions of positive and negative *O* − *F* (temporal) mean differences in the spatial means.

While the effectiveness of the observation bias correction has been analyzed throughout the diurnal cycle, for brevity the focus here is on the results at 2100 UTC, when the largest bias-blind *O* − *F* mean differences occurred in Fig. 2. To demonstrate the influence of *τ* (the time scale of the bias estimate) on the bias estimated by the filter (i.e., the ^{+}), Fig. 3 compares the ^{+} time series at 2100 UTC, estimated using *τ* values between 10 and 30 days, at the three SURFRAD locations with the greatest observation coverage. The SURFRAD locations are used only for convenience, and no SURFRAD data were used in these plots. For comparison, each panel also includes a smoothed *O* − *F* time series from the bias-blind assimilation experiment, estimated using the first two annual Fourier harmonics, following Vinnikov et al. (2008). Recall from section 2d, that selecting *τ* to represent seasonal time scales will allow the bias-aware assimilation to correct for subseasonal-scale errors. The bias filter tracks the seasonal-scale *O* − *F* mean differences, while filtering out the higher-frequency noise in the *O* − *F* time series. As expected, the filtered ^{+} time series lag the smoothed *O* − *F* time series, with the lag increasing as *τ* increases in Fig. 3. The minimum time scale of the features resolved by the ^{+} time series also increases as *τ* increases, and for shorter *τ* values, there is more noise around the seasonal cycle (particularly for *τ* = 10 days). The greatest differences between the ^{+} time series with different *τ* (and between the filtered and smoothed time series) occurred at Sioux Falls, where the *O* − *F* seasonal cycle had the steepest temporal gradient. In particular, during the 2012 summer when the *O* − *F* decreased rapidly, the ^{+} time series are much higher than the smoothed time series (likely due to the first two Fourier harmonics in the smoothed time series being insufficient to capture the sharp gradient).

For a given application, the best choice of *τ* for estimating the seasonal-scale *O* − *F* mean differences will depend on the relative variability of the *O* − *F* residuals at seasonal and subseasonal time scales. For geostationary *T*_{skin} assimilation, a compromise value of *τ* = 20 days has been selected, since this produced ^{+} time series with reasonably smooth seasonal cycles that did not lag the smoothed *O* − *F* time series by too much (Fig. 3).

With a *τ* of 20 days, Fig. 4 compares histograms of the *O* − *F* residuals at 2100 UTC at the same three locations plotted in Fig. 3, for both the bias-blind assimilation experiments and the two-stage observation bias and state estimation scheme. As expected, the histograms for the bias-blind assimilation are biased, with mean values between 1.3 and 8.0 K (Figs. 4a–c). The inclusion of the observation bias correction reduced the mean *O* − *F* residual to magnitudes less than 0.5 K at each location (Figs. 4d–f). The observation bias correction also changed the shape of the *O* − *F* histograms in Fig. 4, reducing their spread and skew. Consequently, the standard deviation of the *O* − *F* residuals at each site is reduced, with the greatest reductions occurring at Sioux Falls, from 4.0 K for the bias-blind assimilation to 2.5 K with the observation bias correction. The altered shape of the *O* − *F* histograms is a consequence of the nonstationary bias estimation method accounting for seasonal-scale evolution of the *O* − *F* mean difference. In contrast, if a single (stationary) correction were applied to the mean over the full time period, the higher-order moments of the histograms would have been unchanged.

The histograms in Fig. 4 are representative of the performance of the observation bias correction across the full domain and throughout the diurnal cycle. For example, for both satellites in Fig. 2, the two-stage filter reduced the spatially averaged *O* − *F* mean difference to magnitudes between 0.0 and 0.3 K throughout the day, compared to bias-blind maxima of 5.1 and 2.8 K for GOES-West and GOES-East, respectively. Likewise, the mean standard deviation of the *O* − *F* residuals across the domain was also reduced throughout the diurnal cycle (not shown), for example, from 3.8 to 3.1 K for GOES-West and from 2.7 to 2.1 K for GOES-East, both at 2100 UTC.

Finally, in section 2d, it was hypothesized that the variability of modeled and observed *T*_{skin} estimates is reasonably well constrained, so that using the two-stage filter to adjust the mean seasonal cycle of the observations would be sufficient to address the systematic differences between the observed and forecast estimates. Comparing the variance of the observed and forecast *T*_{skin} confirms that this was the case in the assimilation experiments performed here. For example, over the Americas at 2100 UTC, the spatially averaged temporal standard deviation of the GOES-West observations was 8.0 K, compared to 7.3 K for the model forecasts over the same domain, with a spatial mean absolute difference between their standard deviations of 1.1 K. Likewise, for GOES-East at 2100 UTC, the mean standard deviation over the 1-yr experiment was 5.1 K, compared to 4.9 K for the forecasts, with a spatial mean absolute difference of 0.9 K. The two-stage observation bias correction reduced the differences in the variance, and the bias-corrected observations had spatially averaged standard deviations very close to the model forecasts, of 7.6 K for GOES-West, with a spatial mean absolute difference of just 0.4 K, and of 5.1 K for GOES-East, giving a spatial mean absolute difference of 0.3 K.

### b. Global maps of the estimated O − F bias

Figure 5 shows maps of the estimated ^{+} at 0900 UTC on 1 June, 1 July, and 1 August 2012. There is substantial spatial variation in the ^{+}, with a clear signal of land surface conditions. There are no obvious discontinuities between the ^{+} estimated for adjacent satellites in Fig. 5, although the limited regions of overlapping observations from neighboring satellites (at sufficiently small viewing angles) makes the direct assessment of such discontinuities difficult. At 0900 UTC it is daytime over Africa and Europe, and this region has the largest estimated ^{+} in Fig. 5, with distinct regions of large positive values (>10 K) in the drier regions of Africa, the Arabian Peninsula, and western Asia, with a band of negative values (<−5 K) over equatorial Africa. In contrast, the regions experiencing nighttime generally have smaller ^{+} (magnitude <5 K), except for the drier regions of western North America and Australia, with values of 5–10 K. This tendency for very large positive daytime ^{+} over dry regions occurs consistently across the globe, particularly in the summer hemisphere; the same pattern was evident in Fig. 2 for GOES-West, which observes the arid U.S. Southwest. In terms of the temporal evolution of the ^{+}, the large-scale spatial patterns are consistent between the three months plotted in Fig. 5, although the gradual evolution of the ^{+} estimates is evident.

### c. Evaluation against independent T_{skin} observations

Figures 6 and 7 demonstrate that the two-stage observation bias and state estimation filter improved the modeled *T*_{surf} subseasonal-scale variability, compared to independent observations, albeit by a modest amount. In Fig. 6, the mean ubRMSD of the assimilation estimates versus SURFRAD observations is reduced at each time of day by 0.05–0.31 K (~5%–10%), with the greatest improvements (>0.2 K) occurring during the first half of the day (0900–1500 UTC). The ubRMSD across all times of day is significantly reduced (at a 5% level) from 2.1 to 1.9 K.

Similar results were obtained by comparison to *Terra* and *Aqua* MODIS *T*_{skin} observations over the Americas, as shown in Fig. 7. During the night, the open-loop ubRMSD was already very small, with a spatial mean of 1.9 K for both *Terra* and *Aqua*. During the day, the open-loop ubRMSD was much larger, except over the Amazon, with a spatial mean of 3.6 K for both *Terra* and *Aqua*. For all MODIS overpasses, the assimilation consistently improved the model fit to the MODIS data across the domain, except over the Amazon, where the open-loop ubRMSD was already very low and the improvement from the assimilation was weaker, and even slightly negative in places. While the consistency of the positive improvements in Fig. 7 is encouraging, these improvements were significant (at the 5% level) over only a small fraction (<10%) of the domain. For each MODIS instrument and orbit direction, the spatial mean ubRMSD across the domain is shown in Table 2, and in each case the assimilation has reduced the spatial mean ubRMSD by around 0.2–0.3 K, or approximately 10% of the open-loop values.

While the above evaluation consistently indicates that the *T*_{skin} assimilation has improved the model *T*_{surf}, the improvements are rather modest. This is despite the use of only assimilation update times in the evaluation, which will have exaggerated the assimilation impact. There are several reasons for the modest results. Most importantly, the skill of the model *T*_{surf}, in terms of the anomaly behavior assessed here, is already very good. Also, the Catchment model *T*_{surf} has an extremely short memory, associated with its very low heat capacity; hence, the analysis updates do not persist into the subsequent model time step, and the model has very little memory of improvements previously gained from the assimilation. Including GHT1 in the state update vector did not increase the *T*_{surf} memory of previous analysis updates, since the *T*_{surf} dynamics are dominated by the radiation budget. Finally, the lack of memory is compounded by the low data volume associated with the lack of TIR observations under cloudy conditions. The modest impact of the assimilation is not related to the observation bias correction method, since similar results were obtained using cumulative distribution functions (CDFs; Reichle and Koster 2004) to rescale the observations (not shown).

## 5. Summary and conclusions

A two-stage observation bias and state estimation scheme has been developed for use in land data assimilation. In this scheme, the lumped observation bias minus forecast bias is estimated and removed from the observations prior to updating the model state. In applications where the model predictions are bias-free, this two-stage filter could also be used to correct the observations toward the true mean state. The presented method is computationally affordable, straightforward to implement in an existing assimilation, requires specification of only a single additional parameter, and can be used to assimilate satellite radiances or retrieved geophysical variables. In contrast to the observation rescaling methods currently used in land data assimilation systems, the two-stage filter does not require a long data record. Hence, it has the potential to facilitate the use and success of land data assimilation, particularly in atmospheric modeling systems for which long records of consistently forecast land surface estimates are typically not available.

The two-stage filter includes a parameterization of the Kalman gain for the bias update that introduces an explicit specification of the time scale of the estimated bias. Defining the bias over seasonal time scales allows the assimilation to update the model state vector in response to subseasonal-scale (e.g., synoptic scale and mesoscale) differences between observed and forecast estimates.

In experiments assimilating geostationary *T*_{skin} observations into the Catchment land surface model, the two-stage filter effectively removed the *O* − *F* mean difference from the observations and consequently improved subseasonal-scale dynamics in the model *T*_{surf} (the model equivalent variable to *T*_{skin}). These improvements were measured using the ubRMSD with independent estimates of *T*_{skin} from the SURFRAD network (at six sites in the United States) and from MODIS satellite observations over the Americas. While modest, the improvements highlight the potential value of the geostationary *T*_{skin} for future modeling efforts.

Global maps of the *O* − *F* bias estimated by the two-stage filter show clear spatial coherence, with a signal of local land surface conditions. Most prominently, there is a strong tendency for large positive *O* − *F* biases in dry regions during the daytime. In this study, the bias was estimated independently at each model grid cell. However, the spatial cohesion of the estimates suggests the potential to improve the two-stage filter design by incorporating horizontal information into the observation bias estimates. This could be achieved either by including spatial smoothing in the bias forecast model (assuming correlations between the biases in adjacent areas) or by implementing the bias update using a 3D filter (assuming correlations between the errors in the bias estimates).

In addition to the difficulty of obtaining suitable data records for observation rescaling, several studies have highlighted other shortcomings arising from the stationary nature of the observation rescaling approaches for bias correction. For example, the inability of a stationary approach (CDF matching) to distinguish between near-surface soil moisture variability over seasonal and subseasonal time scales can result in inadequate matching of the seasonal cycles between forecast estimates and CDF-matched observations (Draper et al. 2009). Also Drusch et al. (2005) argue that uncertainty in the interannual variability of the vegetation characteristics used in both soil moisture retrieval and land surface modeling may necessitate nonstationary observation bias correction methods, based on either frequent updates of observation rescaling coefficients or the use of more sophisticated methods. More recently, Crow et al. (2011) showed that results from the assimilation of remotely sensed soil moisture into a simple water balance model were improved by using seasonally variable observation rescaling coefficients for adjusting the mean. The nonstationary nature of filtering may also have practical advantages for the estimation of *O* − *F* biases, in that the estimates can respond to step changes, caused, for example, by changes in the forecast model, remote sensor, or retrieval model. Hence, in atmospheric assimilation the ability of variational observation bias correction schemes to respond to temporal changes in the bias has proven beneficial (Auligné et al. 2007; Dee and Uppala 2009).

Unlike observation rescaling, the two-stage filter presented here does not explicitly address systematic differences between higher-order moments of the observations and the model estimates. For the *T*_{skin} assimilation experiments presented here, the two-stage filter proved sufficient. However, other land surface variables, including near-surface soil moisture, can have large systematic differences in the subseasonal-scale variability of observed and forecast estimates. Work is underway to expand the two-stage filter to also account for systematic differences in the higher-order moments, thus providing an alternative to observation rescaling for soil moisture data assimilation.

## Acknowledgments

The research was supported by the NASA Modeling, Analysis, and Prediction program, the NASA High-End Computing program, and the NASA Satellite Calibration Interconsistency program. MODIS land surface data were provided by NASA’s Earth Observing System Data and Information System, and the SURFRAD data were provided by the NOAA Earth System Research Laboratory.

### APPENDIX A

#### Derivation of _{k}

In the bias state update equation [Eq. (8)], the model state, observation bias, and observation estimates can each be partitioned into their true value, a random (zero mean) error, and for the observations, a long-term mean error (bias): , , and , where represents the random error in the superscripted variable. To derive _{k}, minimize the expected error in , , where *E*[·] is the expectation over time. Substituting Eq. (8) into and then partitioning each variable into its constituent parts gives

and

The derivative of with reference to _{k} is

Setting the derivative to zero and solving for gives the minimum:

If , , and are not cross correlated with each other, the expectation is

### APPENDIX B

#### Derivation of and Equivalence to

To derive , minimize the expected error in , . Substituting Eq. (11) into , and as in appendix A, partitioning each variable into its constituent terms, gives

and

The derivative of with reference to is

If ^{o}, ^{x−}, and ^{b−} are not cross correlated with each other, setting the derivatives to zero to minimize and taking the time expectation gives

and

which is the same as Eq. (4) for the Kalman gain for the bias-free EnKF state update. This demonstrates that the inclusion of the observation bias estimate from the two-stage state and bias estimation does not change the expression of the solution for the Kalman gain for the state update in Eq. (10) [assuming that Eq. (9) is used for _{k}].

## REFERENCES

*Water Resour. Res.,*

**47,**W08521, doi:.

*J. Geophys. Res.,*

**114,**D20104, doi:.

*Geophys. Res. Lett.,*

**39,**L04401, doi:.

*Geophys. Res. Lett.,*

**32,**L15403, doi:.

*J. Geophys. Res.,*

**115,**D19112, doi:.

*Geophys. Res. Lett.,*

**31,**L19501, doi:.

*J. Geophys. Res.,*

**112,**D09108, doi:.

*Geophys. Res. Lett.,*

**35,**L22708, doi:.

*J. Geophys. Res.,*

**110,**D11109, doi:.