## 1. Introduction

Sequential ocean data assimilation combines a forecast produced by a general ocean circulation model with observations to construct an improved estimate of the state of the ocean. The optimality of such a procedure depends on the validity of statistical assumptions about the errors of the observations and the forecasts. One critical assumption is that the error of the forecast is random, with no systematic errors. Unfortunately, studies reveal that current ocean models do have large systematic errors. These errors, which we refer to collectively as bias, are introduced through a variety of mechanisms including inaccurate initial conditions, systematic errors in surface forcing, parameterizations of unresolved physics, or numerics, or through nonlinear processes when forced by random errors. In this paper we examine the structure of bias in a current assimilation system and explore a method designed to account for it during the data assimilation cycle.

We define the bias $\beta^{t}$ in terms of the state forecast $\omega^{f}$ (consisting of the gridpoint values of temperature, salinity, velocity, and pressure), the unknown true state $\omega^{t}$, and the random component of the forecast error $\varepsilon^{f}$, that is, the part of the forecast error that has zero expectation:

$$\omega^{f} = \omega^{t} + \beta^{t} + \varepsilon^{f}, \qquad \langle \varepsilon^{f} \rangle = 0. \tag{1}$$

Thus the true bias may be determined by taking the expectation of the difference between the state forecast and the true state, $\beta^{t} = \langle \omega^{f} - \omega^{t} \rangle$. If the expectation operator is approximated by the time mean or annual harmonic operators we obtain approximate estimates of the time mean or annual cycle of the bias, and so forth.
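As a minimal sketch of the idea above, the following Python fragment builds a synthetic forecast as truth plus a fixed bias plus zero-mean noise and then approximates the expectation operator by a time mean. All numbers (bias values, noise level, sample size) are hypothetical and chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

n_steps, n_state = 2000, 3
true_bias = np.array([0.5, -0.2, 0.0])                   # hypothetical bias per state variable
truth = rng.normal(size=(n_steps, n_state))              # stand-in for the true state
noise = rng.normal(scale=0.3, size=(n_steps, n_state))   # zero-mean random forecast error
forecast = truth + true_bias + noise                     # forecast = truth + bias + random error

# Approximate the expectation of (forecast - truth) by the time mean.
bias_estimate = (forecast - truth).mean(axis=0)
print(bias_estimate)
```

In practice the true state is unknown, so the same averaging is applied to forecast-minus-observation differences instead.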

State forecasts are produced using numerical ocean general circulation models. Thus, we can anticipate the characteristics of bias by reviewing ocean model simulations. Among the recent studies, Gent et al. (1998) document problems with time-mean deep and intermediate water mass formation rates as well as excessive diffusion of the thermocline, a problem that seems characteristic of *z*-coordinate models. Smith et al. (2000) show that despite very high 1/10° × 1/10° horizontal resolution, some differences in the variability of the western boundary currents and their mean position remain. The results of Sun and Bleck (2001) indicate that problems with water mass formation/penetration remain despite use of isopycnic vertical coordinates. Seasonality in the bias is evident in the study of Tandon and Zahariev (2001), wherein it appears to be caused by unresolved diurnal variations in surface forcing. In the tropical Pacific, sea surface temperatures in the eastern basin are generally too low and the thermocline is too high (Stockdale et al. 1998).

The problem of model bias has been widely recognized in the meteorological community, and before that in the engineering literature. As a result, a number of approaches have been developed. The most straightforward approaches to handling time-mean bias involve computing model or analysis climatologies and then introducing correction terms into the equations of motion (e.g., Saha 1992). To correct for rapidly changing bias in a data-rich environment, a second class of approaches has been proposed involving examination of the past few updating cycles for a systematic tendency, which is then corrected (Thiebaux and Morone 1990; DelSole and Hou 1999; D’Andrea and Vautard 2000; Griffith and Nichols 1996, 2000).

This second class of approaches has the advantage that it allows the bias estimate to evolve in time as would be expected, for example, if the bias is being advected by the time-changing large-scale flow field. It has the disadvantage that it does not retain long-term memory and thus cannot fully account for predictable biases with longer time scales such as those linked to the annual cycle. A third class of approaches useful in linear one-dimensional problems involves prewhitening the errors so that their frequency spectrum resembles white noise (Kamen and Su 1999).

In this paper we explore a fourth class of approaches we refer to as “two-stage estimation,” which was first introduced by Friedland (1969; see also Mendel 1976; Ignagni 1981). The two-stage estimation algorithm begins with the assumption that a reasonable estimate of the bias may be made prior to estimating the state of the system itself, thus allowing the estimation procedures for the bias and the state to be carried out successively. In its original form this bias was assumed to be steady. Ignagni (1990) expanded the bias model to allow for time-dependent bias (a problem considered earlier by Thacker and Lee 1972). Other authors (Mendel 1976; Zhou et al. 1993) have proposed modifications to the two-stage estimation algorithm designed to handle nonlinear state models. Application to atmospheric data assimilation has been explored by Dee and da Silva (1998) and Dee and Todling (2000).

Here we apply the two-stage estimation algorithm of Dee and da Silva (1998) and Dee and Todling (2000) to the problem of sequential data assimilation in the ocean. We focus on the tropical Pacific sector (30°S–30°N) because of the strength of its interannual variability and because of its importance in climate research. Previous efforts to use the Dee and da Silva algorithm for the ocean (e.g., Carton et al. 2000a, b; Martin et al. 2002) have assumed a very simple forecast model for the bias. But as suggested above, systematic errors in ocean models are actually geographically oriented, with temporally varying structure resulting from errors in identifiable phenomena such as thermocline water mass formation or mixed-layer entrainment. Some errors are persistent, others are cyclic, and still others are noncyclic but predictable, and thus may be corrected in a forecast system. Our approach will need to account for each of these. Dee and da Silva have also suggested a simple offline approach in which the bias estimates do not affect the model forecast, an approach that we will explore.

## 2. Bias correction algorithm

We define the state forecast $\omega^{f}$ at time $t_k$ as a column vector containing the $N$ state variables. It is produced by forward integration of the ocean model $\boldsymbol{\Omega}$ beginning from the analysis at time $t_{k-1}$:

$$\omega^{f}_{k} = \boldsymbol{\Omega}(\omega^{a}_{k-1}). \tag{2}$$

The bias forecast estimate is similarly the sum of the true bias and a zero-expectation random error $\eta^{f}$, both column vectors of length $N$:

$$\beta^{f} = \beta^{t} + \eta^{f} \tag{3}$$

(we drop the subscript $k$ throughout except where it is needed for clarity), and it is produced by the (linear) bias model $\mathbf{B}$ beginning from the bias analysis at time $t_{k-1}$. A general linear model for the bias would look like

$$\beta^{f}_{k} = \mathbf{B}\,\beta^{a}_{k-1} + \mathbf{Z}_{k}, \tag{4}$$

where Dee and da Silva (1998) assume $\mathbf{B}$ is diagonal and $\mathbf{Z}_{k} = 0$. In this paper we assume $\mathbf{B} = 0$ and explore simple models of $\mathbf{Z}$, including steady models, periodic models, and models based on empirical orthogonal function analysis of forecast-minus-observation statistics.
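The two limiting choices for the linear bias model (4) can be illustrated with a short Python sketch. The function name `bias_forecast` and the numerical values are hypothetical; the identity matrix is used here as one simple instance of a diagonal $\mathbf{B}$.

```python
import numpy as np

def bias_forecast(beta_a_prev, B, Z_k):
    """General linear bias model: beta_f_k = B @ beta_a_{k-1} + Z_k."""
    return B @ beta_a_prev + Z_k

n = 4
beta_a_prev = np.array([0.3, -0.1, 0.2, 0.0])   # previous bias analysis (made-up values)

# Persistence-type model (diagonal B, here the identity, and Z = 0):
# the previous bias analysis is carried forward unchanged.
persisted = bias_forecast(beta_a_prev, np.eye(n), np.zeros(n))

# Prescribed model (B = 0): the bias forecast is given entirely by Z_k,
# for example a steady, periodic, or EOF-based field.
Z_k = np.full(n, 0.25)                           # hypothetical steady bias field
prescribed = bias_forecast(beta_a_prev, np.zeros((n, n)), Z_k)
```

With $\mathbf{B} = 0$ the bias forecast has no memory of the previous analysis, so all predictive skill must come from the specification of $\mathbf{Z}_k$.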

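At time $t_k$ we have $M$ unbiased observations contained in a column vector $\omega^{o}$. This column vector may be decomposed into the true values interpolated to the observation locations plus an unbiased error $\varepsilon^{o}$ associated with each observation:

$$\omega^{o} = \mathbf{H}\,\omega^{t} + \varepsilon^{o}. \tag{5}$$

Here $\mathbf{H}$ is the $M \times N$ interpolation matrix that maps variables specified at the model locations onto the observation locations. We would like to provide an analysis of the state as well as the bias based on a linear combination of the same set of observations $\omega^{o}$ and an unbiased forecast $\tilde{\omega} = \omega^{f} - \beta^{a}$. Dee and da Silva (1998) and Dee and Todling (2000) propose to compute the analyses in two stages:

$$\beta^{a} = \beta^{f} - \mathbf{L}\,[\omega^{o} - \mathbf{H}(\omega^{f} - \beta^{f})], \tag{6a}$$

$$\omega^{a} = (\omega^{f} - \beta^{a}) + \mathbf{K}\,[\omega^{o} - \mathbf{H}(\omega^{f} - \beta^{a})], \tag{6b}$$

using successively improved estimates of the unbiased state forecast. Thus, $\omega^{a}$ is an unbiased estimate of $\omega^{t}$. Here $\mathbf{K}$ and $\mathbf{L}$ are $N \times M$ *gain* matrices that account for the relative accuracies of the model forecast and the observations. Minimization of the mean square errors of the state and bias analyses *under the assumptions that the observation error, state forecast error, and bias forecast error are mutually independent* (e.g., $\langle \varepsilon^{f}(\eta^{f})^{T} \rangle = \langle \eta^{f}(\varepsilon^{f})^{T} \rangle = 0$) leads to the following equations for the gain matrices (Dee and da Silva 1998):

$$\mathbf{L} = \mathbf{P}^{f}_{\beta}\mathbf{H}^{T}\,[\mathbf{H}\mathbf{P}^{f}_{\beta}\mathbf{H}^{T} + \mathbf{H}\mathbf{P}^{f}_{\omega}\mathbf{H}^{T} + \mathbf{R}]^{-1}, \tag{7a}$$

$$\mathbf{K} = \mathbf{P}^{f}_{\omega}\mathbf{H}^{T}\,[\mathbf{H}\mathbf{P}^{f}_{\omega}\mathbf{H}^{T} + \mathbf{R}]^{-1}. \tag{7b}$$

The observation error covariance is $\mathbf{R} \equiv \langle \varepsilon^{o}(\varepsilon^{o})^{T} \rangle$.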
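The two-stage analysis can be sketched in a few lines of NumPy. This is an illustrative toy, not the assimilation code used in the experiments: it assumes the gain expressions of Dee and da Silva (1998) in their usual form, uses dense matrices with an explicit inverse, and the function name `two_stage_update` and all test values are hypothetical.

```python
import numpy as np

def two_stage_update(w_f, b_f, w_o, H, P_b, P_w, R):
    """Two-stage bias/state analysis in the spirit of Eqs. (6)-(7).

    w_f, b_f : state and bias forecasts (length N)
    w_o      : observations (length M)
    P_b, P_w : bias and unbiased forecast error covariances (N x N)
    R        : observation error covariance (M x M)
    """
    HT = H.T
    # Gains (assumed form): the bias gain sees bias, random forecast, and observation error.
    L = P_b @ HT @ np.linalg.inv(H @ P_b @ HT + H @ P_w @ HT + R)
    K = P_w @ HT @ np.linalg.inv(H @ P_w @ HT + R)

    # Stage 1: update the bias using the forecast corrected by the bias forecast.
    b_a = b_f - L @ (w_o - H @ (w_f - b_f))
    # Stage 2: update the state using the forecast corrected by the bias analysis.
    w_tilde = w_f - b_a
    w_a = w_tilde + K @ (w_o - H @ w_tilde)
    return w_a, b_a

# Toy example: two state variables observed directly, a warm bias of 0.5,
# no prior bias estimate, and nearly perfect observations.
w_t = np.array([1.0, 2.0])
true_bias = np.array([0.5, 0.5])
w_a, b_a = two_stage_update(
    w_f=w_t + true_bias, b_f=np.zeros(2), w_o=w_t, H=np.eye(2),
    P_b=100.0 * np.eye(2), P_w=np.eye(2), R=1e-6 * np.eye(2),
)
```

In this toy case the large bias covariance lets the first stage attribute almost all of the innovation to bias, and the second stage then pulls the corrected forecast onto the observations.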

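The remaining quantities to be specified are the bias forecast error covariance $\mathbf{P}^{f}_{\beta} \equiv \langle \eta^{f}(\eta^{f})^{T} \rangle$ and the unbiased forecast error covariance $\mathbf{P}^{f}_{\omega} \equiv \langle \varepsilon^{f}(\varepsilon^{f})^{T} \rangle$. We assume that the bias error covariance has horizontal scales similar to those of the basin and may be geographically oriented, while the unbiased forecast error has horizontal scales of a few hundred kilometers [e.g., see the analysis of Carton et al. (2000b)] and is roughly homogeneous. Following Dee and Todling (2000) the bias-corrected observation-minus-forecast differences are

$$\mathbf{v} \equiv \omega^{o} - \mathbf{H}(\omega^{f} - \beta^{f}) = \hat{\mathbf{v}} + \mathbf{v}', \tag{8}$$

where we introduce $\hat{\mathbf{v}}$ and $\mathbf{v}'$ to represent the basin- and small-scale components of the observation-minus-forecast differences. If we neglect the covariance between large and small scales of observed-minus-forecast differences ($\langle \hat{\mathbf{v}}\mathbf{v}'^{T} \rangle = \langle \mathbf{v}'\hat{\mathbf{v}}^{T} \rangle = 0$), then the bias-corrected observation-minus-forecast covariance is given by

$$\langle \mathbf{v}\mathbf{v}^{T} \rangle = \langle \hat{\mathbf{v}}\hat{\mathbf{v}}^{T} \rangle + \langle \mathbf{v}'\mathbf{v}'^{T} \rangle = \mathbf{H}\mathbf{P}^{f}_{\beta}\mathbf{H}^{T} + \mathbf{H}\mathbf{P}^{f}_{\omega}\mathbf{H}^{T} + \mathbf{R}. \tag{9}$$

As a result of the assumption of the independence of basin and small scales we can separate (9) into two relations:

$$\langle \hat{\mathbf{v}}\hat{\mathbf{v}}^{T} \rangle = \mathbf{H}\mathbf{P}^{f}_{\beta}\mathbf{H}^{T}, \tag{10a}$$

$$\langle \mathbf{v}'\mathbf{v}'^{T} \rangle = \mathbf{H}\mathbf{P}^{f}_{\omega}\mathbf{H}^{T} + \mathbf{R}. \tag{10b}$$

Equations (10) allow us to calculate $\mathbf{P}^{f}_{\omega}$ and $\mathbf{P}^{f}_{\beta}$. Based on 30 yr of forecast-minus-observation differences we calculate the random forecast error covariance $\mathbf{P}^{f}_{\omega}$ by fitting a model that decays exponentially as a function of separation in latitude, longitude, and time, and applying minimum variance estimation [see Carton et al. (2000b) for details]. The resulting spatial scales of exponential decay are smaller than 500 km, and the time scale is less than 30 days. We assume that all covariance at scales greater than 500 km and 30 days is due to bias.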
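For illustration, a covariance that decays exponentially with separation in longitude, latitude, and time might be built as follows. The separable functional form, the function name `exp_covariance`, and the scale values are assumptions made for this sketch; the form actually fitted is described by Carton et al. (2000b).

```python
import numpy as np

def exp_covariance(lon, lat, t, variance, L_lon, L_lat, T):
    """Covariance decaying exponentially with separation in lon, lat (degrees) and time (days)."""
    dlon = np.abs(lon[:, None] - lon[None, :])
    dlat = np.abs(lat[:, None] - lat[None, :])
    dt = np.abs(t[:, None] - t[None, :])
    return variance * np.exp(-dlon / L_lon - dlat / L_lat - dt / T)

# Three hypothetical points: a reference, one displaced by one zonal scale,
# and one displaced by one time scale.
lon = np.array([180.0, 184.0, 180.0])
lat = np.array([0.0, 0.0, 0.0])
t = np.array([0.0, 0.0, 20.0])
P = exp_covariance(lon, lat, t, variance=0.25, L_lon=4.0, L_lat=2.0, T=20.0)
```

Each off-diagonal entry falls by a factor of $e$ for every decay scale of separation, so points separated by more than a few scales are effectively uncorrelated.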

We estimate the covariance $\mathbf{P}^{f}_{\beta}$ of the bias forecast error by first binning the forecast-minus-observation differences into 5° × 5° bins (thus filtering out the random forecast error). Then we compute $\mathbf{P}^{f}_{\beta}$ in this reduced space. In fact, it will be shown in the following section that $\mathbf{P}^{f}_{\beta}$ is dominated by a few basin-scale structures, and thus we will approximate it by a small number of principal components, every one of which has a time scale of more than a few months. Additional experiments not reported here show the results to be insensitive to the precise specification of the covariances in (10).
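The binning step can be sketched as a simple average of scattered differences within 5° × 5° boxes. The function name `bin_average` and the sample values are hypothetical; a production code would also bin in time and track sample counts.

```python
import numpy as np
from collections import defaultdict

def bin_average(lon, lat, values, size=5.0):
    """Average scattered values within size x size degree bins.

    Returns a dict mapping (lon_index, lat_index) to the bin mean.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for x, y, v in zip(lon, lat, values):
        key = (int(np.floor(x / size)), int(np.floor(y / size)))
        sums[key] += v
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}

# Hypothetical forecast-minus-observation differences at three locations.
lon = np.array([181.0, 183.0, 187.0])
lat = np.array([1.0, 3.0, -2.0])
diffs = np.array([1.0, 3.0, 5.0])
binned = bin_average(lon, lat, diffs)
```

Averaging within bins that are larger than the decorrelation scale of the random forecast error suppresses that error while retaining the basin-scale signal attributed to bias.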

Because $\mathbf{P}^{f}_{\beta}$ is dominated by a few basin-scale structures we assume that the bias forecast model and bias analysis can be represented by the product of a truncated set of empirical orthogonal functions (EOFs) $\mathbf{G}$ and principal components $\tau$:

$$\beta^{f} = \mathbf{G}\,\tau^{f}, \qquad \beta^{a} = \mathbf{G}\,\tau^{a}, \tag{11}$$

where $\mathbf{G}$ is a matrix of size $N \times Q$ containing the first $Q$ EOFs representing the spatial structures and $\tau^{f}$ is a vector of size $Q$ containing the principal component coefficients at time $t_k$.

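Since the columns of $\mathbf{G}$ are orthonormal we can compute the bias analysis time series $\tau^{a}$ by multiplying both sides of (6a) by $\mathbf{G}^{T}$ (dropping the time subscript $k$ as usual for convenience):

$$\tau^{a} = \tau^{f} - \tilde{\mathbf{L}}\,[\omega^{o} - \mathbf{H}(\omega^{f} - \mathbf{G}\tau^{f})], \tag{12}$$

and then

$$\beta^{a} = \mathbf{G}\,\tau^{a}, \tag{13}$$

where $\tilde{\mathbf{L}} \equiv \mathbf{G}^{T}\mathbf{L}$ is the $Q \times M$ gain projected onto the retained EOFs. Equations (2), (4), (6), and (7), together with the specification of the bias model (11)–(13) and the truncation of $\mathbf{G}$, represent our complete set of assimilation equations. To develop a reasonable bias model we now begin examination of the bias in a current assimilation scheme.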
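The projection of the bias update onto a truncated EOF basis can be checked numerically. The sketch below uses random orthonormal columns as stand-ins for the EOFs and a random placeholder gain; none of these values come from the experiments, and the variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
N, M, Q = 6, 4, 2

# Orthonormal columns standing in for the leading EOFs (hypothetical).
G, _ = np.linalg.qr(rng.normal(size=(N, Q)))
H = rng.normal(size=(M, N))             # placeholder interpolation matrix
L = 0.1 * rng.normal(size=(N, M))       # placeholder bias gain

tau_f = np.array([0.5, -0.3])           # principal component forecast
w_f = rng.normal(size=N)                # state forecast
w_o = rng.normal(size=M)                # observations

# Innovation computed with the bias forecast expressed as G @ tau_f.
innovation = w_o - H @ (w_f - G @ tau_f)

# Update in the reduced space: multiply the full-space bias update by G^T.
tau_a = tau_f - G.T @ L @ innovation
beta_a = G @ tau_a                      # bias analysis mapped back to the model grid

# Full-space bias update for comparison.
beta_full = G @ tau_f - L @ innovation
```

Because $\mathbf{G}^{T}\mathbf{G} = \mathbf{I}$, projecting `beta_full` onto the EOFs reproduces `tau_a`; any part of the full-space increment outside the span of $\mathbf{G}$ is discarded by the truncation.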

## 3. Bias

Here we examine the bias as it appears in a current ocean data assimilation system, described below. Our analysis procedure is directed toward identifying features in the bias field that have broad spatial scales and long temporal scales because of their relevance for climate estimates and because these features are more statistically stable. As a result of this examination we propose a model of bias in section 4.

The data assimilation analyses rely on the Simple Ocean Data Assimilation methodology of Carton et al. (2000a, b), whose forward model uses Geophysical Fluid Dynamics Laboratory Modular Ocean Model IIb (MOM-2) numerics with 1° × 1/2° × 20-level resolution near the equator, expanding to 1° × 1° resolution in midlatitudes. All experiments span the 31-yr period 1970–2000 with initial conditions provided by climatological temperature and salinity. Our analysis is limited to the last 30 yr to reduce the impact of the initial conditions. Sponge layers are inserted poleward of 62°S and 62°N that relax temperature and salinity to their climatological monthly values. Wind stress is provided by the monthly observation-based analysis of da Silva et al. (1994) for the years prior to 1991. Winds for the period 1991–2001 are provided by National Centers for Environmental Prediction monthly anomalies added to the seasonal cycle of the da Silva winds in order to minimize the shock introduced by the change in wind analyses. Surface temperature is relaxed to the monthly estimates of Smith and Reynolds (2003), while sea surface salinity is relaxed to the climatological monthly estimates of Levitus et al. (1994).

The basic subsurface dataset in the tropical Pacific consists of approximately $1.6 \times 10^{6}$ temperature profiles. Two-thirds of these were obtained from the World Ocean Database 2001 (Boyer et al. 2002; Stephens et al. 2002); the dataset is extended by operational temperature profile observations from the National Oceanic and Atmospheric Administration (NOAA)/National Oceanographic Data Center temperature archive, including observations from the Tropical Atmosphere Ocean (TAO)/Triton mooring thermistor array. The profile data are concentrated along commercial shipping lanes in the far eastern and western basins. SST observations were obtained from the COADS surface marine observation set.

The analysis procedure solves Eqs. (6)–(7) at 10-day intervals. The time-updating algorithm of Bloom et al. (1996) has been used to suppress spurious gravity waves. A set of five experiments, listed in Table 1, is carried out; the experiments differ only in the bias forecast model (4). We first discuss results from the *control experiment* (expt 1), in which there is no correction for bias. This experiment does not account for basin-scale errors. Although all state variables are available, we focus our discussion on the 30 × 36 = 1080 10-day fields of analysis and forecast temperature. Bias is determined by looking for systematic components of the objectively gridded differences between the model forecast and the observations. Our analysis focuses on mixed-layer temperature, because of its importance in influencing the atmosphere, and on the depth and width of the thermocline, because they reflect the accuracy with which the oceanic heat storage is represented.

We begin by examining the time-mean bias. The time-mean component dominates the bias in the mixed layer (0–45 m). It explains 44% of the variance averaged geographically and in time within the mixed layer. Below the mixed layer the variance explained by the time-mean component decreases with increasing depth until the depth of the thermocline where a second maximum occurs. Along the equator the forecast mixed layer is too cold in the east by up to 0.2°C (Fig. 1 top). The thermocline below this is too sharp, as indicated by the warm bias in the upper thermocline. In contrast, in the central basin the upper thermocline has a cold bias indicating that the thermocline is too shallow and broad.

It is evident in Fig. 1 that the bias has different behavior in the mixed layer and the thermocline. Because of this difference we will carry out separate analyses of these, treating the mixed layer as a slab of uniform 45-m depth. The thermocline depth varies much more widely throughout the basin. Here we approximate the thermocline depth as the depth of the 20°C isotherm. We define the width of the thermocline to be the difference in the depth of the 22° and 14°C isotherms. It is evident from the discussion above that bias includes errors in the width as well as depth of the thermocline.
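The thermocline diagnostics defined above can be computed from a single temperature profile by linear interpolation. This sketch assumes temperature decreases monotonically with depth; the function name `isotherm_depth` and the profile values are hypothetical.

```python
import numpy as np

def isotherm_depth(temp, depth, t_iso):
    """Depth (m) of the isotherm t_iso in a profile whose temperature decreases with depth."""
    # np.interp requires increasing x values, so reverse the profile.
    return float(np.interp(t_iso, temp[::-1], depth[::-1]))

# Hypothetical tropical profile: depth in m, temperature in degC.
depth = np.array([0.0, 50.0, 100.0, 150.0, 200.0, 300.0])
temp = np.array([28.0, 26.0, 22.0, 18.0, 14.0, 10.0])

thermocline_depth = isotherm_depth(temp, depth, 20.0)
thermocline_width = isotherm_depth(temp, depth, 14.0) - isotherm_depth(temp, depth, 22.0)
```

For this profile the 20°C isotherm lies at 125 m and the 22°–14°C layer is 100 m thick; profiles with temperature inversions would need additional handling.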

The geographic distribution of the time-mean bias is shown in Fig. 2. Time-mean bias in the mixed layer is mainly confined to equatorial latitudes and is nearly symmetric about the equator, with a maximum negative anomaly of −0.2°C. In contrast to the mixed layer, the time-mean bias in thermocline depth extends broadly into the Southern Hemisphere, indicating that the thermocline is 2 m too deep in the east and 4 m too shallow in the central basin (again, evaluated over a 10-day update cycle), while the thermocline is 10–20 m too wide throughout the western Tropics (±10°).

Examination of the evolving bias reveals that there is a significant annual component as well. This is particularly evident in the Northern Hemisphere mixed layer between 10° and 30°N, where the monthly forecast-minus-observation differences are strongly negative in June (−0.3°C) and strongly positive in December (0.4°C) even though the annual mean bias averaged over this band of latitudes is small (magnitude less than 0.1°C; Fig. 3). Interestingly, the distributions of forecast-minus-observation differences are also skewed, with the skewness changing sign seasonally from −1.3 to 1.8, indicating a larger negative tail in June and a larger positive tail in December. The difference distributions also have larger tails in both directions than would be expected for a Gaussian distribution (kurtosis is 8 and 11, respectively).
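Skewness and kurtosis of a sample of differences can be computed from its central moments. The sketch below uses the non-excess convention, in which a Gaussian distribution has a kurtosis of 3; whether the values quoted above follow this convention is not stated, so that choice is an assumption.

```python
import numpy as np

def skewness(x):
    """Third standardized central moment (0 for a symmetric distribution)."""
    d = x - x.mean()
    return (d**3).mean() / (d**2).mean() ** 1.5

def kurtosis(x):
    """Fourth standardized central moment (3 for a Gaussian distribution)."""
    d = x - x.mean()
    return (d**4).mean() / (d**2).mean() ** 2

# Synthetic check: a large Gaussian sample should give values near 0 and 3.
rng = np.random.default_rng(3)
sample = rng.normal(size=200_000)
sample_skew = skewness(sample)
sample_kurt = kurtosis(sample)
```

Values well above the Gaussian reference indicate heavier tails, that is, more frequent large forecast-minus-observation differences than a Gaussian error model would predict.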

We evaluate the spatial structure of the annual component by computing the annual Fourier harmonic of the forecast-minus-observation differences binned in 5° × 5° bins (binning is required to compute statistics), evaluated over the 30-yr record. The amplitude and phase diagrams (Fig. 4) reveal that in the northern subtropics the maximum cold bias of up to 0.1°C occurs in spring and a corresponding warm bias occurs in fall, a time of year when the mixed layer reaches its warmest (the annual cycle of bias is roughly 25% of the annual cycle of mixed-layer temperature). Along the equator the reduction in the annual component of the bias reflects a reduction in the annual cycle of mixed-layer temperature (the annual cycle of mixed-layer temperature along the equator is less than 0.05°C west of the date line).
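An annual harmonic can be fitted to a time series of binned differences by ordinary least squares. In this sketch time is measured in days, the year length of 365.25 days is an assumption, and the function name `annual_harmonic` and the synthetic series are hypothetical.

```python
import numpy as np

def annual_harmonic(t_days, y, period=365.25):
    """Least squares fit of y ~ mean + amplitude * cos(w t - phase)."""
    w = 2.0 * np.pi / period
    A = np.column_stack([np.ones_like(t_days), np.cos(w * t_days), np.sin(w * t_days)])
    mean, a, b = np.linalg.lstsq(A, y, rcond=None)[0]
    amplitude = np.hypot(a, b)
    phase = np.arctan2(b, a)   # a cos(wt) + b sin(wt) = amplitude * cos(wt - phase)
    return mean, amplitude, phase

# Synthetic series sampled every 10 days with known mean, amplitude, and phase.
t_days = np.arange(1080) * 10.0
w = 2.0 * np.pi / 365.25
series = 0.1 + 0.3 * np.cos(w * t_days - 1.0)
fit_mean, fit_amp, fit_phase = annual_harmonic(t_days, series)
```

Applying the fit bin by bin gives amplitude and phase maps of the kind described above; the phase can be converted to a calendar date of maximum bias by dividing by the annual frequency.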

The amplitude of the annual component of bias decreases with increasing depth and with decreasing latitude (partly reflecting the decrease with latitude of the annual harmonic of SST itself). In the latitudes of the North Equatorial Countercurrent (5°–15°N) the phase of the annual component of bias changes, giving a warm bias in spring when latent heat loss associated with intensification of the trade winds should be causing mixed-layer temperatures to drop to their annual minimum.

In contrast to the mixed layer, the annual component of bias in thermocline depth is fairly uniform with latitude, ranging from 1 to 4 m, with somewhat higher values in the subtropics. Along the equator where the annual component has a weak local maximum, there is evidence that the annual bias propagates slowly eastward in the western basin (propagation is evident in the alternating pattern of phase shift in Fig. 4 along the equator). The annual component of thermocline width bias is greater than 5 m throughout the western basin as well as in the subtropics. The phase of thermocline width bias is quite variable.

In addition to a time-mean cold bias and annual variations, bias at the equator also contains year-to-year variations both within the mixed layer and at thermocline depths (Fig. 5). Within the mixed layer the forecasts have a geographically stationary cold bias during the El Niño years, a result that explains much of the time-mean component. Within the thermocline, warm and cold bias anomalies propagate slowly eastward in a way reminiscent of ENSO-induced thermocline anomalies. The anomalies indicate that the forecast thermocline underestimates the thermocline anomalies associated with both El Niño and La Niña.

To characterize the spatial structure of the interannual variability we conduct a principal component analysis of the three-dimensional forecast-minus-observation differences. The first two principal components (Fig. 6), which explain roughly 30% of the variance of the bias anomalies (after mean and annual signals have been removed), are primarily confined to the equatorial zone. Their spatial patterns are rather different from the corresponding principal components of the system state (see Chao and Philander 1993). However, their time series, like the principal component time series of the system state, are closely related to the Southern Oscillation index (SOI). Indeed, the first principal component has a zero-lag correlation with the SOI of 0.7. The second lags the SOI by up to one year. Together they describe the eastward propagation of bias evident in Fig. 5. The higher principal components are noisy and difficult to interpret. We suspect that much of the variance described by these higher principal components results from sampling error introduced by the sparse observing system.
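A principal component analysis of this kind can be sketched with a singular value decomposition of the anomaly matrix. The data below are synthetic (one oscillating pattern plus noise) and the function name `leading_eofs` is hypothetical; the sketch only shows how EOFs, principal component time series, explained variance fractions, and a correlation with an index are obtained.

```python
import numpy as np

def leading_eofs(X, q):
    """PCA of X (n_times x n_points) via SVD.

    Returns EOFs (n_points x q), principal components (n_times x q),
    and the fraction of variance explained by each of the q modes.
    """
    X = X - X.mean(axis=0)                      # remove the time mean at each point
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    frac = s**2 / np.sum(s**2)
    return Vt[:q].T, U[:, :q] * s[:q], frac[:q]

# Synthetic anomalies: one spatial pattern modulated by a known index, plus noise.
rng = np.random.default_rng(2)
n_t, n_p = 360, 20
index = np.sin(2.0 * np.pi * np.arange(n_t) / 48.0)
pattern = rng.normal(size=n_p)
pattern /= np.linalg.norm(pattern)
X = 3.0 * np.outer(index, pattern) + 0.1 * rng.normal(size=(n_t, n_p))

eofs, pcs, frac = leading_eofs(X, q=2)
r = np.corrcoef(pcs[:, 0], index)[0, 1]         # zero-lag correlation with the index
```

The sign of an EOF and its principal component is arbitrary, so only the magnitude of the correlation with an external index is meaningful unless a sign convention is imposed.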

## 4. Bias modeling

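Guided by the results of section 3, we consider a bias forecast model consisting of a time-mean term, an annual harmonic, and a truncated set of interannually varying EOFs:

$$\beta^{f}(t_k) = \bar{\beta} + \beta_{A}\cos(\omega_{A} t_k - \phi_{A}) + \sum_{i=1}^{Q} \tau_{i}(t_k)\,G_{i}, \tag{14}$$

where $\bar{\beta}$ is the time-mean bias, $\beta_{A}$ and $\phi_{A}$ are the amplitude and phase of the annual harmonic, $\omega_{A} = 2\pi\,(1\ \mathrm{yr})^{-1}$, and $G_{i}$ is the $i$th EOF with principal component $\tau_{i}$. We will explore this model through a series of assimilation experiments listed in Table 1.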
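A bias forecast model with time-mean, annual, and EOF terms might be evaluated as in the following sketch. The exact form of the annual term (amplitude and phase fields) and all names and values here are assumptions made for illustration; switching terms off mimics the progression from a mean-only model to the full model.

```python
import numpy as np

OMEGA_A = 2.0 * np.pi / 365.25   # annual frequency in rad per day (assumed year length)

def bias_model(t_days, mean, amp=None, phase=None, G=None, tau=None):
    """Bias forecast as mean + annual harmonic + EOF expansion; optional terms may be omitted."""
    beta = mean.copy()
    if amp is not None:
        beta = beta + amp * np.cos(OMEGA_A * t_days - phase)   # annual term
    if G is not None:
        beta = beta + G @ tau                                  # interannual EOF term
    return beta

# Hypothetical two-point example.
mean = np.array([0.1, -0.2])
amp = np.array([0.3, 0.0])
phase = np.array([0.0, 0.0])
G = np.eye(2)
tau = np.array([1.0, 2.0])

mean_only = bias_model(0.0, mean)                        # time-mean term only
mean_annual = bias_model(0.0, mean, amp, phase)          # mean plus annual harmonic
full = bias_model(0.0, mean, amp, phase, G, tau)         # all three terms
```

In an assimilation cycle the principal component values passed as `tau` would themselves have to be forecast or prescribed for each update time.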

To examine the successive impact of the components of the bias model (14) we present three additional experiments. Experiment 2 retains only the time-mean term in the bias model, expt 3 retains both the time mean and the annual terms, while expt 4 retains all three terms (again, see Table 1).

We begin with expt 2, in which the bias model includes only the mean bias $\bar{\beta}$.

The middle panel in Fig. 1 shows the time-mean *uncorrected* forecast-minus-observation differences (the forecast before bias correction) along the equator for expt 2. It is interesting to note that the time-mean bias is not significantly reduced in comparison with the control experiment even though the initial conditions for the forecasts have been improved (cf. Fig. 1, upper and middle panels). The implication of this finding is that the forecasts have some error growth that occurs very rapidly, in less than the 10-day interval between successive forecasts.

Improving the time-mean component of the bias does not significantly improve the time-varying components of the bias. Thus in order to improve the forecasts still further we consider expt 3 in which both the mean and annual cycle of the bias are corrected. The reduction in the monthly mean bias is dramatically illustrated by the December and June averages of forecast-minus-observation differences in the latitude band 10°–30°N where the monthly mean bias is reduced by a factor of 4–5 (Fig. 3). Interestingly, the skewness of the histograms is also reduced by a factor of 2.

Figure 7 shows that the annual signal in the mixed-layer bias is reduced to under 0.05°C almost everywhere (cf. Figs. 4 and 7). The bias in thermocline depth and width is also reduced by a factor of 2–3. Interestingly, the phase of the annual cycle of thermocline bias for expt 3 resembles the phase of the control experiment, indicating that we have undercorrected the bias somewhat.

In the fourth experiment we introduce interannual variability into our model of bias. We examine the results by comparing principal components of the observation-minus-forecast differences for the control and expt 4 (cf. Figs. 6 and 8). As in the control experiment, the thermocline expressions of the first two principal components for expt 4 have their maximum amplitude within ±10° latitude, while the corresponding time series show that the forecast-minus-observation differences of the third and fourth principal components have been reduced by more than a factor of 2, and thus the errors associated with the historical reproduction of ENSO events have been similarly reduced.

The reductions in bias may be compared by examining the root-mean-square (rms) forecast-minus-observation differences for each bias model (Fig. 9). In the control experiment where the bias is neglected the rms temperature differences are more than 0.5°C in the eastern equatorial mixed layer as well as in the subtropics between successive 10-day updates. Rms thermocline depth differences are also maximum on the equator, reaching values of 4 m between successive 10-day updates.

Experiment 2, in which the time-mean bias is corrected, reduces rms mixed-layer temperature differences by a factor of 2 along the equator, with somewhat smaller reductions at higher latitude. Similar reductions are evident in rms thermocline depth differences. Reducing the annual cycle of bias (expt 3) reduces mixed-layer temperature differences in the subtropics where the annual cycle is large. In contrast, the improvement in the thermocline depth differences with this bias forecast model is limited.

The final bias forecast model we consider, used in expt 4, includes the two interannually varying unrotated principal components that primarily describe ENSO-related differences. Figure 9 shows that the inclusion of interannual variation in the bias forecast model has little impact on the rms forecast-minus-observation differences in the mixed layer, while it reduces differences in thermocline depth by 30%. An additional experiment not listed in Table 1 shows these results to be insensitive to an increase in the number of principal components used.

Finally, we examine the impact of the bias analysis by comparing expt 4 with a fifth experiment in which the bias analysis *β** ^{a}* is equal to the bias forecast for expt 4 [in other words, rather than updating the bias estimates during the assimilation according to (6a), we simply use the bias forecast given by the model (14) fitted to the forecast-minus-observation differences of the control run]. This corresponds to the use of fixed, or offline, bias estimates for correcting the bias during assimilation. The results of this experiment (Fig. 10) show that this procedure is not as effective in reducing the rms forecast-minus-observation differences as the full two-stage procedure.

## 5. Discussion

In this paper we examine the effects of bias in the forecast model on the error introduced in a 31-yr-long historical analysis of the physical state of the ocean, focusing on the mixed layer and thermocline depth in the tropical Pacific Ocean. We find that the biases are large and contain a variety of space and time scales including a time mean, an annual cycle, and interannual variability linked to ENSO.

In the eastern equatorial mixed layer, root-mean-square temperature differences between successive 10-day updates are more than 0.5°C. If these differences are interpreted as a heat flux error, correcting them would require a nonphysically large change in surface flux of approximately 100 W m$^{-2}$. Rms thermocline depth differences are also maximum on the equator, but largest in the western basin, reaching rms values of 8 m between successive 10-day updates. Rms thermocline width differences are over twice as large as this.
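The order of magnitude of this flux can be checked by asking what surface heat flux would warm a 45-m slab mixed layer by 0.5°C in 10 days. The density and specific heat below are typical seawater values assumed for this back-of-the-envelope estimate.

```python
RHO = 1025.0        # seawater density, kg m^-3 (typical value, assumed)
CP = 3990.0         # specific heat of seawater, J kg^-1 K^-1 (typical value, assumed)
H_ML = 45.0         # slab mixed-layer depth, m (as used in the text)
DELTA_T = 0.5       # temperature difference, degC
INTERVAL = 10.0 * 86400.0   # 10-day update interval, s

# Heat flux needed to change the slab temperature by DELTA_T over one interval.
flux = RHO * CP * H_ML * DELTA_T / INTERVAL
print(f"{flux:.0f} W m^-2")
```

The result is slightly above 100 W m$^{-2}$, consistent with the approximate figure quoted above.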

To reduce the impact of bias on historical analyses we introduce a two-stage bias correction algorithm based on the ideas of Dee and da Silva (1998), modified to account for the presence of geographically oriented model bias. The efficacy of this procedure is explored in a set of three additional experiments examining successively more complete empirical bias forecast models. We focus on the degree to which the bias-corrected forecast-minus-observation differences are reduced, a key measure of the improvement of the analysis.

The first bias forecast model we explore (expt 2) assumes a steady but spatially varying bias. The use of this model reduces forecast-minus-observation differences in mixed-layer temperature by a factor of 2 along the equator, with somewhat smaller reductions at higher latitude. Similar reductions are evident in thermocline depth differences. Interestingly, the differences between the uncorrected forecast and observations are not strongly reduced, indicating that much of the bias develops during the first few days.

We next consider a bias forecast model that additionally includes an annual cycle (expt 3). The largest reductions in forecast-minus-observation differences are in the subtropical mixed layer where the annual cycle is large. In contrast, the improvement in thermocline depth differences is limited. The final bias forecast model we consider (expt 4) also includes the next two unrotated principal components of the forecast-minus-observation differences after the time mean and annual cycle. Examination of these principal components shows that they primarily represent bias associated with the forecast model’s representation of ENSO. We find that relative to the results from expt 3, there is little reduction in the differences in the mixed layer, but a 30% reduction in the differences in thermocline depth because of an improvement in the representation of interannual variability of the thermocline depth. The cumulative effect of our most sophisticated bias model is to reduce the corrected forecast-minus-observation differences by a factor of 2 over the control experiment.

To illustrate the impact of bias correction on the state analysis we compare the state analysis from expt 4 and the control experiment with independent velocity measurements. Zonal velocity on the equator at 110°W in the eastern basin in the control run shows much too strong westward near-surface currents and an Equatorial Undercurrent that is much too weak at 80-m depth (Fig. 11). As a result of the introduction of the two-stage bias correction procedure, expt 4 shows much weaker (and thus more realistic) surface currents and a substantially stronger Equatorial Undercurrent with a realistic relaxation in the boreal summer of 1997.

The results of this study indicate that the forecast-minus-observation differences are significantly biased, and that this bias reduces the effectiveness of ocean applications of data assimilation. In regions of high variability, bias may explain up to half of the forecast-minus-observation differences. The two-stage correction algorithm explored here is successful in correcting much of this difference despite the simplicity of the bias forecast models considered, and these results are insensitive to the choices of assimilation parameters. The results apply directly to assimilation schemes relying on optimal interpolation or three-dimensional variational approaches. Further improvements through the use of dynamically based bias forecast models [the next step in this direction may be to take into account both terms in Eq. (4)] as well as reduction in the bias of the forecasts due to improvements in the models will help to reduce this problem in the future.

## Acknowledgments

We are grateful to the National Science Foundation Information Technology Research program (OCE 0113148) for providing support.

## REFERENCES

Bloom, S. C., L. L. Takacs, A. M. da Silva, and D. Ledvina, 1996: Data assimilation using incremental analysis updates. *Mon. Wea. Rev.*, **124**, 1256–1271.

Boyer, T. P., C. Stephens, J. I. Antonov, M. E. Conkright, L. A. Locarnini, T. D. O’Brien, and H. E. Garcia, 2002: *Salinity*. Vol. 2, *World Ocean Atlas 2001*, NOAA Atlas NESDIS 49, 165 pp.

Carton, J. A., G. Chepurin, and X. Cao, 2000a: A simple ocean data assimilation analysis of the global upper ocean 1950–95. Part II: Results. *J. Phys. Oceanogr.*, **30**, 311–326.

Carton, J. A., G. Chepurin, X. Cao, and B. Giese, 2000b: A simple ocean data assimilation analysis of the global upper ocean 1950–95. Part I: Methodology. *J. Phys. Oceanogr.*, **30**, 294–309.

Chao, Y., and S. G. H. Philander, 1993: On the structure of the Southern Oscillation. *J. Climate*, **6**, 450–469.

D’Andrea, F., and R. Vautard, 2000: Reducing systematic errors by empirically correcting model errors. *Tellus*, **52A**, 21–41.

da Silva, A. M., C. C. Young, and S. Levitus, 1994: *Algorithm and Procedures*. Vol. 1, *Atlas of Surface Marine Data 1994*, NOAA Atlas NESDIS 6, 83 pp.

Dee, D. P., and R. Todling, 2000: Data assimilation in the presence of forecast bias: The GEOS moisture analysis. *Mon. Wea. Rev.*, **128**, 3268–3282.

Dee, D. P., and A. M. da Silva, 1998: Data assimilation in the presence of forecast bias. *Quart. J. Roy. Meteor. Soc.*, **124**, 269–295.

DelSole, T., and A. Y. Hou, 1999: Empirical correction of a dynamical model. Part I: Fundamental issues. *Mon. Wea. Rev.*, **127**, 2533–2545.

Friedland, B., 1969: Treatment of bias in recursive filtering. *IEEE Trans. Autom. Control*, **AC-14**, 359–367.

Gent, P. R., F. O. Bryan, G. Danabasoglu, S. C. Doney, W. R. Holland, W. G. Large, and J. C. McWilliams, 1998: The NCAR Climate System Model global ocean component. *J. Climate*, **11**, 1287–1306.

Griffith, A. K., and N. K. Nichols, 1996: Accounting for model error in data assimilation using adjoint methods. Computational differentiation: Techniques, applications, and tools. *Proc. Second Int. SIAM Workshop on Computational Differentiation*, Santa Fe, NM, Society for Industrial and Applied Mathematics, 195–204.

Griffith, A. K., and N. K. Nichols, 2000: Adjoint methods in data assimilation for estimating model error. *Flow Turbul. Combust.*, **65**, 469–488.

Ignagni, M., 1981: An alternate derivation and extension of Friedland’s two-stage Kalman estimator. *IEEE Trans. Autom. Control*, **AC-26**, 746–750.

Ignagni, M., 1990: Separate-bias Kalman estimator with bias state noise. *IEEE Trans. Autom. Control*, **AC-35**, 338–341.

Kamen, E. W., and J. K. Su, 1999: *Introduction to Optimal Estimation*. Springer, 380 pp.

Levitus, S., R. Burgett, and T. Boyer, 1994: *Salinity*. Vol. 3, *World Ocean Atlas 1994*, NOAA Atlas NESDIS 3, 99 pp.

Martin, M. J., M. J. Bell, and N. K. Nichols, 2002: Estimation of systematic error in an equatorial ocean model using data assimilation. *Int. J. Numer. Methods Fluids*, **40**, 435–444.

Mendel, J. M., 1976: Extension of Friedland’s bias filtering technique to a class of non-linear systems. *IEEE Trans. Autom. Control*, **AC-21**, 296–298.

Saha, S., 1992: Response of the NMC MRF model to systematic-error correction within integration. *Mon. Wea. Rev.*, **120**, 345–360.

Smith, R. D., M. E. Maltrud, F. O. Bryan, and M. W. Hecht, 2000: Numerical simulation of the North Atlantic Ocean at 1/10 degrees. *J. Phys. Oceanogr.*, **30**, 1532–1561.

Smith, T. M., and R. W. Reynolds, 2003: Extended reconstruction of global sea surface temperature based on COADS data (1854–1997). *J. Climate*, **16**, 1495–1510.

Stephens, C., J. I. Antonov, T. P. Boyer, M. E. Conkright, A. Locarini, T. D. O’Brien, and H. C. Garcia, 2002: *Temperature*. Vol. 1, *World Ocean Atlas 2001*, NOAA Atlas NESDIS 49, 167 pp.

Stockdale, T. N., A. J. Busalacchi, D. E. Harrison, and R. Seager, 1998: Ocean modeling for ENSO. *J. Geophys. Res.*, **103**, 14325–14355.

Sun, S., and R. Bleck, 2001: Thermohaline circulation studies with an isopycnic coordinate ocean model. *J. Phys. Oceanogr.*, **31**, 2761–2782.

Tandon, A., and K. Zahariev, 2001: Quantifying the role of mixed layer entrainment for water mass transformation in the North Atlantic. *J. Phys. Oceanogr.*, **31**, 1120–1131.

Thacker, E. C., and C. C. Lee, 1972: Linear filtering in the presence of time-varying bias. *IEEE Trans. Autom. Control*, **AC-17**, 828–829.

Thiebaux, H. J., and L. L. Morone, 1990: Short-term systematic errors in global forecasts: Their estimation and removal. *Tellus*, **42A**, 209–229.

Zhou, D. H., Y. X. Sun, Y. G. Xi, and Z. J. Zhang, 1993: Extension of Friedland’s separate-bias estimation to randomly time-varying bias of nonlinear systems. *IEEE Trans. Autom. Control*, **AC-38**, 1270–1273.

Bias model experiments. Each experiment begins with the same initial conditions on 1 Jan 1970 and is carried out for 31 yr. The bias models are described in section 4.