• Alves, O., and C. Robert, 2005: Tropical Pacific Ocean model error covariances from Monte Carlo simulations. Quart. J. Roy. Meteor. Soc., 131, 3643–3658.

  • Bell, M. J., M. J. Martin, and N. K. Nichols, 2004: Assimilation of data into an ocean model with systematic errors near the equator. Quart. J. Roy. Meteor. Soc., 130, 873–893.

  • Borovikov, A., M. M. Rienecker, C. L. Keppenne, and G. C. Johnson, 2005: Multivariate error covariance estimates by Monte Carlo simulation for assimilation studies in the Pacific Ocean. Mon. Wea. Rev., 133, 2310–2334.

  • Burgers, G., P. J. van Leeuwen, and G. Evensen, 1998: Analysis scheme in the Ensemble Kalman Filter. Mon. Wea. Rev., 126, 1719–1724.

  • Cooper, M., and K. Haines, 1996: Altimetric assimilation with water property conservation. J. Geophys. Res., 101, 1059–1077.

  • Dee, D. P., and A. K. Da Silva, 1998: Data assimilation in the presence of forecast bias. Quart. J. Roy. Meteor. Soc., 124, 269–295.

  • Evensen, G., 1994: Sequential data assimilation with a nonlinear quasi-geostrophic model using Monte Carlo methods to forecast error statistics. J. Geophys. Res., 99, 10143–10162.

  • Evensen, G., 2003: The ensemble Kalman Filter: Theoretical formulation and practical implementation. Ocean Dyn., 53, doi:10.1007/s10236-003-0036-9.

  • Fischer, M., M. Flügel, and M. Ji, 1997: The impact of data assimilation on ENSO simulations and predictions. Mon. Wea. Rev., 125, 819–829.

  • Fu, L-L., I. Fukumori, and R. N. Miller, 1993: Fitting dynamic models to the Geosat sea level observations in the tropical Pacific Ocean. Part II: A linear, wind-driven model. J. Phys. Oceanogr., 23, 2162–2181.

  • Fukumori, I., R. Raghunath, L-L. Fu, and Y. Chao, 1999: Assimilation of TOPEX/POSEIDON altimeter data into a global ocean circulation model: How good are the results? J. Geophys. Res., 104, 25647–25665.

  • Gaspari, G., and S. E. Cohn, 1999: Construction of correlation functions in two and three dimensions. Quart. J. Roy. Meteor. Soc., 125, 723–757.

  • Hamill, T. M., 2001: Interpretation of rank histograms for verifying ensemble forecasts. Mon. Wea. Rev., 129, 550–560.

  • Houtekamer, P., and H. Mitchell, 2001: A sequential ensemble Kalman filter for atmospheric data assimilation. Mon. Wea. Rev., 129, 123–137.

  • Houtekamer, P., H. L. Mitchell, G. Pellerin, M. Buehner, M. Charron, L. Spacek, and B. Hansen, 2005: Atmospheric data assimilation with an ensemble Kalman filter: Results with real observations. Mon. Wea. Rev., 133, 604–620.

  • Ingleby, B., and M. Huddleston, 2005: Quality control of ocean profiles—historical and real time data. J. Mar. Syst., in press.

  • Kalnay, E., and Coauthors, 1996: The NCEP/NCAR 40-Year Reanalysis Project. Bull. Amer. Meteor. Soc., 77, 437–471.

  • Keppenne, C. L., and M. M. Rienecker, 2002: Initial testing of a massively parallel ensemble Kalman filter with the Poseidon isopycnal ocean general circulation model. Mon. Wea. Rev., 130, 2951–2965.

  • Keppenne, C. L., and M. M. Rienecker, 2003: Assimilation of temperature into an isopycnal ocean general circulation model using a parallel ensemble Kalman filter. J. Mar. Syst., 40–41, 363–380.

  • Keppenne, C. L., M. M. Rienecker, N. P. Kurkowski, and D. A. Adamec, 2005: Ensemble Kalman filter assimilation of temperature and altimeter data with bias correction and application to seasonal prediction. Nonlinear Processes Geophys., 14, 113.

  • Leeuwenburgh, O., 2005: Assimilation of along-track altimeter data in the Tropical Pacific region of a global OGCM ensemble. Quart. J. Roy. Meteor. Soc., 131, 2455–2472.

  • Le Traon, P. Y., F. Nadal, and N. Ducet, 1998: An improved mapping method of multisatellite altimeter data. J. Atmos. Oceanic Technol., 15, 522–533.

  • Marsland, S., H. Haak, J. H. Jungclaus, M. Latif, and F. Röske, 2003: The Max-Planck-Institute global ocean/sea ice model with orthogonal curvilinear coordinates. Ocean Modell., 5, 91–127.

  • Stammer, D., C. Wunsch, and R. Ponte, 2000: De-aliasing of global high-frequency barotropic motions in altimeter observations. Geophys. Res. Lett., 27, 1175–1178.

  • Steele, M., R. Morley, and W. Ermold, 2001: PHC: A global ocean hydrography with a high-quality Arctic Ocean. J. Climate, 14, 2079–2087.

  • Uppala, S. M., and Coauthors, 2005: The ERA-40 re-analysis. Quart. J. Roy. Meteor. Soc., 131, 2961–3012.

  • Weaver, A. T., J. Vialard, and D. L. T. Anderson, 2003: Three- and four-dimensional variational assimilation with a general circulation model of the tropical Pacific Ocean. Part I: Formulation, internal diagnostics, and consistency checks. Mon. Wea. Rev., 131, 1360–1378.

  • Fig. 1. Data constraint errors, calculated using Eq. (8), for (left) temperature at 50, 150, and 400 m and for (right) salinity at 50 and 150 m and sea level (SL).

  • Fig. 2. Schematic of the (bottom) search ellipses for in situ data (large ellipse) and altimetry (small ellipse) and (top) the associated localization function applied to ensemble covariances for the analysis grid point located at 0°, 180°. The dots indicate the distribution of the TAO array buoys.

  • Fig. 3. Time series of ensemble spread before (F) and after (A) assimilation, averaged over all tropical Pacific observations of (a) temperature (°C), (b) salinity (psu), and (c) sea level (cm). Vertical lines in the diagrams indicate dates for which no observations were available (salinity) or for which the original assimilation diagnostics were missing.

  • Fig. 4. Time series of the RMS of model–data misfits before (F) and after (A) assimilation and for the control (C), averaged over all tropical Pacific observations of (a) temperature (°C), (b) salinity (psu), and (c) sea level (cm).

  • Fig. 5. Same as in Fig. 4, but for time series of model–data misfits.

  • Fig. 6. Rank histograms for (left) temperature, (middle) salinity, and (right) sea level in six regions: TP (23°S–23°N, 120°–275°E), TI (23°S–23°N, 30°–110°E), TA (23°S–23°N, 290°–20°E), NP (23°–50°N, 120°–260°E), NA (23°–50°N, 290°–360°E), and SH (23°–40°S, 0°–360°E). The horizontal axes represent the rank, ranging from 1 to 65, and the vertical axes represent the probability P(rank), ranging from 0 to 0.04.

  • Fig. 7. Mean difference over the experiment period between (left) data and control and between (right) analyses and control for temperature at the 50- and 150-m depth and for salinity at the 50-m depth.

  • Fig. 8. Time-dependent error (standard deviation) for the (left) control and (right) analyses for temperature at the 50- and 150-m depth and for sea level, calculated from Eq. (9).

  • Fig. 9. RMS of innovations with respect to observations of (left) temperature and (right) salinity that are not used in the assimilation. Values are averages over the period 1990–94 and are shown for the Niño boxes and the following additional regions: western subtropical North Pacific: WSTNP (10°–30°N, 120°–190°E); eastern subtropical North Pacific: ESTNP (10°–30°N, 190°–260°E); western subtropical South Pacific: WSTSP (10°–30°S, 143°–200°E); eastern subtropical South Pacific: ESTSP (10°–30°S, 200°–300°E); and western extratropical North Pacific: WXTNP (30°–60°N, 120°–190°E).

  • Fig. 10. Time series of the zonal velocity at (left) 0°, 165°E and 150-m depth and at (right) 0°, 220°E and 120-m depth from the control and analyses, compared with current meter measurements.

  • Fig. 11. Profiles of (left two panels) velocity bias and (right two panels) RMS differences between the control and data (dotted) and between the analyses and data (solid) at 165° and 220°E on the equator.


Validation of an EnKF System for OGCM Initialization Assimilating Temperature, Salinity, and Surface Height Measurements

Olwijn Leeuwenburgh, Royal Netherlands Meteorological Institute (KNMI), De Bilt, Netherlands


Abstract

Results are presented from a decade-long assimilation run with a 64-member OGCM ensemble in a global configuration. The assimilation system can be used to produce ocean initial conditions for seasonal forecasts. The ensemble is constructed with the Max Planck Institute Ocean Model, where each member is forced by differently perturbed 40-yr European Centre for Medium-Range Weather Forecasts Re-Analysis atmospheric fields over sequential 10-day intervals. Along-track altimetric data from the European Remote Sensing and the Ocean Topography Experiment (TOPEX)/Poseidon satellites, as well as quality-controlled subsurface temperature and salinity profiles, are subsequently assimilated using the standard formulation of the ensemble Kalman filter. The applied forcing perturbation method and data selection and processing procedures are described, as well as a framework for the construction of appropriate data constraint error models for all three data types. The results indicate that the system is stable, does not experience a tendency toward ensemble collapse, and provides smooth analyses that are closer to withheld data than an unconstrained control run. Subsurface bias and time-dependent errors are reduced by the assimilation but not entirely removed. Time series of assimilation and ensemble statistics also indicate that the model is not very strongly constrained by the data because of an overspecification of the data errors. A comparison of equatorial zonal velocity profiles with in situ current meter data shows mixed results. A shift in the time-mean profile in the central Pacific is primarily associated with an assimilation-induced bias. The use of an adaptive bias correction scheme is suggested as a solution to this problem.

* Current affiliation: IMAU, Utrecht University, Utrecht, Netherlands

Corresponding author address: Dr. Olwijn Leeuwenburgh, IMAU, Utrecht University, P.O. Box 80005, 3508 TA Utrecht, Netherlands. Email: o.leeuwenburgh@phys.uu.nl

1. Introduction

One of the main aims of climate centers over the previous decade has been the development of operational seasonal forecast systems, with the initial focus on improving forecasts of El Niño–Southern Oscillation (ENSO). The data assimilation schemes used in these systems have become increasingly sophisticated. The main development over recent years has been toward multivariate methods that allow for the simultaneous correction of several state variables (rather than only the observed one) and that additionally conserve dynamical balances. A common application has been the estimation of subsurface corrections from sea level observations, based on mode decomposition (Fukumori et al. 1999), on raising or lowering the temperature–salinity (TS) profile (Cooper and Haines 1996), on regression between sea level and EOFs of observed subsurface variability (Fischer et al. 1997), or on multivariate covariances estimated from model runs (Borovikov et al. 2005). Model error statistics and relationships between model variables are typically kept fixed over time in these applications.

In incremental four-dimensional variational data assimilation (4DVAR) methods, the propagation of the model error covariances by the tangent linear model is implicit. Error variances can be diagnosed at the start of each new window using the previous analysis increments (Weaver et al. 2003). The variational system described by Weaver et al. (2003) uses a univariate background covariance matrix, but the current development of 4DVAR systems is toward the implementation of multivariate constraints (TS relationship, geostrophy) into the error covariances.

An assimilation method that utilizes time-evolving multivariate statistics is the ensemble Kalman filter (EnKF; Evensen 1994). Keppenne and Rienecker (2003) developed an ocean data assimilation system based on the EnKF and presented initial results from the assimilation of temperature profiles. These results suggested that the EnKF can outperform a simple optimal interpolation scheme. Houtekamer et al. (2005) found that the quality of atmospheric analyses obtained with the EnKF was similar to that obtained using a 3D variational procedure. Identical-twin experiments, in which the EnKF was used to assimilate sea level in the tropical Pacific section of an OGCM ensemble, were recently presented by Leeuwenburgh (2005). It was shown that subsurface corrections could be obtained in all state variables and that the analyzed fields were closer to the truth than the unconstrained control run. In the studies of both Keppenne and Rienecker (2003) and Leeuwenburgh (2005; Figs. 3 and 4), examples were presented showing the evolution of ensemble-based error covariances over time, capturing seasonal and interannual changes in the mean state.

In this paper, an EnKF system is presented that assimilates subsurface T and S profiles as well as altimetric sea level. It was developed to produce analyzed ocean states that can be used as initial conditions for seasonal forecasts and is tested here during a 10-yr hindcast run covering the 1987–96 period. The assimilation of real data requires the construction of error models that incorporate representation errors resulting from the absence of small-scale eddy variability in the model. Such error models are constructed here for the Max Planck Institute Ocean Model (MPIOM), which is run in a global configuration with 23 vertical depth layers and increased horizontal resolution (0.5° in latitude) in the Tropics and in the high-latitude sinking regions [see Marsland et al. (2003) for more details]. Sea level is a prognostic variable in this model. Because the truth is only known approximately from the data, the quality of the analyses needs to be determined from comparisons with withheld data and from statistical tests of the consistency between model and data uncertainty estimates. Practical problems associated with the use of such a complex system are highlighted and discussed.

In section 2, the forcing perturbation method will be explained. Sections 3 and 4 are concerned with the description of data sources and processing and with the estimation of appropriate data constraint errors. The assimilation method is discussed in section 5. Results from the 10-yr assimilation run are presented in section 6, and section 7 ends with a summary and some concluding remarks.

2. Representation of uncertainty in the surface forcing

The ocean model is forced using daily fields of surface wind stress, 10-m wind speed, 2-m air temperature, dewpoint temperature, precipitation, cloud cover, and net incoming solar radiation from the 40-yr European Centre for Medium-Range Weather Forecasts (ECMWF) Re-Analysis (ERA-40; Uppala et al. 2005). Together with the sea surface temperature and sea ice characteristics, which are evolved by the model equations, these fields determine the surface exchange of momentum, heat, and freshwater between ocean and atmosphere, which are the main mechanisms responsible for driving the ocean circulation and setting the upper ocean stratification.

Because their exact values are not known, the forcings should be treated as stochastic variables. It is assumed that the ERA-40 estimates are unbiased but contain a certain Gaussian distributed error. It is further assumed that the statistics of differences between daily ERA-40 and National Centers for Environmental Prediction–National Center for Atmospheric Research (NCEP–NCAR) reanalysis (Kalnay et al. 1996) fields can be used as proxies for those of the true errors in ERA-40. However, because of differences in the treatment of ice-covered parts of the ocean, the difference between the reanalyses may not be an appropriate proxy for the true error in polar regions. The presence of unrealistic air temperature differences in regions that are known to contain significant amounts of sea ice during winter indicates that this is the case.

Because hydrographic observations are very sparse in the Southern Ocean south of 40°S (see next section), it is expected that they will be insufficient to constrain the ocean model state to any significant degree. Also, because at this stage seasonal forecast attempts focus primarily on the Tropics, which may be assumed to be insensitive to the exact magnitude of ocean–atmosphere exchanges at high latitudes, it was decided not to attempt to represent forcing uncertainty at high latitudes. The ERA-40–NCEP differences are therefore slowly relaxed to zero within a 20° latitude band south of 40°S. Given the much larger amount of data in the Northern Hemisphere, the same procedure is applied there north of 50°N.
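The relaxation of the difference fields toward zero at high latitudes can be sketched as a latitude-dependent weight. This is a minimal illustration only: the cosine shape of the ramp and the function name `latitude_taper` are assumptions, since the text states only that the differences are slowly relaxed to zero over a 20° band poleward of 40°S and 50°N.

```python
import numpy as np

def latitude_taper(lat_deg, south_edge=-40.0, north_edge=50.0, band_width=20.0):
    """Weight in [0, 1] to multiply the reanalysis difference fields with.

    The weight is 1 between the two edges and falls to 0 across `band_width`
    degrees poleward of each edge. The cosine ramp is an assumed shape.
    """
    lat = np.asarray(lat_deg, dtype=float)
    # Degrees poleward of the nearest edge (0 inside the full-weight zone).
    poleward = np.maximum(np.maximum(south_edge - lat, lat - north_edge), 0.0)
    frac = np.clip(poleward / band_width, 0.0, 1.0)
    return 0.5 * (1.0 + np.cos(np.pi * frac))

lats = np.array([-70.0, -60.0, -50.0, -40.0, 0.0, 50.0, 60.0, 70.0])
weights = latitude_taper(lats)
```

Multiplying a difference field by `weights` (broadcast along longitude) leaves the Tropics and midlatitudes untouched and removes the perturbation signal poleward of 60°S and 70°N.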

A combined EOF decomposition of the daily difference fields of zonal and meridional surface stress, air and dewpoint temperature, and shortwave radiation over the period 1992–93 is used to obtain the dominant spatial patterns and their time evolution. This is an extension of the approach taken by Alves and Robert (2005), who used EOF analysis to identify the leading orthogonal modes of intraseasonal ERA-40–NCEP wind stress differences only. The combined EOF decomposition produces different but temporally covarying spatial patterns for the individual variables. This approach is chosen to decrease the possibility that perturbations to the various forcing variables have opposing effects on sea surface temperature, which would reduce their effectiveness in maintaining ensemble spread. The 2-yr period was chosen because it is representative of the variability in reanalysis differences over the assimilation period and because a longer period would require too many computational resources for the EOF analysis. However, during El Niño conditions, one could imagine switching to a different perturbation set with larger (smaller) values in the western (central) part of the basin.

The spectrum of the daily differences is dominated by the seasonal cycle, representing more than 40% of the variance, and by short time scales associated with weather systems. The regular part of the seasonal cycle is perfectly predictable and therefore not of interest for ensemble forecasting. Weather systems, on the other hand, tend only to induce local changes in the ocean state (e.g., mixed layer deepening) but do not directly affect the large scales. Variability on intermediate time scales has been suggested as a possible trigger for ENSO events. The Madden–Julian oscillation, for example, has been associated with westerly wind bursts in the western tropical Pacific on 30–60-day time scales. To ensure that the EOF patterns are representative of variability on such time scales, all variability on time scales shorter than 20 days is removed from the record prior to the EOF decomposition.
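The removal of short time scales before the EOF decomposition can be illustrated with a simple spectral low-pass filter. The text does not say which filter was used, so the sharp FFT cutoff below, the function name `lowpass_fft`, and the synthetic test signal are all assumptions made for illustration.

```python
import numpy as np

def lowpass_fft(series, cutoff_days=20.0, dt_days=1.0):
    """Remove all variability with periods shorter than `cutoff_days`."""
    x = np.asarray(series, dtype=float)
    spectrum = np.fft.rfft(x)
    freq = np.fft.rfftfreq(x.size, d=dt_days)  # cycles per day
    spectrum[freq > 1.0 / cutoff_days] = 0.0   # zero the high-frequency part
    return np.fft.irfft(spectrum, n=x.size)

# Synthetic daily record: a 60-day oscillation plus 5-day "weather" noise.
t = np.arange(720.0)
slow = np.sin(2.0 * np.pi * t / 60.0)
fast = 0.5 * np.sin(2.0 * np.pi * t / 5.0)
filtered = lowpass_fft(slow + fast)
```

In this example the 5-day component is removed and the 60-day component is retained, which mimics keeping intraseasonal variability while discarding synoptic time scales.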

The first 80 EOFs represent 99% of the total variance of the resulting normalized difference fields. The forcing perturbations are constructed as random combinations of these 80 EOFs where temporal correlation between perturbations is enforced. The temporal decorrelation scales are determined from each associated principal component time series separately.
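One way to build temporally correlated random combinations of EOFs is to evolve each expansion coefficient as a first-order autoregressive, AR(1), process whose decorrelation time is taken from the corresponding principal component. The AR(1) form, the function name `make_perturbations`, and the toy dimensions below are assumptions; the text states only that temporal correlation is enforced with mode-specific decorrelation scales.

```python
import numpy as np

def make_perturbations(eofs, pc_std, decorr_days, n_steps, rng, dt_days=1.0):
    """Return an (n_steps, n_space) array of forcing perturbations.

    eofs        : (n_modes, n_space) orthonormal spatial patterns
    pc_std      : (n_modes,) standard deviation of each principal component
    decorr_days : (n_modes,) e-folding decorrelation time of each component
    """
    n_modes = eofs.shape[0]
    # Lag-1 autocorrelation implied by each decorrelation time.
    alpha = np.exp(-dt_days / np.asarray(decorr_days, dtype=float))
    coeffs = np.empty((n_steps, n_modes))
    coeffs[0] = rng.standard_normal(n_modes)
    for t in range(1, n_steps):
        noise = rng.standard_normal(n_modes)
        # AR(1) step that preserves unit variance for every mode.
        coeffs[t] = alpha * coeffs[t - 1] + np.sqrt(1.0 - alpha**2) * noise
    return (coeffs * np.asarray(pc_std)) @ eofs

rng = np.random.default_rng(0)
q, _ = np.linalg.qr(rng.standard_normal((50, 3)))
eofs = q.T                                  # 3 toy EOFs on 50 grid points
pc_std = np.array([3.0, 2.0, 1.0])
pert = make_perturbations(eofs, pc_std, [10.0, 5.0, 2.0], 20000, rng)
pcs = pert @ eofs.T                         # project back onto the EOFs
```

Each ensemble member would use its own random sequence, so that members are forced by different but statistically equivalent perturbations.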

3. Data sources and selection criteria

Satellite altimetry data from the European Remote Sensing (ERS) satellites ERS-1 and ERS-2, and Ocean Topography Experiment (TOPEX)/Poseidon missions were prepared by Collecte Localisation Satellites (CLS; Le Traon et al. 1998). Prior to assimilation, the data are further smoothed along track with a 100-km low-pass filter and subsampled at 50-km intervals. This results in approximately 7500 sea level height observations at each assimilation step.

All temperature and salinity observations are taken from the dataset prepared especially for the Enhanced Ocean Data Assimilation and Climate Prediction (ENACT) project at the Met Office (Ingleby and Huddleston 2005). The primary source for this set is the World Ocean Database 2001, but for the periods covered in this study, it is supplemented with expendable bathythermograph data from the Bureau of Meteorology Research Center/Commonwealth Scientific and Industrial Research Organisation, conductivity temperature depth (CTD) casts from the Pacific Marine Environmental Laboratory, and data from the Global Temperature and Salinity Profile Program. All these data are checked for duplicates and are quality controlled. Quality flags are based on checks for spikes, location errors, and vertical consistency and on background comparisons and buddy checks. Superobservations are created for moorings with high-frequency output. Only data that are flagged “good” or “probably good” are assimilated.

Only observations corresponding to model grid points where the bottom is below 500 m are assimilated. Observations below 800-m depth are discarded because no serious effort is made to simulate the dynamics of the deeper layers. To increase the speed of the calculations, the profiles are subsampled in the vertical, retaining only two samples per model layer. From the moment of introduction of altimetry data in 1992, the temperature and salinity profile dataset is further subsampled in the horizontal, retaining no more than one profile per model grid box. Although the numbers vary during the run, typically 9000 ± 1000 temperature observations and 2500 ± 800 salinity observations are assimilated every 10 days. Only observations located within a limited region surrounding each analysis grid point are selected for the updates, as will be discussed in section 5. The assigned data error variance is the greater value of the assumed processed data error (1°C for temperature and 0.17 psu for salinity) and the effective data constraint error estimated as described in the next section.
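The rule for the assigned data error variance amounts to applying a floor to the estimated constraint error. The sketch below assumes that the quoted values of 1°C and 0.17 psu are standard deviations that are squared to obtain the floor variances; the function name and the sample inputs are hypothetical.

```python
import numpy as np

# Assumed processed-data errors (standard deviations) quoted in the text.
PROCESSED_DATA_ERROR = {"temperature": 1.0, "salinity": 0.17}  # degC, psu

def assigned_error_variance(variable, constraint_error_variance):
    """Greater of the processed-data variance and the constraint error variance."""
    floor = PROCESSED_DATA_ERROR[variable] ** 2
    return np.maximum(floor, np.asarray(constraint_error_variance, dtype=float))

temp_var = assigned_error_variance("temperature", [0.25, 1.0, 2.25])
salt_var = assigned_error_variance("salinity", [0.01, 0.04])
```

With this floor, observations in quiet regions are never trusted more than the processed-data error allows, while observations in energetic regions keep their larger constraint error.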

4. Data constraint error

The estimation of appropriate data errors follows the method and terminology of Fu et al. (1993), later extended by Fukumori et al. (1999).

The idea underlying this approach is that a model is only able to represent a limited fraction of the total number of degrees of freedom of the real ocean. For example, only variability on spatial scales greater than a few gridbox lengths can typically be reliably reproduced by finite difference schemes. Even on these scales, however, the model state m is likely to differ from the real ocean as a result of errors in parameterizations and in the model forcing. It is therefore useful to introduce the concept of a “true” model state s, defined as the projection or mapping of the true ocean state onto the space spanned by all realizable model solutions. Considering the intended application of the ocean analyses as initial states for seasonal forecasts, the aim of data assimilation here is estimation of the true model state. (An alternative view of data assimilation, perhaps appropriate in nowcasting applications, is that of a physics-based interpolation of the data.)

While the difference between the observations y and the true ocean state is the measurement error, the difference with the true model state s contains an additional contribution, the “representation error,” which is associated with the incomplete representation of the observed quantity by the model. The effective data constraint error r is therefore the sum of representation and measurement errors. In many ocean applications, the former can be much larger than the latter. In a similar fashion, a particular model solution m can be written as the sum of the true model state s and the model error p.

Fu et al. (1993) suggested a method to obtain estimates for the data constraint error and applied it within the context of the assimilation of altimetric sea level data. Given a long unconstrained model integration in conjunction with the observations available over the same time period, the data and model variances and the cross covariances between them can be determined according to
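$$\langle \mathbf{y}\mathbf{y}^{\mathrm{T}} \rangle = \langle \mathbf{s}\mathbf{s}^{\mathrm{T}} \rangle + \langle \mathbf{r}\mathbf{r}^{\mathrm{T}} \rangle, \tag{1}$$
$$\langle \mathbf{m}\mathbf{m}^{\mathrm{T}} \rangle = \langle \mathbf{s}\mathbf{s}^{\mathrm{T}} \rangle + \langle \mathbf{p}\mathbf{p}^{\mathrm{T}} \rangle, \tag{2}$$
$$\langle \mathbf{y}\mathbf{m}^{\mathrm{T}} \rangle = \langle \mathbf{s}\mathbf{s}^{\mathrm{T}} \rangle, \tag{3}$$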
from which expressions for the data constraint error variance and the error variance of the control state can be derived as
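$$\langle \mathbf{r}\mathbf{r}^{\mathrm{T}} \rangle = \langle \mathbf{y}\mathbf{y}^{\mathrm{T}} \rangle - \langle \mathbf{y}\mathbf{m}^{\mathrm{T}} \rangle, \tag{4}$$
$$\langle \mathbf{p}\mathbf{p}^{\mathrm{T}} \rangle = \langle \mathbf{m}\mathbf{m}^{\mathrm{T}} \rangle - \langle \mathbf{y}\mathbf{m}^{\mathrm{T}} \rangle. \tag{5}$$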
In deriving these equations, it is assumed that all errors are additive, have zero mean, and are uncorrelated with each other. By removing the mean of each record prior to the estimation of the variances, it is ensured that they only represent the time-dependent part of the data error.
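The estimation can be checked on synthetic data. The sketch below assumes the decomposition y = s + r and m = s + p with zero-mean, mutually uncorrelated s, r, and p, in which case the data constraint error variance is var(y) − cov(y, m) and the control error variance is var(m) − cov(y, m). The function name and the synthetic series are illustrative assumptions, not the processing used for the real data.

```python
import numpy as np

def constraint_error_estimates(y, m):
    """Estimate <rr> and <pp> from collocated observation and model series."""
    # Remove the record means so only the time-dependent part is measured.
    y = np.asarray(y, dtype=float) - np.mean(y)
    m = np.asarray(m, dtype=float) - np.mean(m)
    var_y = np.mean(y * y)
    var_m = np.mean(m * m)
    cov_ym = np.mean(y * m)
    return var_y - cov_ym, var_m - cov_ym

rng = np.random.default_rng(1)
n = 200_000
s = rng.normal(0.0, 1.0, n)   # "true" model state
r = rng.normal(0.0, 0.5, n)   # data constraint error, variance 0.25
p = rng.normal(0.0, 0.8, n)   # model error, variance 0.64
rr, pp = constraint_error_estimates(s + r, s + p)
```

With real data the records are far shorter than in this example, so the estimates are noisier and can even become negative where the assumptions of uncorrelated errors are violated.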

The above approach is used here to obtain estimates of data constraint error variances for altimetric sea level measurements as well as in situ measurements of temperature and salinity. All in situ data between January 1987 and December 1999 are averaged on a monthly basis in 1° bins and within standard depth layers. Corresponding monthly mean model values over the same period are obtained from the output of the control run. Figure 1 shows the resulting estimates of the data constraint error $\langle \mathbf{r}\mathbf{r}^{\mathrm{T}} \rangle^{1/2}$ for temperature and salinity observations at different depth levels. Error estimates are given for every grid box that contains more than one observation over the 13-yr period. Increased values are associated with regions characterized by strong internal variability. In these regions, the model contains too little energy, which can primarily be related to the fact that the model does not properly resolve small scales. However, large errors may also point to regions where the model forcing has large errors. Large values for the data constraint error variances extend to deeper layers only in regions where the ocean variability has a strong barotropic component, such as the western boundary currents. The errors in the Tropics, on the other hand, are largely confined to the surface and thermocline depths, illustrated by the fact that the highest temperature error values are found at shallower depths when going eastward along the equator, following the shoaling thermocline.

Observations of salinity are much sparser than observations of temperature. The World Ocean Circulation Experiment sections clearly stand out as the main sources for salinity observations away from the busy shipping routes, which are primarily located in the Northern Hemisphere.

While the above method appears to produce useful estimates for the time-varying error component, it does not account for constant biases between observations and model. While this is no problem when assimilating anomalies, as is the usual practice with sea level data, significant biases between model and data exist for subsurface quantities such as temperature and salinity. As with the time-varying error components, such biases can often be related to the use of a coarse grid (a typical example being an incorrect Gulf Stream separation point) or to systematic errors in the forcing.

It is difficult to reproduce and maintain the correct mean structure and position of a strong boundary current with a coarse-grid model, which will tend to assume its own preferred circulation structure. It would therefore make sense to relax the data constraint further in such regions. One way of doing this is to lower the relative weight given to the data by increasing the error variances (an infinite error variance is equivalent to not using the data at all). Weights can, for example, be assigned adaptively as a function of the innovation magnitude. Tests with this approach in a 4DVAR system were found to lead to undesired features in the analysis (A. Weaver 2005, personal communication). Given also that the use of such weighting will be somewhat arbitrary, an alternative method is used here that consists of adding the squared bias to the data constraint error variance defined above. Ideally, one would like to have an adaptive measure of the evolution of the bias during the assimilation run (e.g., Dee and Da Silva 1998). This would allow for a reduction of the error variance when the mean misfit between data and ensemble mean becomes smaller. No such adaptive bias correction scheme is used here, but the bias contribution to the total data error variance is reduced to a tenth of its original value after a few years in the run, apparently without causing large problems.
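The effect of adding the squared bias to the data error variance can be seen in a scalar example, where the weight given to an observation is the Kalman gain P/(P + R). The function names, the `bias_weight` parameter, and the numbers below are hypothetical and serve only to illustrate the mechanism.

```python
def total_data_error_variance(time_dependent_variance, bias, bias_weight=1.0):
    """Time-dependent constraint error variance plus a weighted squared bias."""
    return time_dependent_variance + bias_weight * bias ** 2

def scalar_gain(forecast_variance, data_error_variance):
    """Weight given to a single observation of a single state variable."""
    return forecast_variance / (forecast_variance + data_error_variance)

forecast_variance = 1.0
r_no_bias = total_data_error_variance(1.0, bias=0.0)
r_full_bias = total_data_error_variance(1.0, bias=2.0)
# Bias contribution reduced to a tenth, as done later in the run.
r_reduced = total_data_error_variance(1.0, bias=2.0, bias_weight=0.1)
gains = [scalar_gain(forecast_variance, r) for r in (r_no_bias, r_full_bias, r_reduced)]
```

In this toy case a bias of twice the error standard deviation lowers the weight from 0.5 to about 0.17, and reducing the bias contribution to a tenth restores it to about 0.42.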

5. Analysis algorithm

The EnKF (Evensen 1994, 2003) applies the Kalman filter equation simultaneously to a model ensemble of finite size N. The forecast error covariance matrix 𝗣 is the same for all ensemble members and is calculated as the spread around the ensemble mean,
$$\mathbf{P} = \frac{\mathbf{A}'\mathbf{A}'^{\mathrm{T}}}{N - 1},$$
where the n × N matrix 𝗔 = (ψ_1, . . . , ψ_N) holds the ensemble of n-dimensional state vectors and primes indicate anomalies. For the resulting analysis ensemble to have the correct spread, the observations should also be treated as stochastic quantities, which is achieved by using an ensemble of randomly perturbed observations (Burgers et al. 1998). In practice, the use of a finite-sized ensemble has two important consequences. First, only N directions of the solution space can be sampled, whereas in many practical applications, the number of effective degrees of freedom of the model is many times greater. Second, because errors in ensemble-based variance estimates decrease only in proportion to 1/√N, the accuracy of the Kalman gains and model state error covariances will be limited for practical ensemble sizes. An approach that addresses both problems to some extent is commonly known as “local analysis.” In calculating the analysis for each individual grid column (which is done in parallel), the weights (covariances) given to observations that lie far away from the grid point are set to zero (the “global analysis” uses all observations for each grid point). The proper way to apply this “covariance localization” involves the Schur product of the empirical covariances with an analytical correlation function C with local support (Gaspari and Cohn 1999). This method was applied by Houtekamer and Mitchell (2001) and by Keppenne and Rienecker (2002) in atmospheric and oceanographic applications, respectively. Including the Schur product, defined by
$$(\mathsf{C} \circ \mathsf{X})_{ij} = \mathsf{C}_{ij}\,\mathsf{X}_{ij},$$
the final form of the standard EnKF equation becomes
$$\mathsf{A}^{a} = \mathsf{A} + \left[\mathsf{C} \circ \left(\mathsf{P}\mathsf{H}^{\mathrm{T}}\right)\right]\left[\mathsf{C} \circ \left(\mathsf{H}\mathsf{P}\mathsf{H}^{\mathrm{T}}\right) + \mathsf{R}\right]^{-1}\left(\mathsf{D} - \mathsf{H}\mathsf{A}\right), \qquad (6)$$
where C is Eq. (4.10) of Gaspari and Cohn (1999), 𝗛 is the measurement operator, and 𝗗 = d + 𝗘 is the sum of the data vector d and an ensemble of observation perturbations 𝗘. The observation error covariance matrix 𝗥 holds the data constraint error variances determined in the previous section on its diagonal and zeros elsewhere (twin experiments with different formulations for 𝗥 showed that a diagonal form yielded the best results). Following Houtekamer and Mitchell (2001), the order of application of the Schur product and measurement operator has been interchanged. Following Keppenne and Rienecker (2002), the half axes of the local support ellipse are chosen as 30° in the zonal direction and 15° in the meridional direction, respectively, for sea level, while twice these values were used for temperature and salinity data (Fig. 2). A vertical decorrelation scale of 500 m is chosen. The three types of observations are assimilated sequentially, which is justified when the data errors are uncorrelated. This condition is met for the instrument errors, but the situation is less clear for the data constraint errors estimated here. For example, resolution issues are found to affect all three observed variables in the midlatitude western boundary currents. However, as a first step, it will be assumed in the following section that all data errors are uncorrelated.
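
The localized, perturbed-observation update described in this section can be sketched as follows. This is a minimal one-dimensional illustration under stated assumptions, not the parallel grid-column implementation used here: the function names, the toy grid, the observation positions, and the localization half-width are all hypothetical. The correlation function is the fifth-order piecewise rational function of Gaspari and Cohn (1999), and the Schur product is applied after the measurement operator, as in the text.

```python
import numpy as np

def gaspari_cohn(r):
    """Compactly supported fifth-order correlation function.

    r is an array of distances divided by the localization half-width;
    the function equals 1 at r = 0 and vanishes for r >= 2.
    """
    r = np.abs(np.asarray(r, dtype=float))
    c = np.zeros_like(r)
    near = r <= 1.0
    far = (r > 1.0) & (r < 2.0)
    x = r[near]
    c[near] = 1 - 5/3*x**2 + 5/8*x**3 + 1/2*x**4 - 1/4*x**5
    x = r[far]
    c[far] = 4 - 5*x + 5/3*x**2 + 5/8*x**3 - 1/2*x**4 + 1/12*x**5 - 2/(3*x)
    return c

def enkf_analysis(A, H, d, obs_var, C_xy, C_yy, rng):
    """Perturbed-observation EnKF update with covariance localization.

    A       : (n, N) forecast ensemble, one state vector per column
    H       : (m, n) linear measurement operator
    d       : (m,) observations
    obs_var : (m,) diagonal of the observation error covariance R
    C_xy    : (n, m) localization weights, state points vs. observations
    C_yy    : (m, m) localization weights among observations
    """
    n, N = A.shape
    A_prime = A - A.mean(axis=1, keepdims=True)      # ensemble anomalies
    HA = H @ A
    HA_prime = HA - HA.mean(axis=1, keepdims=True)
    PHt = A_prime @ HA_prime.T / (N - 1)             # sample P H^T
    HPHt = HA_prime @ HA_prime.T / (N - 1)           # sample H P H^T
    # Schur (element-wise) products applied after the measurement operator
    PHt_loc = C_xy * PHt
    HPHt_loc = C_yy * HPHt
    # Perturbed observations D = d + E (Burgers et al. 1998)
    E = rng.normal(0.0, np.sqrt(obs_var)[:, None], size=(len(d), N))
    D = d[:, None] + E
    gain_times_innov = PHt_loc @ np.linalg.solve(
        HPHt_loc + np.diag(obs_var), D - HA)
    return A + gain_times_innov

# Toy usage on a 1D grid (all values are illustrative assumptions).
rng = np.random.default_rng(0)
n, N = 40, 64
x = np.arange(n, dtype=float)
A = rng.normal(size=(n, N))
obs_idx = np.array([5, 20, 35])
H = np.zeros((3, n))
H[np.arange(3), obs_idx] = 1.0
d = np.array([1.0, -0.5, 0.3])
obs_var = np.full(3, 0.1)
half_width = 3.0
C_xy = gaspari_cohn(np.abs(x[:, None] - x[obs_idx][None, :]) / half_width)
C_yy = gaspari_cohn(np.abs(x[obs_idx][:, None] - x[obs_idx][None, :]) / half_width)
A_a = enkf_analysis(A, H, d, obs_var, C_xy, C_yy, rng)
```

In this toy setup, grid points farther than twice the half-width from every observation are left unchanged by the update, which is the intended effect of the local support.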

6. Results

a. Introduction

The control run for this hindcast experiment is started from a resting state and from a prescribed density distribution taken from the Polar Science Center Hydrographic Climatologies (Steele et al. 2001). A model spinup is performed by integrating the model forward for 5 yr (1978–82) using ERA-40-based climatological forcing, followed by 4 yr (1983–86) using daily ERA-40 forcing. A model ensemble spinup is run for 2 yr using the January 1985 control state as an initial condition for all members. Each member is subsequently forced by differently perturbed ERA-40 fields, as described in section 2.

The first analysis is produced for 8 January 1987, and successive assimilation steps follow each 10-day forward integration (referred to as “forecasts”). During the first 5 yr and 9 months, only temperature and salinity profiles are assimilated. Sea level observations from TOPEX/Poseidon and ERS-1 (replaced by ERS-2 in May 1995) are additionally assimilated from 10 October 1992 onward. The final analysis is produced for 28 December 1996.

The verification and analysis of the results follow two separate approaches; the consistency between data and model, given their uncertainties, will be verified using the instantaneous ensemble statistics, as well as the average statistics obtained by comparing the ensemble mean with the data over the 10 yr. The analyses are furthermore compared with the control run in order to judge whether the assimilation has brought the model closer to the data. Finally, the results are compared with independent data that are not used in the assimilation.

b. Time series of ensemble statistics

The Kalman filter equations show that the uncertainty in the analysis will be smaller than that in the forecast if the assimilated observations have a finite error variance. When using an ensemble, these uncertainties are assumed to be proportional to the spread of the ensemble after and before assimilation. Figure 3 shows time series over the 10-yr run of the mean ensemble spread of the forecast and analysis ensembles at the locations of the observations (the values for the tropical Pacific are shown). The figure shows that uncertainty is reduced at each assimilation step throughout the run (the analysis spread is consistently smaller than the forecast spread), as expected from the theory. Furthermore, the forecast spread tends to be larger than the analysis spread at the previous time step, and shows no signs of ensemble collapse, suggesting that the forcing perturbations are effective in maintaining the forecast uncertainty at a reasonable level. The figure for salinity is much noisier than that for temperature and sea level because of the smaller number of observations over which to average.
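
The expected relation between forecast and analysis spread can be illustrated with the scalar Kalman filter. The sketch below is an illustration only: the function names are hypothetical, and the spread diagnostic is assumed here to be the ensemble standard deviation averaged over observation locations, which may differ in detail from the diagnostic plotted in Fig. 3.

```python
import numpy as np

def analysis_variance(forecast_var, obs_var):
    """Scalar Kalman filter: analysis error variance after one observation."""
    gain = forecast_var / (forecast_var + obs_var)
    return (1.0 - gain) * forecast_var

def mean_spread_at_obs(HA):
    """Ensemble standard deviation averaged over observation locations.

    HA : (m, N) ensemble mapped to m observation locations.
    """
    return float(np.mean(np.std(HA, axis=1, ddof=1)))

# A finite observation error variance always reduces the variance;
# an infinite one leaves the forecast variance unchanged.
reduced = analysis_variance(1.0, 1.0)
unchanged = analysis_variance(1.0, float("inf"))
```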

Three periods can be distinguished in the temperature time series. From 1987 to mid-1992, the ensemble maintains an approximately constant level of spread. A clear reduction of ensemble spread at the locations of the temperature observations is seen after the introduction of sea level data in October 1992. The spread increases again in the middle of 1993. This increase was found to be due to a misspecification of the data error by a factor of 2 from this point onward. While this has brought the analysis error diagnosed from the ensemble spread back to the prealtimetry level, the forecast spread typically remains at lower levels than during the 1987–92 period. The time series of ensemble spread for temperature (subsurface observations only) strongly resembles that of the spread in sea level. This suggests that the sea level assimilation has the strongest impact on the ensemble, with the temperature and salinity assimilation only adding relatively small additional corrections.

The magnitude of the corrections themselves is quantified in Figs. 4 and 5. Figure 4 shows the typical magnitude of the innovations (in terms of the root-mean-square of the observation–model misfits), which were determined for control, forecast, and analysis at each assimilation time step. Again the focus is on the tropical Pacific. Looking at the temperature innovations first, it is clear that the differences with the data are largest for the control and smallest for the analysis, with the forecast values somewhere in between. This picture is consistent throughout the run, although the differences between control and analyses become increasingly smaller toward the end of the run. The magnitude of the innovations remains more or less at a constant level. A slight reduction is seen in the difference between forecast and analysis after the first stages of the run, during which the system adjusts to a state that is more consistent with the data. A large reduction in the magnitude of the innovations is seen in the sea level record, although the initial drop is followed by a steady increase toward the end of the run. The fact that this behavior is also seen in the control innovations suggests that the analysis ensemble is not very strongly constrained by the data.

A further measure of the performance of the ensemble is presented in Fig. 5, which shows the time series of the average of the innovations themselves (as opposed to the average of the RMS values). This quantity may point toward systematic large-scale differences between the model and the data. It is possible that the tropical Pacific contains smaller regions with persistent biases of opposing sign that cancel in the average over the entire domain, but the time series may still identify long-term trends in offsets between model and data. A persistent bias is seen in the control temperatures, which is not as prominent in the ensemble results, suggesting that the assimilation has produced analyses with a better overall mean subsurface state than the control. After the introduction of sea level data, however, a slight bias of opposing sign appears in the analysis temperature. Note that this bias results from the assimilation of sea level anomalies rather than absolute values and is not reproduced in the sea level innovation time series. This suggests that it is actually the mean that causes the bias. Keppenne et al. (2005) found that such biases occur because of the change in model time-mean during the run in response to the assimilation. The model time-mean is typically determined a priori from the control run. At the time of introduction of the sea level anomaly data, however, the ensemble has already adjusted through temperature and salinity assimilation to a state that is significantly different from the control run. The control mean is therefore no longer appropriate as a reference to determine the model sea level anomalies. If the model is confronted with consistently positive (negative) innovations, the assimilation will attempt to correct this by raising (lowering) the temperatures. An adaptive bias correction scheme, as implemented by Keppenne et al. (2005), does therefore appear to be an appropriate solution. 
Other aspects of assimilation-induced bias are further discussed in section 7.
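
The difference between the two diagnostics of Figs. 4 and 5 can be made concrete with a small sketch. It assumes that an innovation is defined as observation minus model value at the observation location; the function name and the numbers are hypothetical. The example reproduces the caveat noted above: biases of opposing sign cancel in the mean innovation but not in the RMS.

```python
import numpy as np

def innovation_stats(obs, model_at_obs):
    """Mean and root-mean-square of the observation-minus-model misfits."""
    innov = np.asarray(obs, dtype=float) - np.asarray(model_at_obs, dtype=float)
    return {"mean": float(innov.mean()),
            "rms": float(np.sqrt(np.mean(innov**2)))}

# Two subregions with biases of opposite sign (illustrative values only).
obs = np.array([1.0, 1.0, -1.0, -1.0])
model_at_obs = np.zeros(4)
stats = innovation_stats(obs, model_at_obs)
```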

c. Rank histograms

A further indication of the consistency between model and data can be obtained using so-called rank histograms (see, e.g., Hamill 2001). This diagnostic was originally developed for the verification of ensemble forecasts and provides an indication of the reliability of the ensemble; if an event has a certain probability of occurring in reality, the ensemble forecast should suggest the same probability. This can be verified by repeatedly tallying the rank of the verification (in principle, the true state) relative to the sorted (from low to high) ensemble. If the ensemble is reliable, in the sense defined above, the observation and ensemble members can be considered random samples of the same probability distribution, which would result in an equal chance for any rank to occur, reflected by a flat histogram. Deviations from a flat histogram are an indication of problems with the ensemble. For example, a bias in the ensemble will cause the verification to appear at one side of the sorted ensemble more often than on the other side, resulting in a shifted and sloped rank histogram. Undervariability of the ensemble has a similar effect, but with equal chances of the verification appearing on either end of the sorted ensemble, leading to a U-shaped histogram. Figure 6 shows rank histograms based on the ensemble forecasts, using the data as verification. Hamill (2001) showed that when the verification has a known uncertainty, a random error should be added to each ensemble member before determining the rank. In this case, the estimated data constraint errors for temperature, salinity, and sea level anomalies were used. The histograms were determined separately for the three types of observations and for different geographical regions. Figure 6 shows that in all cases, there is a relative overpopulation of the middle ranks, typically indicating an excess of variability in the ensemble. 
However, when a random data error is added to the ensemble members, as is the case here, it may also indicate an overly pessimistic data error estimate. No indication of bias is found for the sea level verification, which is not surprising since sea level anomalies rather than absolute values are compared. For temperature, and in particular salinity, however, there are indications of offsets between the model ensemble and the data (in the North Atlantic in particular). There are also regions, such as the tropical Indian Ocean, where a U-shaped appearance of the histogram is found in combination with an apparent excess of ensemble variability. The high number of extreme ranks may indicate different biases in subregions or large changes in the bias during the run.
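
The rank histogram procedure, including the random data error added to each member as recommended by Hamill (2001), can be sketched as follows. This is an illustration on synthetic data under stated assumptions, not the diagnostic code used for Fig. 6: the function name, the sample sizes, and the error standard deviation are hypothetical. With 64 members there are 65 possible ranks, as in Fig. 6.

```python
import numpy as np

def rank_histogram(ensemble, verification, obs_std, rng):
    """Relative frequency of each rank of the verification in the ensemble.

    ensemble     : (T, N) ensemble values at T verification points
    verification : (T,) verifying observations
    obs_std      : scalar or (T,) observation error standard deviation
    Returns an array of N + 1 relative frequencies.
    """
    ensemble = np.asarray(ensemble, dtype=float)
    verification = np.asarray(verification, dtype=float)
    T, N = ensemble.shape
    std = np.broadcast_to(np.asarray(obs_std, dtype=float), (T,))
    # Add a random observation error to each member before ranking
    perturbed = ensemble + rng.normal(size=(T, N)) * std[:, None]
    # Rank = number of members below the verification (0 .. N)
    ranks = np.sum(perturbed < verification[:, None], axis=1)
    return np.bincount(ranks, minlength=N + 1) / T

# Synthetic check: members and truth are drawn from the same distribution,
# so the histogram should be approximately flat.
rng = np.random.default_rng(1)
T, N = 5000, 64
truth = rng.normal(size=T)
ensemble = rng.normal(size=(T, N))
obs_std = 0.5
obs = truth + obs_std * rng.normal(size=T)
hist = rank_histogram(ensemble, obs, obs_std, rng)
```

If obs_std passed to the function is larger than the error actually present in the observations, the perturbed ensemble becomes too wide and the middle ranks are overpopulated, which is the second interpretation offered in the text.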

d. Subsurface states

A complementary picture of some of the apparent improvements and inconsistencies identified in the previous section can be obtained by comparing the three-dimensional ocean states from the control and assimilation run (the analyses) with the data. Time-constant and time-dependent model–data differences are considered.

First of all, the temperature biases in the control and analyses are shown in Fig. 7. Differences between the data and the control run show a fully three-dimensional pattern. Surface temperatures are predominantly too low in the control run. Larger biases are associated with the displacements or strength of strong currents, most notably the North Equatorial Current and the North Equatorial Countercurrent in the eastern tropical Pacific and several of the major western boundary currents, such as the Gulf Stream, the Kuroshio, and the confluence of the Malvinas and Brazil Currents. These latter displacements are also found in the salinity signal and extend to large depth, reflecting the strongly barotropic character of these currents. The salinity bias is mainly confined to the surface (only the plot for 50-m depth is shown here) and has a very large-scale character. Of particular interest are the strong biases in salinity of the East Greenland Current, the tropical and North Pacific, and the subpolar gyre of the North Atlantic. Figure 7 also shows the analysis-control bias. In many places, the patterns have the same sign and shape as for the data-control bias but with smaller amplitudes, indicating that the assimilation has reduced the bias in the correct direction, but only partly. The correction of mid- to high-latitude biases is less effective than the correction of low-latitude biases, because only data equatorward of 50°N and 40°S have been assimilated. Also, weaker but large-scale patterns appear to have been properly adjusted, with the exception of the large data-control bias maximum in the central North Pacific. The assimilation has increased the salinity in the North Atlantic with respect to the control, but has done this just east and southward of where the data suggests it should have been increased. 
The distribution of salinity data is insufficient to judge whether the strong increase in surface salinity just west of Mexico’s Baja California peninsula is realistic. Curiously, it appears to be strongly related to a temperature bias correction at the 150-m depth instead of the 50-m depth.

The time-dependent improvements in the analyses are investigated in Fig. 8, which compares the model error $\langle pp^{\mathrm{T}} \rangle^{1/2}$ [see Eq. (5)] from the control with the corresponding estimate based on the analyzed states. The reader is reminded here that these errors do not reflect differences with the real ocean, but with the true model state, as discussed in section 4. Time-dependent temperature and salinity errors in the control (as far as they can be detected from the limited data) are primarily associated with thermocline variability in the Tropics, and to a lesser extent with the western boundary currents. Increased error levels are found at the depth of the tropical Pacific thermocline, which shoals from west to east. The corresponding figures for the analysis error estimates show only a weak reduction of the major errors found in the control on the order of 0.5°–1°C. The improvement in the sea level simulation is much more significant, however. Time-dependent sea level errors in the control run have typically been reduced by half. The large-scale patterns of sea level error in the Southern Ocean can be associated with high-frequency wind-driven barotropic motions in the ocean (Stammer et al. 2000). Because these signals are not resolved by the altimetry, which has a relatively low-frequency sampling pattern, the data error estimation method as described by Eqs. (4) and (5) incorrectly ascribes this variability to errors in the model.

e. Independent data

A final test of the performance of the assimilation system is a comparison with data that have not been used in the assimilation. First, temperature and salinity profiles that were withheld from the assimilation are used to see if the assimilation has also improved the ocean state at the locations of the withheld data. Withheld profiles over the period 1990–94 are used to calculate differences with both the control and the analyses. Typically, 500 ± 100 observations of salinity and 2000 ± 1000 observations of temperature extending to 800-m depth are used for this task at each assimilation time step. Every 10 days, the RMS innovation values are computed for different geographical regions. The mean of these values over the 5-yr period is subsequently determined for each region. The results for eight tropical and extratropical regions in the Pacific are summarized in Fig. 9. It is found that the assimilation has reduced the time-variable part of the innovations everywhere by approximately 15%, both for temperature and salinity. This test thus shows that the adjustments produced by the EnKF are not limited to the locations of the assimilated data but improve the entire model state.

A second test that is performed here uses completely independent data. Zonal velocity profiles measured by current meters at several Tropical Atmosphere Ocean (TAO) moorings are used here to check whether the analyses are better than the control. Figure 10 shows comparisons for the zonal velocity component at two sites on the equator. The first of these two, at 165°E in the western Pacific, contains a significant seasonal signal and relatively little high-frequency variability, as opposed to the second site at 220°E in the central Pacific, where velocities are also typically 1.5–2 times higher. Both the control and analyses capture the main characteristics of the current variability, which is strongly controlled by large-scale wind stress variability. A clearer picture of the differences between the data, control, and assimilation is obtained from estimates of the constant bias and time-dependent differences between the model states and the data. These are summarized in Fig. 11 for all depths at which data are available. The assimilation has shifted the entire velocity profile at 165°E to lower values. In general, this has led to offsets that are smaller in the assimilation than in the control, with the exception of the surface and 200-m depth. The uniform character of the shift suggests that the change is perhaps more strongly related to large-scale adjustments of the equatorial circulation than to the impact of assimilation near this particular site. The assimilation has left the time-dependent differences between model and data more or less unaltered, as illustrated by the RMS plot. Correlations with the data are 0.68 for the analysis and 0.69 for the control.

A very similar picture appears for the site at 220°E, the main difference being that only velocities below 50 m are affected by the assimilation. The changes at lower depths represent a reduction in the strength of the Equatorial Undercurrent (EUC). In this case, the mean assimilated state is further away from the data than the mean control state. The correlation with the data has improved very slightly, from 0.62 for the control to 0.65 for the analysis. Correlations for other profiles and depths showed differences of similar magnitude but of varying sign.

The assimilation of temperature profile data is known to affect the position and strength of the EUC negatively in many systems because of its dependence on the delicate balance between the zonal pressure gradient and surface wind stress. The adjustment of the model to high-quality data tends to disturb the balance with a relatively poorly known surface stress leading to spurious circulations. Bias correction schemes such as developed by Bell et al. (2004) may be able to counter unwanted negative impacts of assimilation on the equatorial velocity profile.

7. Summary and conclusions

A global ocean data assimilation system is presented and tested during a 10-yr hindcast experiment. In this system, the MPIOM is used to produce a 64-member ensemble of ocean forecasts that is combined with in situ profiles of temperature and salinity and along-track altimetric sea level observations using a localized version of the EnKF.

Three-dimensional data constraint error fields have been estimated for all three observables using a statistical method that compares the model states of an unconstrained control run with all available data over the same period. This method is successful in identifying representation errors in regions of the global ocean characterized by small-scale variability that cannot be resolved by the model. It also allows for the estimation of the time-dependent error variance of the control run. While this method has been used previously in sea level assimilation applications, it is found to produce very reasonable patterns and magnitudes for temperature and salinity profiles as well, and is therefore a good alternative for some of the more ad hoc methods that are common practice. Systematic differences between the control run and data are accounted for by adding a term to the data constraint error, which is reduced during the run.

A forcing perturbation method is used that represents uncertainty in four surface forcing fields by quasi-random combinations of the dominant orthogonal modes of the ERA-40–NCEP–NCAR reanalysis differences over 1992–93. The quasi-random character is imposed through temporal correlations between daily perturbations.

Results from the hindcast experiment show that the forcing perturbations are effective at maintaining ensemble spread throughout the run. Time series at the observation locations show a reduction and subsequent rebound of ensemble spread that can be associated with the introduction of sea level data and a misspecification of data error by a factor of 2, respectively. RMS values of the temperature innovations are smaller for the analyses than for the control run but the difference is reduced from 1993 onward. The similarity in the time series of RMS sea level innovations between the control and analyses suggests that the analyses are not very strongly constrained by the in situ data. This is thought to be due to the relatively small number of profile data in the tropical Pacific compared with the number of sea level data. The subsampling of the profile dataset may have contributed to this.

The impact of a weak data constraint can also be seen in a comparison of the constant and time-dependent three-dimensional error fields in the control and analyses. The systematic differences with the data are reduced by the assimilation but not removed entirely. The reduction of time-dependent errors, defined as differences with a “true” model state, is rather modest for temperature but significant for sea level, particularly in the Tropics.

Rank histograms, verifying the reliability of the ensemble based on the observations, suggest an excess of variability in the ensemble in several geographical regions for all three observables. It is argued, however, that this may also be due to an overly pessimistic estimate of the data error. This would be in agreement with the earlier observation that the assimilation has only produced fairly modest corrections and with the overspecification of errors during the last few years of the assimilation run. The rank histograms also suggest the presence of remaining model–data biases, which is consistent with the time-mean three-dimensional analysis error fields.

Comparisons with withheld temperature and salinity profiles indicate that the analyses are closer to these data than the control. This shows that the adjustments produced by the EnKF are not limited to the locations of the assimilated data but improve the entire model state.

Model comparisons with independent current meter data on the equator show limited improvement in the time-dependent part of the velocity profile, but reveal a shift in the mean profile between the control and analysis, indicative of an assimilation-induced bias.

The apparently dominant impact of the sea level assimilation on the ensemble will need to be reevaluated. One may prefer to place more weight on the in situ data. The strong indications that data constraint errors are too large in this hindcast experiment suggest that some tuning of the system should be done before it is applied to operational seasonal forecasting. A short run with reduced error values should suffice to detect remaining inconsistencies. The procedure that has been adopted here to account for biases between data and model is not entirely satisfactory. Keppenne et al. (2005) showed that the assimilation of altimetry changes the time-mean sea level that should be applied to determine the anomalies. In this case, the change in the time-mean due to the assimilation of temperature and salinity was the cause of a bias in temperature appearing after the introduction of sea level. Assimilation is also known to create a bias in the equatorial region through the disturbance of the balance between the near-surface pressure gradients and the surface wind stress. This is the most likely cause of an increase in the bias in the vertical profile of zonal velocity relative to current meter data from the central Pacific. Such biases were also found in identical-twin experiments with the EnKF, where model error is not an issue (Leeuwenburgh 2005). The implementation of recently developed bias correction schemes (Dee and Da Silva 1998; Bell et al. 2004) with ensemble methods should therefore be considered (Keppenne et al. 2005). The assimilation scheme that has been presented here could provide a convenient extension to seasonal forecast systems. The use of an ensemble allows for a probabilistic forecast rather than a deterministic one based on a single model. The next step would therefore be to test the impact of an improved ocean ensemble initialization on coupled model forecast skill.

Acknowledgments

This work was supported by the EU through its funding for the ENACT project. The author would like to thank Gerrit Burgers and Geert Jan van Oldenborgh (KNMI), Laurent Bertino (NERSC), Helmuth Haak (MPIfM), Noel Keenlyside (IfM), and Alberto Troccoli and Paul Dando (ECMWF) for their suggestions and support.

REFERENCES

  • Alves, O., and C. Robert, 2005: Tropical Pacific Ocean model error covariances from Monte Carlo simulations. Quart. J. Roy. Meteor. Soc., 131, 3643–3658.
  • Bell, M. J., M. J. Martin, and N. K. Nichols, 2004: Assimilation of data into an ocean model with systematic errors near the equator. Quart. J. Roy. Meteor. Soc., 130, 873–893.
  • Borovikov, A., M. M. Rienecker, C. L. Keppenne, and G. C. Johnson, 2005: Multivariate error covariance estimates by Monte Carlo simulation for assimilation studies in the Pacific Ocean. Mon. Wea. Rev., 133, 2310–2334.
  • Burgers, G., P. J. van Leeuwen, and G. Evensen, 1998: Analysis scheme in the Ensemble Kalman Filter. Mon. Wea. Rev., 126, 1719–1724.
  • Cooper, M., and K. Haines, 1996: Altimetric assimilation with water property conservation. J. Geophys. Res., 101, 1059–1077.
  • Dee, D. P., and A. K. Da Silva, 1998: Data assimilation in the presence of forecast bias. Quart. J. Roy. Meteor. Soc., 124, 269–295.
  • Evensen, G., 1994: Sequential data assimilation with a nonlinear quasi-geostrophic model using Monte Carlo methods to forecast error statistics. J. Geophys. Res., 99, 10143–10162.
  • Evensen, G., 2003: The ensemble Kalman Filter: Theoretical formulation and practical implementation. Ocean Dyn., 53, doi:10.1007/s10236-003-0036-9.
  • Fischer, M., M. Flügel, and M. Ji, 1997: The impact of data assimilation on ENSO simulations and predictions. Mon. Wea. Rev., 125, 819–829.
  • Fu, L-L., I. Fukumori, and R. N. Miller, 1993: Fitting dynamic models to the Geosat sea level observations in the tropical Pacific Ocean. Part II: A linear, wind-driven model. J. Phys. Oceanogr., 23, 2162–2181.
  • Fukumori, I., R. Raghunath, L-L. Fu, and Y. Chao, 1999: Assimilation of TOPEX/POSEIDON altimeter data into a global ocean circulation model: How good are the results? J. Geophys. Res., 104, 25647–25665.
  • Gaspari, G., and S. E. Cohn, 1999: Construction of correlation functions in two and three dimensions. Quart. J. Roy. Meteor. Soc., 125, 723–757.
  • Hamill, T. M., 2001: Interpretation of rank histograms for verifying ensemble forecasts. Mon. Wea. Rev., 129, 550–560.
  • Houtekamer, P., and H. Mitchell, 2001: A sequential ensemble Kalman filter for atmospheric data assimilation. Mon. Wea. Rev., 129, 123–137.
  • Houtekamer, P., H. L. Mitchell, G. Pellerin, M. Buehner, M. Charron, L. Spacek, and B. Hansen, 2005: Atmospheric data assimilation with an ensemble Kalman filter: Results with real observations. Mon. Wea. Rev., 133, 604–620.
  • Ingleby, B., and M. Huddleston, 2005: Quality control of ocean profiles—historical and real time data. J. Mar. Syst., in press.
  • Kalnay, E., and Coauthors, 1996: The NCEP/NCAR 40-Year Reanalysis Project. Bull. Amer. Meteor. Soc., 77, 437–471.
  • Keppenne, C. L., and M. M. Rienecker, 2002: Initial testing of a massively parallel ensemble Kalman filter with the Poseidon isopycnal ocean general circulation model. Mon. Wea. Rev., 130, 2951–2965.
  • Keppenne, C. L., and M. M. Rienecker, 2003: Assimilation of temperature into an isopycnal ocean general circulation model using a parallel ensemble Kalman filter. J. Mar. Syst., 40–41, 363–380.
  • Keppenne, C. L., M. M. Rienecker, N. P. Kurkowski, and D. A. Adamec, 2005: Ensemble Kalman filter assimilation of temperature and altimeter data with bias correction and application to seasonal prediction. Nonlinear Processes Geophys., 14, 1–13.
  • Leeuwenburgh, O., 2005: Assimilation of along-track altimeter data in the Tropical Pacific region of a global OGCM ensemble. Quart. J. Roy. Meteor. Soc., 131, 2455–2472.
  • Le Traon, P. Y., F. Nadal, and N. Ducet, 1998: An improved mapping method of multisatellite altimeter data. J. Atmos. Oceanic Technol., 15, 522–533.
  • Marsland, S., H. Haak, J. H. Jungclaus, M. Latif, and F. Röske, 2003: The Max-Planck-Institute global ocean/sea ice model with orthogonal curvilinear coordinates. Ocean Modell., 5, 91–127.
  • Stammer, D., C. Wunsch, and R. Ponte, 2000: De-aliasing of global high-frequency barotropic motions in altimeter observations. Geophys. Res. Lett., 27, 1175–1178.
  • Steele, M., R. Morley, and W. Ermold, 2001: PHC: A global ocean hydrography with a high-quality Arctic Ocean. J. Climate, 14, 2079–2087.
  • Uppala, S. M., and Coauthors, 2005: The ERA-40 re-analysis. Quart. J. Roy. Meteor. Soc., 131, 2961–3012.
  • Weaver, A. T., J. Vialard, and D. L. T. Anderson, 2003: Three- and four-dimensional variational assimilation with a general circulation model of the tropical Pacific Ocean. Part I: Formulation, internal diagnostics, and consistency checks. Mon. Wea. Rev., 131, 1360–1378.

Fig. 1.

Data constraint errors, calculated using Eq. (8), for (left) temperature at 50, 150, and 400 m and for (right) salinity at 50 and 150 m and sea level (SL).

Citation: Monthly Weather Review 135, 1; 10.1175/MWR3272.1

Fig. 2.

Schematic of the (bottom) search ellipses for in situ data (large ellipse) and altimetry (small ellipse) and (top) the associated localization function applied to ensemble covariances for the analysis grid point located at 0°, 180°. The dots indicate the distribution of the TAO array buoys.

Fig. 3.

Time series of ensemble spread before (F) and after (A) assimilation averaged over all tropical Pacific observations of (a) temperature (°C), (b) salinity (psu), and (c) sea level (cm), respectively. Vertical lines in the diagrams indicate dates for which no observations were available (salinity) or for which the original assimilation diagnostics were missing.

Fig. 4.

Time series of the RMS of model–data misfits before (F) and after (A) assimilation and for the control (C), averaged over all tropical Pacific observations of (a) temperature (°C), (b) salinity (psu), and (c) sea level (cm).

Fig. 5.

Same as in Fig. 4, but for time series of model–data misfits.

Fig. 6.

Rank histograms for (left) temperature, (middle) salinity, and (right) sea level in six regions: TP (23°S–23°N, 120°–275°E), TI (23°S–23°N, 30°–110°E), TA (23°S–23°N, 290°–20°E), NP (23°–50°N, 120°–260°E), NA (23°–50°N, 290°–360°E), and SH (23°–40°S, 0°–360°E). The horizontal axes represent the rank, ranging from 1 to 65, and the vertical axes represent the probability P (rank), ranging from 0 to 0.04.

Fig. 7.

Mean difference over the experiment period between (left) data and control and between (right) analyses and control for temperature at the 50- and 150-m depth and for salinity at the 50-m depth.

Fig. 8.

Time-dependent error (standard deviation) for the (left) control and (right) analyses for temperature at the 50- and 150-m depth and for sea level, calculated from Eq. (9).

Fig. 9.

RMS of innovations with respect to observations of (left) temperature and (right) salinity that are not used in the assimilation. Values are averages over the period 1990–94 and are shown for the Niño boxes and the following additional regions: western subtropical North Pacific: WSTNP (10°–30°N, 120°–190°E); eastern subtropical North Pacific: ESTNP (10°–30°N, 190°–260°E); western subtropical South Pacific: WSTSP (10°–30°S, 143°–200°E); eastern subtropical South Pacific: ESTSP (10°–30°S, 200°–300°E); and western extratropical North Pacific: WXTNP (30°–60°N, 120°–190°E).

Fig. 10.

Time series of the zonal velocity at (left) 0°, 165°E and 150-m depth and at (right) 0°, 220°E and 120-m depth from the control and analyses, compared with current meter measurements.

Fig. 11.

Profiles of (left two panels) velocity bias and (right two panels) RMS differences between the control and data (dotted) and between the analyses and data (solid) at 165° and 220°E on the equator.
