## 1. Introduction

This paper concerns the determination of error statistics for oceanic general circulation models (GCMs) and data. Error estimates are needed to understand model and data deficiencies and are prerequisite to data assimilation studies (Bennett 1992; Wunsch 1996). While considerable effort already goes into providing error estimates for oceanic data (e.g., Stammer and Wunsch 1994), little is known, quantitatively, about the skill of GCMs. And yet, misspecification of GCM errors can have disastrous consequences on data assimilation results (Dee 1995). The major difficulty has been a lack of global oceanic datasets of sufficient quality and duration to characterize the error statistics. With the advent of the TOPEX/Poseidon altimeter, which provides stringent tests for GCM errors at the surface (Fu and Smith 1996; Stammer et al. 1996), and of other large-scale ocean observation systems (e.g., the TOGA–TAO array; Hayes et al. 1991), and with the proliferation of oceanic data assimilation studies (Malanotte-Rizzoli 1996), it becomes urgent to establish a quantitative framework in which to examine GCM errors.

The questions addressed here are the following. Given a particular observation system, what components of a GCM’s error structure can be determined? Is it possible to discriminate between data and model errors? Can the altimetric data resolve the internal GCM error structure? How much data is required? Finally, what is the accuracy of the resulting error estimates? We ask the above questions in the context of a study wherein four years of TOPEX/Poseidon data, the first full year of Acoustic Thermometry of Ocean Climate (ATOC) data, and the Marshall et al. (1997a,b) GCM are used to estimate the large-scale (>1000 km), time-varying circulation and heat budget of the North Pacific (The ATOC Consortium 1998).

The problem of quantifying model errors using data is addressed with adaptive Kalman filters by the engineering community (for recent surveys see Isaksson 1988; Moghaddamjoo and Kirlin 1993). Some of these adaptive filter methods have been applied to meteorological (e.g., Dee et al. 1985) and, more recently, to oceanographic (Blanchet et al. 1997) data assimilation studies. No single method, however, is applicable to all situations, each method being a compromise between computational cost, convergence speed, and simplifying assumptions.

Following the suggestion of Blanchet et al. (1997), we started by testing the empirical algorithm of Myers and Tapley (1976) and a maximum-likelihood estimator inspired by Dee (1995). We concluded that neither method was well suited to our particular problem, the former because of slow convergence and the latter because of high computational cost (Chechelnitsky 1999). We adapt instead the offline approach proposed by Fu et al. (1993) (appendix A) and estimate system and measurement error covariance matrices by matching sample covariance matrices of GCM data residuals to their theoretical expectations. Our algorithm extends the approach of Fu et al. (1993) in the following ways: 1) we relax the assumption of independence between model simulation errors and the true state, 2) we use Green’s functions to obtain a solution, 3) we exploit time-lagged correlations in the data, and 4) we provide uncertainty bounds for the estimates.

The proposed covariance matching approach is similar to methods described by Shellenbarger (1966) and Belanger (1974) but we use GCM data residuals directly rather than the innovation sequence (i.e., residuals between data and successive Kalman filter estimates). Innovation sequence approaches have been preferred by the engineering community because they are more readily amenable to online applications and to the tracking of slowly varying statistics in small-dimensioned systems. When first guess error statistics are accurate, the innovations will be less correlated (in time) than GCM data residuals and, therefore, the available information will collapse into a small number of lag covariance matrices.

For the large-dimensioned systems of interest to oceanographic studies, however, it is preferable to work with GCM data residuals directly for the following reasons. First, sample covariances can be computed offline thus avoiding the computational burden associated with repeated integrations of the Kalman filter. Second, model and data error covariance matrices are linearly related to those of GCM data residuals. By way of contrast, the innovation sequence variants of the algorithm require linearization about some first guess error statistics and, therefore, convergence is not guaranteed (Moghaddamjoo and Kirlin 1993). Finally, GCM data residuals contain information about absolute error variances while the innovation sequence can be used only to determine the relative ratio of model and data error variances. Although relative error ratios suffice for time stepping the Kalman filter, absolute error variances are required for obtaining a posteriori error statistics.

The remainder of this article is organized as follows. A statistical description of the problem and of the covariance matching algorithm appears in section 2. In section 3, the circulation and measurement models are introduced, and covariance matching is tested in a series of twin experiments. In section 4, fields diagnosed from the GCM are compared to TOPEX/Poseidon and ATOC data and the residuals are used to estimate trends, annual cycles, and error covariance matrices. Conclusions are set forward in section 5.

## 2. Statistical modeling

Let **p**(*t*) represent GCM simulation errors,

$$\mathbf{p}(t) = \mathbf{x}_{\mathrm{GCM}}(t) - \mathbf{x}_{\mathrm{ocean}}(t), \tag{1}$$

that is, the difference at time *t* between **x**_{GCM}(*t*), the prognostic variables of a GCM, and **x**_{ocean}(*t*), the true state of the ocean sampled in a manner consistent with **x**_{GCM}. We model the dynamical evolution of the errors as

$$\mathbf{p}(t+1) = \mathbf{A}(t)\,\mathbf{p}(t) + \mathbf{B}(t)\,\mathbf{q}(t), \tag{2}$$

where **A**(*t*) is the state transition matrix and **q**(*t*) are system errors, that is, errors in initial and boundary conditions, indeterminate GCM parameters, and other model errors; **q**(*t*) is projected onto the GCM grid by matrix **B**(*t*). The difference between GCM predictions and oceanographic observations, **y**_{ocean}(*t*), can be expressed as a noisy linear (or linearized) combination of **p**(*t*),

$$\mathbf{y}(t) = \mathbf{H}(t)\,\mathbf{x}_{\mathrm{GCM}}(t) - \mathbf{y}_{\mathrm{ocean}}(t) \tag{3}$$

$$\phantom{\mathbf{y}(t)} = \mathbf{H}(t)\,\mathbf{p}(t) + \mathbf{r}(t), \tag{4}$$

where **H**(*t*) is the measurement matrix and **r**(*t*) are data errors. In addition to instrument noise, **r**(*t*) includes representation errors (Fukumori 1999; Cohn 1997), that is, real oceanic signal not represented by the GCM, for example, tides and scales smaller than those resolved by the model.

Vectors **q**(*t*) and **r**(*t*) are taken to be random variables and are described by their means, 〈**q**(*t*)〉 and 〈**r**(*t*)〉, and by their covariance matrices, **Q**(*t*) ≡ cov **q**(*t*) and **R**(*t*) ≡ cov **r**(*t*), where the covariance operator is defined in the usual way, cov **q** ≡ 〈(**q** − 〈**q**〉)(**q** − 〈**q**〉)′〉, 〈 · 〉 is the expectation operator, and prime indicates the transpose. This is a complete statistical description of the errors if the random vectors **q**(*t*) and **r**(*t*) have a multivariate normal distribution (e.g., Mardia et al. 1979), that is, if the errors can be modeled as resulting from a set of stationary Gaussian processes. If the errors are non-Gaussian, the mean and covariance remain useful, though incomplete, descriptors. Our objective is to use measurements **y**(*t*) to estimate 〈**q**(*t*)〉, 〈**r**(*t*)〉, **Q**(*t*), and **R**(*t*) (see Table 1 for a summary of the notation).
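As an illustration of the error model, the following Python sketch simulates (2) and (4) for a small system with Gaussian, white **q**(*t*) and **r**(*t*). The two-state matrices `A`, `H`, `Q`, and `R`, the spin-up length, and the function name `simulate_residuals` are assumptions made for illustration only; they are not the models used in this study, and **B** is taken to be the identity.

```python
import numpy as np

def simulate_residuals(A, H, Q, R, T, rng, spinup=200):
    """Simulate y(t) = H p(t) + r(t) with p(t+1) = A p(t) + q(t), taking B = I."""
    n, m = A.shape[0], H.shape[0]
    Lq = np.linalg.cholesky(Q)   # Lq @ white noise has covariance Q
    Lr = np.linalg.cholesky(R)   # Lr @ white noise has covariance R
    p = np.zeros(n)
    y = np.empty((T, m))
    for t in range(-spinup, T):  # negative t: spin-up toward steady state
        if t >= 0:
            y[t] = H @ p + Lr @ rng.standard_normal(m)
        p = A @ p + Lq @ rng.standard_normal(n)
    return y

# Hypothetical two-state system observed through one scalar measurement.
A = np.array([[0.9, 0.1], [0.0, 0.8]])
H = np.array([[1.0, 0.0]])
Q = np.eye(2)
R = np.array([[1.0]])
y = simulate_residuals(A, H, Q, R, T=500, rng=np.random.default_rng(0))
```

The array `y` plays the role of the GCM-data residuals from which the error statistics are to be estimated.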

### a. The basic algorithm

We first consider the case where **A**, **B**, **H**, **Q**, and **R** are time independent, where **A**, **B**, and **H** are known, where **A** is stable and **B** is the identity matrix, and where **q**(*t*) and **r**(*t*) have zero mean and are independent of **p**(*t*),

$$\langle\mathbf{q}(t)\rangle = \mathbf{0},\qquad \langle\mathbf{r}(t)\rangle = \mathbf{0},\qquad \langle\mathbf{p}(t)\,\mathbf{q}(t)'\rangle = \mathbf{0},\qquad \langle\mathbf{p}(t)\,\mathbf{r}(t)'\rangle = \mathbf{0}. \tag{5}$$

[The assumption of independence between **q**(*t*) and **p**(*t*) is less restrictive than that used by Fu et al. (1993), who assumed the model simulation error to be independent of the true state, 〈**p**(*t*)**x**_{ocean}(*t*)′〉 = **0**.] For stable **A**, 〈**y**(*t*)〉 = 〈**p**(*t*)〉 = **0**; (2) and (5) imply that 〈**q**(*t*_{1})**q**(*t*_{2})′〉 = **0** for *t*_{1} ≠ *t*_{2}. Finally we parameterize **Q** and **R** as

$$\mathbf{Q} = \sum_{k=1}^{K} \alpha_k\,\mathbf{Q}_k, \tag{6}$$

$$\mathbf{R} = \sum_{k=K+1}^{K+L} \alpha_k\,\mathbf{R}_k. \tag{7}$$

Matrices **Q**_{k} and **R**_{k} are problem dependent. Ideally, they should approximate the leading spatial patterns, or eigenvectors, of the errors. In practice they are chosen based on physical intuition and using Occam’s razor, that is, a search for the simplest, physically plausible, and statistically consistent error model.

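The steady-state covariance of the GCM simulation errors, **P** ≡ cov **p**, satisfies the Lyapunov equation

$$\mathbf{P} = \mathbf{A}\mathbf{P}\mathbf{A}' + \mathbf{Q}, \tag{8}$$

which, for stable **A**, determines **P**. The covariance of the GCM-data residuals is

$$\mathbf{Y} \equiv \operatorname{cov}\mathbf{y} = \mathbf{H}\mathbf{P}\mathbf{H}' + \mathbf{R}. \tag{9}$$

It follows that **Y** is linearly related to **Q** and **R** and hence to the parameters *α*_{k} in (6), (7). An elegant way to solve this system of equations is through the use of Green’s functions, **G**_{Y,k}, here defined as the response of measurement covariance matrix, **Y**, to a unit perturbation of **Q**_{k} or **R**_{k}, that is,

$$\mathbf{G}_{Y,k} = \begin{cases}\mathbf{H}\mathbf{P}_k\mathbf{H}', & k = 1, \ldots, K,\\ \mathbf{R}_k, & k = K+1, \ldots, K+L,\end{cases} \tag{10}$$

where **P**_{k} is related to **Q**_{k} by the Lyapunov equation (8). Rewriting (9) as

$$\mathbf{Y}(\vdots) = \sum_{k=1}^{K+L} \mathbf{G}_{Y,k}(\vdots)\,\alpha_k, \tag{11}$$

where the column operator (⋮) stacks the elements of a matrix into a column vector, makes it possible to solve for *α*_{k} using any of several discrete linear inverse methods (e.g., Menke 1989; Wunsch 1996). To reduce computational cost, the column operator (⋮) in (11) can also represent an appropriate subsampling of matrices **Y** and **G**_{Y,k}. The properties of **A** and **H** determine which of the parameters *α*_{k} in (11) can be determined.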
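The basic algorithm can be sketched in a few lines of Python. The example below is a minimal illustration under the assumptions of this section (stable, time-independent **A**, **B** taken to be the identity, exact **Y**); the function names, the two-state system, and the basis matrices are hypothetical and chosen only so that the parameters are recoverable.

```python
import numpy as np

def lyapunov(A, Q):
    """Solve P = A P A' + Q for stable A, using vec(A P A') = (A kron A) vec(P)."""
    n = A.shape[0]
    vecP = np.linalg.solve(np.eye(n * n) - np.kron(A, A), Q.reshape(-1))
    return vecP.reshape(n, n)

def greens_functions(A, H, Q_basis, R_basis):
    """Kernel whose columns are G_{Y,k} stacked as vectors, one per alpha_k."""
    cols = [(H @ lyapunov(A, Qk) @ H.T).reshape(-1) for Qk in Q_basis]
    cols += [Rk.reshape(-1) for Rk in R_basis]
    return np.column_stack(cols)

# Hypothetical system: both states are observed, Q is diagonal, R is white.
A = np.array([[0.9, 0.1], [0.0, 0.8]])
H = np.eye(2)
Q_basis = [np.diag([1.0, 0.0]), np.diag([0.0, 1.0])]
R_basis = [np.eye(2)]
alpha_true = np.array([1.0, 2.0, 0.5])

# Exact data covariance Y = H P H' + R for the "true" parameters.
Q = alpha_true[0] * Q_basis[0] + alpha_true[1] * Q_basis[1]
R = alpha_true[2] * R_basis[0]
Y = H @ lyapunov(A, Q) @ H.T + R

# Covariance matching: solve the stacked linear system for alpha.
G = greens_functions(A, H, Q_basis, R_basis)
alpha_hat, *_ = np.linalg.lstsq(G, Y.reshape(-1), rcond=None)
```

With an exact **Y** and a full-rank kernel, ordinary least squares returns the parameters used to build **Y**; with sample covariances the solution is only approximate and its uncertainty must be assessed.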

This completes a basic description of the estimation algorithm. We next consider a series of algorithmic refinements and the effects of relaxing some of the simplifying assumptions. One issue is whether **R** and **Q**, and hence all the *α*_{k} in (11), can be resolved independently. In section 2b and appendix B we demonstrate that, under a very general set of conditions, **R** and **Q** can be separated. A second issue is that **Y** must be estimated from a finite record of **y**(*t*): the consequences of sampling uncertainty are discussed in section 2c and appendix C. The algorithm is illustrated with a small numerical example in section 2d. Systematic and time-correlated errors are considered in sections 2e and 2f, respectively. Section 2g deals with time-dependent models. Finally, section 2h discusses statistical consistency tests.

### b. Using lag-difference covariance matrices

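Because **Y** depends on **Q** and **R** only through the sum in (9), it may not be possible to separate **Q** from **R** using **Y** alone. Additional information is available from the covariance of lagged differences of the residuals. For steady-state **P**, the covariance matrix of the lag-*s* difference is

$$\mathbf{D}_s \equiv \operatorname{cov}[\mathbf{y}(t+s) - \mathbf{y}(t)] = \mathbf{H}\Big[(\mathbf{A}^s - \mathbf{I})\,\mathbf{P}\,(\mathbf{A}^s - \mathbf{I})' + \sum_{i=0}^{s-1}\mathbf{A}^i\mathbf{Q}\mathbf{A}^{i\prime}\Big]\mathbf{H}' + 2\mathbf{R}, \tag{12}$$

where it is assumed that 〈**r**(*t*_{1})**r**(*t*_{2})′〉 = **0** for *t*_{1} ≠ *t*_{2}. As before, **Y** and the lag-*s*-difference covariance matrices can be combined in an equation of type **d** = **G*****α***; that is,

$$\mathbf{d} \equiv \begin{bmatrix}\mathbf{Y}(\vdots)\\ \mathbf{D}_1(\vdots)\\ \mathbf{D}_2(\vdots)\\ \vdots\end{bmatrix} = \sum_{k=1}^{K+L}\begin{bmatrix}\mathbf{G}_{Y,k}(\vdots)\\ \mathbf{G}_{D_1,k}(\vdots)\\ \mathbf{G}_{D_2,k}(\vdots)\\ \vdots\end{bmatrix}\alpha_k \equiv \mathbf{G}\boldsymbol{\alpha}. \tag{13}$$

Here, ***α*** is the vector of parameters *α*_{k}, and **G**_{D_s,k} is the Green’s function that describes the response of **D**_{s} to a unit perturbation of *α*_{k}. Since **D**_{r} − **D**_{s} is independent of **R** for *r* ≠ *s*, it is possible to resolve a particular **Q**_{k} independently of **R**, provided that **Q**_{k} is observable in the sense that **HA**^{s}**Q**_{k}**A**^{s}′**H**′ ≠ **0** for some *s* ≥ 1.

An alternative to the lag-difference covariance matrices is the lag covariance matrix,

$$\mathbf{Y}_s \equiv \langle\mathbf{y}(t+s)\,\mathbf{y}(t)'\rangle = \mathbf{H}\mathbf{A}^s\mathbf{P}\mathbf{H}', \qquad s \ge 1. \tag{14}$$

The question of when to use **Y**_{s} rather than **D**_{s} is addressed in the next section.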
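The property that differences of lag-difference covariance matrices do not depend on **R** can be checked numerically. The sketch below assumes white **q**(*t*) and **r**(*t*), steady-state **P**, and **B** equal to the identity, in which case the covariance of **y**(*t* + *s*) − **y**(*t*) can be written as **H**(2**P** − **A**^{s}**P** − **PA**^{s}′)**H**′ + 2**R**. The matrices and function names are hypothetical.

```python
import numpy as np

def lyapunov(A, Q):
    """Steady-state P solving P = A P A' + Q (stable A, B = I)."""
    n = A.shape[0]
    vecP = np.linalg.solve(np.eye(n * n) - np.kron(A, A), Q.reshape(-1))
    return vecP.reshape(n, n)

def lag_difference_cov(A, H, Q, R, s):
    """D_s = cov[y(t+s) - y(t)] = H (2P - A^s P - P A^s') H' + 2R."""
    P = lyapunov(A, Q)
    As = np.linalg.matrix_power(A, s)
    return H @ (2 * P - As @ P - P @ As.T) @ H.T + 2 * R

# Hypothetical two-state system with a scalar measurement.
A = np.array([[0.9, 0.1], [0.0, 0.8]])
H = np.array([[1.0, 0.0]])
Q = np.diag([1.0, 0.5])
R_small = np.array([[0.1]])
R_large = np.array([[10.0]])

# D_2 - D_1 evaluated for two very different measurement error covariances.
diff_small = lag_difference_cov(A, H, Q, R_small, 2) - lag_difference_cov(A, H, Q, R_small, 1)
diff_large = lag_difference_cov(A, H, Q, R_large, 2) - lag_difference_cov(A, H, Q, R_large, 1)
```

The two differences coincide because the 2**R** term cancels, which is what allows a component of **Q** to be estimated without knowledge of **R**.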

### c. Finite number of measurements

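So far it has been assumed that covariance matrices **Y**_{s} and **D**_{s} are exact. In practice, a finite number of measurements is available and we work with sample estimates **Ỹ**_{s} and **D̃**_{s}: the sample covariance of **y**(*t*) is

$$\tilde{\mathbf{Y}} = \frac{1}{T}\sum_{t=1}^{T}\,[\mathbf{y}(t) - \bar{\mathbf{y}}]\,[\mathbf{y}(t) - \bar{\mathbf{y}}]', \tag{15}$$

where *T* is the total number of time steps and

$$\bar{\mathbf{y}} = \frac{1}{T}\sum_{t=1}^{T}\mathbf{y}(t) \tag{16}$$

is the sample mean.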

The first algorithmic modification required concerns the computation of Green’s functions. If *T* spans less than about 20 *e*-folding periods for each observable normal mode of linear system **p**(*t* + 1) = **A****p**(*t*), the steady-state limit given by the solution to the Lyapunov equation (8) will be inaccurate. A Monte Carlo approach can instead be used to estimate **P**_{k} by driving linear model (2) with random system noise generated using covariance **Q**_{k}; **P**_{k} is estimated by averaging over a large number of independent simulations, each with finite time span *T.*
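A minimal sketch of such a Monte Carlo estimate is given below. It assumes Gaussian system noise, **B** equal to the identity, simulations started from **p**(0) = **0**, and no removal of the sample mean; these choices, the function names, and the two-state system are illustrative assumptions rather than the configuration used in this study.

```python
import numpy as np

def lyapunov(A, Q):
    """Steady-state P solving P = A P A' + Q (stable A, B = I)."""
    n = A.shape[0]
    vecP = np.linalg.solve(np.eye(n * n) - np.kron(A, A), Q.reshape(-1))
    return vecP.reshape(n, n)

def monte_carlo_P(A, Qk, T, n_runs, rng):
    """Mean of p(t) p(t)' over t = 1..T and over n_runs runs started at p(0) = 0."""
    n = A.shape[0]
    # Small jitter so that a rank-deficient Qk still has a Cholesky factor.
    L = np.linalg.cholesky(Qk + 1e-12 * np.eye(n))
    acc = np.zeros((n, n))
    for _ in range(n_runs):
        p = np.zeros(n)
        for _ in range(T):
            p = A @ p + L @ rng.standard_normal(n)
            acc += np.outer(p, p)
    return acc / (n_runs * T)

# Hypothetical system whose e-folding periods are comparable to the record length.
A = np.array([[0.9, 0.1], [0.0, 0.8]])
Qk = np.eye(2)
P_steady = lyapunov(A, Qk)
P_short = monte_carlo_P(A, Qk, T=12, n_runs=2000, rng=np.random.default_rng(0))
```

For a short record the ensemble average `P_short` is smaller than the steady-state solution `P_steady`, which is why the Lyapunov limit would give inaccurate Green’s functions in that regime.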

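The second modification is to account for sampling uncertainty by adding an error term, ***ϵ***, to (13):

$$\mathbf{d} = \mathbf{G}\boldsymbol{\alpha} + \boldsymbol{\epsilon}. \tag{17}$$

Parameter vector ***α*** in (17) can be determined by minimizing the weighted least-squares cost function,

$$J(\boldsymbol{\alpha}) = \boldsymbol{\epsilon}'\,\mathbf{R}_{\epsilon}^{-1}\,\boldsymbol{\epsilon} + (\boldsymbol{\alpha} - \boldsymbol{\alpha}_0)'\,\mathbf{R}_{\alpha}^{-1}\,(\boldsymbol{\alpha} - \boldsymbol{\alpha}_0), \tag{18}$$

where ***α***_{0}, **R**_{α}, and **R**_{ϵ} represent prior knowledge for 〈***α***〉, cov ***α***, and cov ***ϵ***, respectively.

The uncertainty variance of a sample covariance is *O*[*σ*^{2}_{1}*σ*^{2}_{2}(1 + *ρ*^{2})/*p*], where *σ*^{2}_{1} and *σ*^{2}_{2} are the variances of the two variables whose covariance is being estimated, *ρ* is the correlation coefficient, and *p* is the number of degrees of freedom, that is, the number of independent measurements (appendix C). It follows that for a given sample size, the smaller the variances, the more accurately sample covariances can be determined.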

For example, in the twin experiments of section 3c, statistically significant error estimates are possible using lag-difference covariance matrices, **D**_{s}, but not with lag covariance matrices, **Y**_{s}. In those experiments the errors propagate slowly relative to the duration of a time step, that is, the state transition matrix is approximately identity, so that, for small values of lag *s,* (12) simplifies to **D**_{s} ≈ *s***HQH**′ + 2**R**. The sample uncertainty of **D̃**_{s} therefore scales with the diagonal elements of (*s***HQH**′ + 2**R**), while that of **Ỹ**_{s} scales with the diagonal elements of (**HPH**′ + **R**). Smaller uncertainties are therefore expected for **D̃**_{s} when **A** ≈ **I** and **R** is smaller than **HPH**′.
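The minimization of a cost function of the form ***ϵ***′**R**_{ϵ}^{−1}***ϵ*** + (***α*** − ***α***_{0})′**R**_{α}^{−1}(***α*** − ***α***_{0}), with ***ϵ*** = **d** − **G*****α***, has a standard closed-form solution, sketched below in Python. The function name `solve_with_prior` and the small rank-deficient kernel are hypothetical; the example only illustrates how prior knowledge about one parameter can remove a null space.

```python
import numpy as np

def solve_with_prior(G, d, R_eps, alpha0, R_alpha):
    """Minimize (d - G a)' R_eps^-1 (d - G a) + (a - a0)' R_alpha^-1 (a - a0)."""
    W = np.linalg.inv(R_eps)
    prior_precision = np.linalg.inv(R_alpha)
    # Covariance of the estimate (solution uncertainty matrix).
    P_alpha = np.linalg.inv(G.T @ W @ G + prior_precision)
    alpha = alpha0 + P_alpha @ G.T @ W @ (d - G @ alpha0)
    return alpha, P_alpha

# Hypothetical kernel: two equations, three unknowns, hence a one-dimensional null space.
G = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])
alpha_true = np.array([1.0, 2.0, 0.0])
d = G @ alpha_true

# Prior knowledge: the third parameter is (almost exactly) zero; the others are unconstrained.
alpha0 = np.zeros(3)
R_alpha = np.diag([1e6, 1e6, 1e-8])
R_eps = 1e-4 * np.eye(2)
alpha_hat, P_alpha = solve_with_prior(G, d, R_eps, alpha0, R_alpha)
```

The diagonal of `P_alpha` gives the uncertainty variance of each estimated parameter, which is how error bars can be attached to the covariance matching estimates.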

### d. Numerical example

The two diagonal elements and the off-diagonal element of the system error covariance matrix correspond to *α*_{1}, *α*_{2}, and *α*_{3}, respectively, in (20). Computing the Green’s functions associated with **Y** and **D**_{s} results in a linear system of equations, (22), for the parameters *α*_{k}. The kernel matrix in (22) has rank 3 (singular values [24.6 3.7 0.9 0.0]′), which indicates that only three independent combinations of parameters *α*_{k} can be resolved. It turns out that the addition of **D**_{3}, or of higher lag covariance matrices, does not contribute new information. Rules regarding the total number of resolvable parameters are set forward in appendix B.
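The rank deficiency can be reproduced with a small Python sketch. The two-state system below is hypothetical (it is not the example of this section); it has a scalar measurement, three parameters for a symmetric 2 × 2 **Q**, and one parameter for **R**, and it assumes white errors, steady-state **P**, and **B** equal to the identity.

```python
import numpy as np

def lyapunov(A, Q):
    """Steady-state P solving P = A P A' + Q (stable A, B = I)."""
    n = A.shape[0]
    vecP = np.linalg.solve(np.eye(n * n) - np.kron(A, A), Q.reshape(-1))
    return vecP.reshape(n, n)

def data_covariances(A, H, Q, R, lags):
    """Stack Y and D_s (for the requested lags) into one data vector."""
    P = lyapunov(A, Q)
    out = [(H @ P @ H.T + R).reshape(-1)]
    for s in lags:
        As = np.linalg.matrix_power(A, s)
        out.append((H @ (2 * P - As @ P - P @ As.T) @ H.T + 2 * R).reshape(-1))
    return np.concatenate(out)

def kernel(A, H, basis, lags):
    """One column per parameter: response of the data vector to a unit alpha_k."""
    return np.column_stack([data_covariances(A, H, Qk, Rk, lags) for Qk, Rk in basis])

A = np.array([[0.9, 0.1], [0.0, 0.8]])
H = np.array([[1.0, 0.0]])
Z2, Z1 = np.zeros((2, 2)), np.zeros((1, 1))
basis = [
    (np.diag([1.0, 0.0]), Z1),                  # alpha_1: first diagonal element of Q
    (np.diag([0.0, 1.0]), Z1),                  # alpha_2: second diagonal element of Q
    (np.array([[0.0, 1.0], [1.0, 0.0]]), Z1),   # alpha_3: off-diagonal element of Q
    (Z2, np.ones((1, 1))),                      # alpha_4: measurement error variance
]
rank_lags_12 = np.linalg.matrix_rank(kernel(A, H, basis, [1, 2]), tol=1e-8)
rank_lags_123 = np.linalg.matrix_rank(kernel(A, H, basis, [1, 2, 3]), tol=1e-8)
```

In this sketch the kernel has rank 3 whether or not the lag-3 difference is included: for a two-state system the lag covariances of a scalar measurement at lags greater than 2 are fixed by those at lags 1 and 2, so higher lags add no independent equations.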

Data are simulated for ***α*** = [1 1 0 1]′ and *T* = 500. We seek to estimate ***α*** using the simulated data and the recipe of section 2b. From inverse theory, only projections onto singular vectors of **G** associated with nonzero singular values can be determined, so that the solution takes the form (23), where *λ* is an arbitrary constant multiplying null space contributions; *λ* cannot be determined without additional information. To set *λ* we assume that there is a priori knowledge that the system error covariance matrix is diagonal, that is, *α*_{3} = 0. This assumption requires that *λ* = −1.4 and hence fixes the estimate ***α̃***.

The uncertainty of the estimate is described by covariance matrix **P**_{α} ≡ cov ***α̃***; **P**_{α} is a function of a priori covariance matrices **R**_{ϵ} and **R**_{α} in (18). Here **R**_{α} is the a priori covariance of parameter vector ***α*** and the only a priori knowledge assumed is that *α*_{3} = 0. Matrix **R**_{ϵ} describes the sample uncertainty of **Ỹ** and **D̃**_{s}. An estimate of **R**_{ϵ}, consistent with the available data, can be obtained using the expressions derived in appendix C. Within the resulting solution uncertainty, the estimate ***α̃*** is consistent with the parameter vector ***α*** = [1 1 0 1]′ used to generate the simulated data. (Unless otherwise specified, uncertainty is reported using one standard deviation.)

From a set of numerical experiments, like that above, we conclude that the covariance matching method gives consistent and statistically significant estimates, provided the total number of available measurements is much greater than the number of parameters *α*_{k}, that is, *MT* ≫ *K* + *L,* where *M* is the length of the measurement vector, *T* is the number of time steps, and *K* + *L* is the total number of parameters in (6), (7). The requirement for a large number of observations per parameter is a direct consequence of the large uncertainty of sample covariance matrices.

What happens if, instead of assuming *α*_{3} = 0, which is the condition used to generate the simulated data, it is assumed that *α*_{1} = 0? This assumption implies that *λ* = 0.09 in (23) and leads to a second solution, ***α̃***, that differs from the first. The estimates are therefore a function of *λ* and of other a priori assumptions. For this particular example, a second independent measurement at every time step would permit **Q** to be fully resolved.

(MATLAB script files and functions that implement this example, and which can be customized for different applications, are available via anonymous FTP to gulf.mit.edu, IP Address 18.83.0.149, from directory pub/dimitri/GCMerror.)

### e. Systematic errors

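Systematic errors, or biases, refer to the quantities 〈**r**(*t*)〉 and 〈**q**(*t*)〉. These errors are important because, even if very small, they can accumulate over long numerical integrations and degrade the predictive skill of a model. A first scenario is that of a stable, time-independent system, as before, but with 〈**r**〉 ≠ **0**, 〈**q**〉 ≠ **0**. Notice that the estimators that have been developed for **R**, **Q**, and **P** are based on covariances about the mean, so that they are not affected by constant biases. In steady state, (2) and (4) imply that

$$\langle\mathbf{y}\rangle = \mathbf{H}\,(\mathbf{I} - \mathbf{A})^{-1}\langle\mathbf{q}\rangle + \langle\mathbf{r}\rangle, \tag{25}$$

that is, 〈**y**〉 is linearly related to the biases, 〈**q**〉 and 〈**r**〉. The sample mean, **ȳ**, is an estimate of 〈**y**〉 with an uncertainty (Anderson 1971, section 8.2) which, using (2), (4), and (8), can be expressed in terms of **Y** and **P**. Without additional information, however, (25) does not distinguish between model bias 〈**q**〉 and data bias 〈**r**〉.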
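The steady-state relation between the mean residual and the biases, 〈**y**〉 = **H**(**I** − **A**)^{−1}〈**q**〉 + 〈**r**〉, follows from taking expectations of the error model with **B** equal to the identity. The Python sketch below evaluates it for a hypothetical two-state system and illustrates why the mean residual alone cannot attribute a bias to the model or to the data; all numerical values and names are assumptions made for illustration.

```python
import numpy as np

def mean_residual(A, H, q_mean, r_mean):
    """Steady-state <y> = H (I - A)^{-1} <q> + <r> for stable A and B = I."""
    n = A.shape[0]
    return H @ np.linalg.solve(np.eye(n) - A, q_mean) + r_mean

# Hypothetical system with a small system bias and a small data bias.
A = np.array([[0.9, 0.1], [0.0, 0.8]])
H = np.array([[1.0, 0.0]])
q_mean = np.array([0.02, -0.01])
r_mean = np.array([0.05])
y_mean = mean_residual(A, H, q_mean, r_mean)

# A pure data bias of suitable size reproduces exactly the same mean residual.
y_mean_data_only = mean_residual(A, H, np.zeros(2), y_mean)
```

Because the two bias configurations produce identical mean residuals, separating them requires information beyond the sample mean.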

A second scenario, that of a gradual change, or trend, in the system error, is discussed in section 2g, which deals with time-dependent models.

### f. Time-correlated errors

So far we have assumed that measurement and system errors are uncorrelated in time, that is, 〈**r**(*t*_{1})**r**(*t*_{2})′〉 = **0**, 〈**q**(*t*_{1})**q**(*t*_{2})′〉 = **0** for *t*_{1} ≠ *t*_{2}. The former condition is required to evaluate lag-difference covariance matrices (12), but it is not required to evaluate the data covariance matrix **Y**.

A common example of time-correlated error is the annual cycle. For a stable, time-independent system, the annual cycle in **y**(*t*) is linearly related to the same frequency component in **q**(*t*) and **r**(*t*),

$$\mathbf{y}_a = \mathbf{H}\,[\exp(i\omega/12)\,\mathbf{I} - \mathbf{A}]^{-1}\,\mathbf{q}_a + \mathbf{r}_a, \tag{29}$$

where the subscript *a* indicates the complex annual cycle amplitude, that is, **y**_{a} = **a** exp(*iϕ*), **a** is the amplitude, *ϕ* is the phase, *ω* = 2*π*/year, and we have assumed a time step of 1 month in (29). It is straightforward to remove correlated signals at the annual period from model-data residuals (e.g., section 4). But without additional information, it is not possible to partition the annual cycle error between system and data errors.

### g. Time-dependent models

We consider two types of time dependence. The first type is “known” time dependencies in the linear models, **A**, **B**, and **H**. Such dependencies can be taken into account when computing the Green’s functions. An example of a time-dependent measurement matrix, **H**(*t*), is the treatment of acoustic time series of differing lengths in section 4.

The second type is unknown time dependencies in the error statistics, that is, time-dependent parameters *α*_{k}(*t*) in (6), (7). In principle, this situation can be addressed through piecewise estimates of *α*_{k}(*t*) for periods that are short relative to the timescales of *α*_{k}. A better approach is to parameterize the time dependency and to estimate these parameters using all the available data. An example is the detection of a trend, 〈∂**q**/∂*t*〉 ≠ **0**, in the system error. From (25), and assuming 〈∂**r**/∂*t*〉 = **0**, the first difference of **q**(*t*) is related to the first difference of **y**(*t*) by

$$\langle\mathbf{y}(t+1) - \mathbf{y}(t)\rangle = \mathbf{H}\,(\mathbf{I} - \mathbf{A})^{-1}\langle\mathbf{q}(t+1) - \mathbf{q}(t)\rangle. \tag{30}$$

The mean first difference 〈**y**(*t* + 1) − **y**(*t*)〉 can be approximated using least-squares (or other suitable estimators) and in turn used to estimate the quantity 〈∂**q**/∂*t*〉 (e.g., section 4).

### h. Tests of consistency

The statistical consistency of the estimates can be tested by comparing the residual, ***ϵ***, in (17), (18), to its expected a posteriori covariance.

The description of the algorithm is now complete. In the remainder of this article we illustrate the application of this algorithm, first with twin experiments (section 3) and then with real data (section 4), by estimating the large-scale (>1000 km) baroclinic errors in a particular implementation and linearization of a GCM.

## 3. Twin experiments

### a. Circulation and measurement models

The circulation and measurement models, described below, are common to both the twin and the real experiments. The GCM is that of Marshall et al. (1997a,b) integrated in a global configuration with realistic topography and driven by surface wind and buoyancy fields obtained from twice-daily National Centers for Environmental Prediction (NCEP) meteorological analyses. Horizontal grid spacing is 1° and there are 20 vertical levels.

A linear, time-independent model for GCM errors in the North Pacific is constructed by systematically perturbing the GCM with large-scale temperature anomalies (Menemenlis and Wunsch 1997). The linear model is defined in a region bounded by 5°–60°N and 132°–252°E (Fig. 1). It operates on a reduced state vector that has 8° sampling in the horizontal, four vertical temperature empirical orthogonal functions (EOFs, see Fig. 2), and a time step of 1 month. In this representation, sea surface pressure errors in the GCM caused by barotropic or salinity effects, or by scales not resolved by the reduced state vector, become part of the measurement error **r**(*t*), and are described by covariance matrix **R**. The state dimension is reduced from order 10^{6} in the GCM to 512 in the linear model. Away from coastal regions, this reduced-state linear model describes the large-scale temperature perturbation response of the GCM with considerable skill for periods up to two years. Similar types of state reduction and linearization are commonly used for propagating the error covariance matrix in data assimilation studies (Fukumori and Malanotte-Rizzoli 1995; Cane et al. 1996).

The acoustic tomography data from ATOC are first inverted to produce equivalent range-averaged oceanographic temperature perturbations along each section (The ATOC Consortium 1998). Data–GCM discrepancy is then projected onto the four vertical EOFs and the monthly sampling of the reduced state vector described above. Therefore, the measurement matrix for acoustic tomography data consists of a range-average for each vertical EOF and for each section. Acoustic data from five sections (Fig. 1) are used for a total of 20 data points (projections onto the four vertical EOFs for each section), once per month.

The measurement matrix **H**

### b. Generation of simulated data

Before applying covariance matching to real data, we test the algorithm in a series of twin experiments using simulated data with known statistical properties. We parameterize **Q** using four parameters, *α*_{1}, . . . , *α*_{4}, each representing system error variance associated with one of the four vertical EOFs, that is, we assume that the system error is horizontally homogeneous and white. The measurement error covariance, **R**, is parameterized using *α*_{5} and *α*_{6}, corresponding to the measurement error variance associated with acoustic tomography and altimeter data, respectively. The test data are generated using the reduced state linear model and the acoustic and altimetric measurement models, and by driving Eqs. (2), (4) with white system and measurement noise characterized by parameters *α*_{1}, . . . , *α*_{6}, as defined above.

### c. Tests with pseudoacoustic data

The first set of twin experiments is carried out with noise-free, **R** = **0**, simulated acoustic tomography data. It is both impractical, because of computational cost, and unnecessary, because of information overlap, to match all available lag-difference data covariance matrices as in (13). An appropriate subset of data covariance matrices must be selected by trial and error and by reference to the guidelines of section 2c, that is, a preference for sample covariance matrices with small matrix norms and hence smaller relative uncertainties. The sample uncertainties of **Ỹ**, **D̃**_{1}, and **D̃**_{2} are displayed in Fig. 3 as a function of number of years of simulated data. Note that **D̃**_{1} and **D̃**_{2} have smaller relative uncertainties than **Ỹ**, which suggests that matching **D̃**_{1} or **D̃**_{2} will produce better estimates of **Q** and **R** than matching **Ỹ**.

Figure 4 displays estimates of parameters *α*_{1}, . . . , *α*_{4}, based on matching **D̃**_{1}, as a function of number of years of simulated data. Error bars are obtained as in section 2d. Contrary to the empirical algorithm of Myers and Tapley (1976), which failed to converge for this twin experiment (see Chechelnitsky 1999), the present algorithm provides useful estimates of system error even with a single year of data.

The results of a series of tests based on 14 months of simulated data are summarized in Table 2 (at the time of this study 14 months of ATOC data were available). Each particular estimate is not expected to match the true variance of **Q** exactly. Matching the lag-1 difference covariance matrix, **D̃**_{1}, provides the most accurate estimates, with mean standard uncertainty of 18% as compared to 38% for **Ỹ**. Matching only a subset of the elements of **D̃**_{1} leads to a standard uncertainty of 23%, similar to that obtained by using the full lag-2 difference covariance matrix, **D̃**_{2}.

Next we report on results from a series of experiments with noisy measurements, **R** = **I**. In this case the mean standard uncertainty of the estimates of **Q** obtained by matching **D̃**_{1} is 52%. The uncertainty can be reduced by using several lag-*s* difference covariance matrices simultaneously: using **D̃**_{1} and **D̃**_{2} simultaneously reduces the estimation uncertainty to 38%.

In summary, the estimation uncertainty decreases with increasing years of available data and with increasing ratio |**Q**|/|**R**|.

### d. Tests with pseudoaltimeter data

A third set of twin experiments is conducted using simulated altimeter data. In theory, it is possible to separate baroclinic modes in the altimeter data by making use of their different temporal evolutions at the sea surface (e.g., Holland and Malanotte-Rizzoli 1989). The results presented below, however, suggest that even with perfect measurements, **R** = **0**, and with perfect knowledge of the dynamical and measurement models, **A** and **H**, the vertical structure of the GCM errors is difficult to resolve from altimeter data alone.

At the writing of this manuscript, 48 months of high quality TOPEX/Poseidon altimeter data were available. We therefore performed a further series of tests using 48 months of simulated altimeter data (see Table 4). Because of the large dimensions of the sample covariance matrices, only their diagonal elements have been matched. The first six rows of Table 4 correspond to estimates from matching **Y** and **D**_{1} through **D**_{5}. The last row summarizes results from matching all six data covariance matrices simultaneously. The standard errors for this last case range from 35% to 235%. The situation is worse when measurement errors are included. We conclude that covariance matrices for the vertical GCM error structure cannot, in the present setup, be quantified from TOPEX/Poseidon data alone.

## 4. Experimental results

### a. TOPEX/Poseidon data

The covariance matching approach is next applied to TOPEX/Poseidon altimeter data and to a particular integration of the Marshall et al. (1997a,b) GCM. Figure 6 compares measured sea level anomaly variance to that predicted by the GCM. Both the altimetric data and the GCM have been processed in a way consistent with the reduced state described in section 3a, that is, periods shorter than 2 months and length scales smaller than 16° have been low-pass filtered. In addition, annual cycles and trends have been removed at every location; these will be studied separately. Altimetric data and GCM output exhibit the same general patterns of enhanced variability near the Kuroshio, the Hawaiian Ridge, and in a band north of the equator. The GCM variability, however, is on average 30% less than that measured by the altimeter, and in some regions, notably in the eastern tropical Pacific, the altimetric and GCM time series are uncorrelated. The variance of the GCM–TOPEX/Poseidon residual (Fig. 6c) is 60% that of TOPEX/Poseidon, indicating that the GCM explains 40% of the observed low-frequency/wavenumber variability. Our objective is to determine which fraction of the GCM–TOPEX/Poseidon residual can be attributed to system error, **HPH**′, and which to measurement error, **R**.

The twin experiments conducted earlier indicate that it is not possible to determine covariance matrices for the vertical GCM error structure from four years of altimetric data. We therefore consider a number of statistical models for covariance matrices **Q** and **R**.

To obtain statistically significant error estimates, it is necessary to reduce the number of parameters to be estimated. Therefore the second model considered is one of homogeneous and spatially uncorrelated system and measurement error, **Q** = *α*_{1}**I** and **R** = *α*_{2}**I**. Matching **Ỹ**, **D̃**_{1}, **D̃**_{2}, and **D̃**_{3} yields estimates *α̃*_{1} and *α̃*_{2}. The uncertainty of these estimates is evaluated using Monte Carlo experiments with simulated **q**(*t*) and **r**(*t*) with variance 0.25 and 1.00, respectively. Assuming the statistical model chosen to be the correct one, the standard deviation of the Monte Carlo estimates represents a lower bound for the standard uncertainty of the real estimates. These estimates imply that on average 70% of the GCM–TOPEX/Poseidon residual variance can be explained by system error; that is, the ratio of the diagonal elements of **HPH**′ to those of **Y** is 0.7 on average.

The homogeneous model, however, does not account for some of the regions of enhanced variability in Fig. 6c. A third plausible model is **Q** = *α*_{1}**Q**_{1} and **R** = *α*_{2}**R**_{1}, where **Q**_{1} and **R**_{1} are diagonal matrices with a spatially varying structure proportional to that of the GCM–TOPEX/Poseidon residual variance. Matching this model to the data yields estimates *α̃*_{1} and *α̃*_{2}.

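An alternative estimate of the system error follows Fu et al. (1993) and assumes that the GCM simulation error is independent of the true state, 〈**x**_{ocean}**p**′〉 = **0** (see appendix A). When this assumption holds, **HPH**′ can be estimated directly from the covariances of the GCM output, of the data, and of the GCM-data residual. On average, this third model predicts that 15% of the GCM–TOPEX/Poseidon residual variance is caused by system error (Fig. 7c). Although this relatively low value, compared to the earlier estimate of 60%, could point to a number of problems with the statistical model, the presumption is that, to first order, condition 〈**x**_{ocean}**p**′〉 = **0** is violated and hence that the 15% estimate is wrong.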

### b. ATOC data

We now turn our attention to the acoustic data. Figure 8 compares the GCM–ATOC residual, converted to an equivalent sea level anomaly, to the range-averaged GCM–TOPEX/Poseidon residual along each acoustic path, after removing trends and annual cycles. The acoustic data is used to estimate the vertical structure of the errors and to test noise model #3 from above, that is, **Q** ∝ **Q**_{1}. We model **Q** using four parameters, *α*_{1}, . . . , *α*_{4}, each representing system error variance associated with one of the four vertical EOFs, and with a spatial structure proportional to that of the GCM–TOPEX/Poseidon residual variance (Fig. 6c). Measurement and representation error for the acoustic data are modeled as **R** = *α*_{5}**I**.

The cost function (18) is minimized assuming a priori estimates of 0.047 ± 0.047 for *α*_{1}, . . . , *α*_{4}, that is, the estimate obtained using TOPEX/Poseidon data but allowing for a larger uncertainty in order to test the vertical equipartition hypothesis. The a priori estimate for *α*_{5} is taken to be 0.28 ± 0.28, that is, the variance of the acoustic data with a corresponding uncertainty. A conservative estimate for the prior sample covariance uncertainty is **R**_{ϵ} = 0.28**I**. The resulting estimates are *α*_{1} = 0.15 ± 0.04, *α*_{2} = 0.00 ± 0.04, *α*_{3} = 0.11 ± 0.04, and *α*_{4} = 0.00 ± 0.04. These estimates differ from the altimetric estimate of 0.047 ± 0.006, indicating that the vertical equipartition hypothesis is not valid.

A solution that is simultaneously consistent with both TOPEX/Poseidon and ATOC data can also be obtained: *α*_{1} = 0.04 ± 0.03, *α*_{2} = 0.01 ± 0.02, *α*_{3} = 0.06 ± 0.03, and *α*_{4} = 0.12 ± 0.02. This solution differs from that using ATOC data alone in that it predicts less error variance associated with vertical EOF 1 and more with vertical EOF 4, that is, larger model errors above the seasonal thermocline (see Fig. 2). The differences are likely caused by different spatial and temporal extents for the ATOC and TOPEX/Poseidon data and by inaccuracies in the assumed statistical models. All three covariance matching solutions, however, whether from TOPEX/Poseidon data alone, from the ATOC data, or from their combination, predict that about 60% of the GCM–TOPEX/Poseidon residual variance is explained by system error.

Figure 9a displays the mean vertical structure of residual errors along the ATOC acoustic sections. The dotted line indicates the mean standard uncertainty of the acoustic inversions (The ATOC Consortium 1998) and can be compared to the covariance matching estimate of *α*_{5} = 0.31 ± 0.03 (diamonds). Also displayed are the estimated GCM and system standard errors, **p**(*t*) and **q**(*t*), respectively. The acoustic data has limited depth resolution, being better suited to the measurement of top-to-bottom averages. Nevertheless, the data indicates significant errors in the GCM variability from about 100-m to 1000-m depth, with a maximum of 0.2°C at 300 m.

### c. Trend and annual cycle

Trends and annual cycles of the GCM-data residuals, which were excluded from the previous analysis, are discussed next. In the tropical Pacific, the GCM exhibits a warming trend relative to TOPEX/Poseidon data of up to 3 cm year^{−1} (Fig. 10). The acoustic data indicate that most of the warming occurs between the seasonal and main thermoclines, 50–1000-m depth, with a peak warming of 0.1° to 0.2°C year^{−1}, depending on location (Fig. 9b).

For most of the subtropical gyre, both the GCM and TOPEX/Poseidon exhibit maximum sea level anomaly in September (month 9), but the TOPEX/Poseidon amplitude is about 2 cm larger than that of the GCM (Fig. 11). As a result, the peak GCM–TOPEX/Poseidon residual occurs in March (month 3), six months out of phase with the GCM or TOPEX/Poseidon annual cycle. Excluding the surface layer, where resolution is poor, the acoustic data suggest that the annual cycle error is confined to a depth range shallower than 200 m, the phase-locked range in Fig. 9d, with a peak of 0.3°C at 120-m depth (Fig. 9c).
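The six-month phase shift of the residual follows directly from two in-phase cosines of unequal amplitude: their difference is a cosine of the opposite sign. A minimal numerical sketch, assuming purely sinusoidal annual cycles and hypothetical amplitudes of 4 cm (GCM) and 6 cm (TOPEX/Poseidon), chosen only so that they differ by 2 cm:

```python
import math

PEAK_MONTH = 9        # both signals peak in September (from the text)
GCM_AMP_CM = 4.0      # hypothetical GCM amplitude
TP_AMP_CM = 6.0       # hypothetical TOPEX/Poseidon amplitude, 2 cm larger


def annual_cycle(amplitude_cm, peak_month, month):
    """Sinusoidal annual cycle (cm) that is maximum at peak_month."""
    return amplitude_cm * math.cos(2.0 * math.pi * (month - peak_month) / 12.0)


# GCM minus TOPEX/Poseidon residual for calendar months 1..12.
residual = {
    m: annual_cycle(GCM_AMP_CM, PEAK_MONTH, m) - annual_cycle(TP_AMP_CM, PEAK_MONTH, m)
    for m in range(1, 13)
}
peak_residual_month = max(residual, key=residual.get)
```

Here `peak_residual_month` evaluates to 3 (March) and the residual amplitude equals the 2-cm amplitude difference, independently of the two amplitudes chosen.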

## 5. Summary and concluding remarks

The principal contribution of this study is the couching of the GCM error estimation problem in terms of familiar least-squares theory. The so-called “covariance matching” approach makes it possible to take advantage of a large number of tools from discrete linear inverse theory in order to study the statistical properties of the errors. The present study demonstrates that covariance matching is both a powerful diagnostic for addressing theoretical questions and an efficient approach for practical applications.

Assuming system and data errors to be uncorrelated from each other and from the oceanic state, theoretical questions can be addressed in the context of least-squares equations (17)–(18). For a particular GCM and set of measurements, the Green’s function matrix **G** determines which components of the error covariance matrices can be resolved: a component **Q**_{k} of system error covariance matrix **Q** is resolvable only if **HA**^{s}**Q**_{k}(**A**^{s})′**H**′ ≠ **0** for some *s* ≥ 1 (section 2b). At least *N* independent measurements per time step and two covariance matrices (from the set **Y**, **D**_{1}, **D**_{2}, . . . ) are required to fully resolve an *N* × *N* matrix **Q** together with the measurement error covariance matrix **R** (appendix B).
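The need for two covariance matrices can be seen in the simplest case *N* = *M* = 1. The sketch below assumes a scalar toy model that is not taken from the paper: *p*(*t* + 1) = *a* *p*(*t*) + *q*(*t*) and *y*(*t*) = *p*(*t*) + *r*(*t*), with white, mutually uncorrelated *q* and *r*, and with *D*_{1} taken to be the variance of *y*(*t* + 1) − *y*(*t*). All function and variable names are hypothetical.

```python
def forward_covariances(a, q_var, r_var):
    """Return (Y, D1) implied by system variance q_var and data variance r_var."""
    if not -1.0 < a < 1.0:
        raise ValueError("the toy model is stationary only for |a| < 1")
    p_var = q_var / (1.0 - a * a)                    # steady state: P = a^2 P + Q
    y_cov = p_var + r_var                            # Y = P + R
    d1_cov = 2.0 * (1.0 - a) * p_var + 2.0 * r_var   # D1 = 2(1 - a)P + 2R
    return y_cov, d1_cov


def match_covariances(a, y_cov, d1_cov):
    """Invert the two linear relations above for (Q, R)."""
    if a == 0.0:
        # With a = 0, D1 = 2Y: the second relation adds no information.
        raise ValueError("Q and R cannot be separated when a = 0")
    p_var = (2.0 * y_cov - d1_cov) / (2.0 * a)
    r_var = y_cov - p_var
    q_var = p_var * (1.0 - a * a)
    return q_var, r_var
```

Using *Y* alone gives only the sum *P* + *R*; adding *D*_{1} supplies a second, independent linear relation as long as the error dynamics have memory (*a* ≠ 0), so that `match_covariances(a, *forward_covariances(a, q_var, r_var))` recovers the input variances.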

A major obstacle to obtaining statistically significant results is the large uncertainty of sample covariance matrices, *O*(2*σ*^{4}/*p*) where *σ*^{2} is the variance and *p* is the degrees of freedom (appendix C). The sample uncertainty is represented by **R**_{ϵ} in (18) and standard least-squares tools can be used to evaluate the statistical significance of the error estimates (sections 2c and 2d). In general, the number of error covariance parameters, *α*_{k} in (6)–(7), which can be determined with a reasonable degree of statistical significance is two to three orders of magnitude smaller than the total number of independent data.

We illustrate the approach by applying it to a particular integration of the Marshall et al. (1997a,b) GCM, 56 months of TOPEX/Poseidon sea level anomaly data, and 14 months of acoustic tomography data from the ATOC project. The GCM is forced with observed meteorological conditions at the surface and integrated in a global configuration with 1° horizontal grid spacing and 20 vertical levels. A reduced state linear model that describes internal (baroclinic) error dynamics is constructed for the study area (5°–60°N, 132°–252°E).

Twin experiments, using the reduced state model, suggest that altimetric data are ill-suited to the estimation of covariance matrices for internal GCM errors, but that such estimates can in theory be obtained using the acoustic data (Figs. 4 and 5). These conclusions must, however, be qualified in the following way. First, the vertical modes used here are EOFs, not dynamical modes, and second, the tests were conducted using linearized GCM dynamics. We do not exclude the possibility that dynamical modes or fully nonlinear dynamics could enhance the resolution of internal GCM errors from altimetric data.

The GCM exhibits a warming trend relative to TOPEX/Poseidon data of order 1 cm year^{−1} (Fig. 10) corresponding to a peak warming of up to 0.2°C year^{−1} in the acoustic data at depths ranging from 50 to 200 m (Fig. 9b). This trend measures GCM drift. At the annual cycle, GCM and TOPEX/Poseidon sea level anomaly are in phase, but GCM amplitude is 2 cm smaller (Fig. 11). The acoustic data suggest that the annual cycle error is confined to the top 200 m of ocean (Figs. 9c and 9d). These differences result from errors in the surface boundary conditions and in the dynamics of the GCM.

After removal of trends and annual cycles, the variance of the low-frequency/wavenumber (periods >2 months, wavelengths >16°) TOPEX/Poseidon sea level anomaly is of order 6 cm^{2}. The GCM explains about 40% of that variance (Fig. 6). Assuming the error model used to be correct, it is estimated by covariance matching that about 60% of the GCM–TOPEX/Poseidon residual variance is consistent with the reduced state dynamical model (Fig. 7b). This conclusion appears to be relatively robust: it holds for a number of different vertical and horizontal error models, and it is supported by both the altimetric and the acoustic data. The acoustic data measure significant GCM temperature errors in the 100–1000-m depth range with a maximum of 0.3°C rms at 300 m (Fig. 9a). The remaining GCM–TOPEX/Poseidon residual variance is attributed to measurement noise, to barotropic and salinity GCM errors, and to vertical modes of temperature variability that are not represented by the reduced state model.

This and previous studies demonstrate that it *is* possible to obtain simple statistical models for GCM errors that are consistent with the available data. For practical applications, however, the GCM error covariance estimation problem is in general highly underdetermined, much more so than the state estimation problem. In other words there exist a very large number of statistical models that can be made consistent with the available data. Therefore, methods for obtaining quantitative error estimates, powerful though they may be, cannot replace physical insight. But used in the right context, as a tool for guiding the choice of a small number of model parameters, covariance matching can be a useful addition to the repertory of oceanographers seeking to quantify GCM errors or to carry out data assimilation studies.

## Acknowledgments

We thank C. Wunsch for providing scientific guidance and commenting on an early draft of the manuscript. We also thank C. Frankignoul and I. Fukumori for insightful discussions. Financial support was provided by SERDP/ARPA as part of the ATOC project (University of California SIO contract PO 10037358) and by NASA Grant NAGW-1048. MC was partially supported by a NASA Global Change Sciences Fellowship.

## REFERENCES

Anderson, B. D. O., and J. B. Moore, 1979: *Optimal Filtering.* Prentice-Hall, 357 pp.

Anderson, T. W., 1971: *The Statistical Analysis of Time Series.* Wiley Series in Probability and Mathematical Statistics, John Wiley & Sons, 704 pp.

ATOC Consortium, 1998: Ocean climate change: Comparison of acoustic tomography, satellite altimetry, and modeling. *Science,* **281,** 1327–1332.

Belanger, P. R., 1974: Estimation of noise covariance matrices for a linear time-varying stochastic process. *Automatica,* **10,** 267–275.

Bennett, A. F., 1992: *Inverse Methods in Physical Oceanography. Cambridge Monogr. on Mechanics and Applied Mathematics,* Cambridge University Press, 346 pp.

Blanchet, I., C. Frankignoul, and M. A. Cane, 1997: A comparison of adaptive Kalman filters for a tropical Pacific Ocean model. *Mon. Wea. Rev.,* **125,** 40–58.

Cane, M. A., A. Kaplan, R. N. Miller, B. Tang, E. C. Hackert, and A. J. Busalacchi, 1996: Mapping tropical Pacific sea level: Data assimilation via a reduced state space Kalman filter. *J. Geophys. Res.,* **101,** 22 599–22 617.

Chechelnitsky, M., 1999: Adaptive error estimation in linearized ocean general circulation models. Ph.D. thesis, Massachusetts Institute of Technology and Woods Hole Oceanographic Institution, 211 pp. [Available from Joint Program in Physical Oceanography, Massachusetts Institute of Technology, Cambridge, MA 02139.]

Cohn, S. E., 1997: An introduction to estimation theory. *J. Meteor. Soc. Japan,* **75,** 257–288.

Daley, R., 1992: The lagged innovation covariance: A performance diagnostic for atmospheric data assimilation. *Mon. Wea. Rev.,* **120,** 178–196.

Dee, D. P., 1995: On-line estimation of error covariance parameters for atmospheric data assimilation. *Mon. Wea. Rev.,* **123,** 1128–1145.

——, and A. M. da Silva, 1998: Data assimilation in the presence of forecast bias. *Quart. J. Roy. Meteor. Soc.,* **124,** 269–295.

——, S. Cohn, A. Dalcher, and M. Ghil, 1985: An efficient algorithm for estimating noise covariances in distributed systems. *IEEE Trans. Autom. Control,* **30,** 1057–1065.

Evensen, G., D. Dee, and J. Schröter, 1998: Parameter estimation in dynamical models. *Ocean Modeling and Parameterizations,* E. P. Chassignet and J. Verron, Eds., Kluwer Academic, 373–398.

Fu, L.-L., and R. D. Smith, 1996: Global ocean circulation from satellite altimetry and high-resolution computer simulation. *Bull. Amer. Meteor. Soc.,* **77,** 2625–2636.

——, I. Fukumori, and R. N. Miller, 1993: Fitting dynamic models to the Geosat sea level observations in the tropical Pacific Ocean. Part II: A linear, wind-driven model. *J. Phys. Oceanogr.,* **23,** 2162–2181.

Fukumori, I., 1999: Assimilation of TOPEX/Poseidon altimeter data into a global ocean circulation model: Are the results any good? *J. Geophys. Res.,* in press.

——, and P. Malanotte-Rizzoli, 1995: An approximate Kalman filter for ocean data assimilation: An example with an idealized Gulf Stream model. *J. Geophys. Res.,* **100** (C4), 6777–6793.

Gajić, Z., and M. T. J. Qureshi, 1995: *Lyapunov Matrix Equation in System Stability and Control.* Academic Press, 255 pp.

Groutage, F. D., R. G. Jacquot, and R. L. Kirlin, 1987: Techniques for adaptive state estimation through the utilization of robust smoothing. *Control and Dynamic Systems,* C. T. Leondes, Ed., Vol. 25, Academic Press, 273–308.

Hayes, S. P., L. J. Mangum, J. Picaut, A. Sumi, and K. Takeuchi, 1991: TOGA–TAO: A moored array for real-time measurements in the tropical Pacific Ocean. *Bull. Amer. Meteor. Soc.,* **72,** 339–347.

Holland, W. R., and P. Malanotte-Rizzoli, 1989: Assimilation of altimeter data into an oceanic circulation model: Space versus time resolution studies. *J. Phys. Oceanogr.,* **19,** 1507–1534.

Isaksson, A., 1988: On system identification in one and two dimensions with signal processing applications. Ph.D. thesis, Linköping University, 249 pp. [Available from Dept. of Electrical Engineering, Linköping University, S-581 83 Linköping, Sweden.]

Malanotte-Rizzoli, P., Ed., 1996: *Modern Approaches to Data Assimilation in Ocean Modeling.* Elsevier Oceanography Series, Vol. 61, Elsevier, 455 pp.

Mardia, K. V., J. T. Kent, and J. M. Bibby, 1979: *Multivariate Analysis.* Academic Press, 521 pp.

Marshall, J., A. Adcroft, C. Hill, L. Perelman, and C. Heisey, 1997a: A finite-volume, incompressible Navier–Stokes model for studies of the ocean on parallel computers. *J. Geophys. Res.,* **102** (C3), 5753–5766.

——, C. Hill, L. Perelman, and A. Adcroft, 1997b: Hydrostatic, quasi-hydrostatic and non-hydrostatic ocean modeling. *J. Geophys. Res.,* **102** (C3), 5733–5752.

Maybeck, P. S., 1979: *Stochastic Models, Estimation, and Control.* Academic Press, 423 pp.

Menemenlis, D., and C. Wunsch, 1997: Linearization of an oceanic circulation model for data assimilation and climate studies. *J. Atmos. Oceanic Technol.,* **14,** 1420–1443.

——, P. W. Fieguth, C. Wunsch, and A. S. Willsky, 1997: Adaptation of a fast optimal interpolation algorithm to the mapping of oceanographic data. *J. Geophys. Res.,* **102** (C5), 10 573–10 584.

Menke, W., 1989: *Geophysical Data Analysis: Discrete Inverse Theory.* International Geophysics Series, Vol. 45, Academic Press, 285 pp.

Moghaddamjoo, R. R., and R. L. Kirlin, 1993: Robust adaptive Kalman filtering. *Approximate Kalman Filtering,* G. Chen, Ed., World Scientific, 65–85.

Myers, K. A., and B. D. Tapley, 1976: Adaptive sequential estimation with unknown noise statistics. *IEEE Trans. Autom. Control,* **21,** 520–523.

Shellenbarger, J., 1966: Estimation of covariance parameters for an adaptive Kalman filter. *Proc. National Electronics Conf.,* Chicago, IL, National Engineering Consortium, 698–702.

Stammer, D., and C. Wunsch, 1994: Preliminary assessment of the accuracy and precision of TOPEX/Poseidon altimeter data with respect to the large-scale ocean circulation. *J. Geophys. Res.,* **99** (C12), 24 584–24 604.

——, R. Tokmakian, A. Semtner, and C. Wunsch, 1996: How well does a ¼° global circulation model simulate large-scale oceanic observations? *J. Geophys. Res.,* **101** (C11), 25 779–25 811.

Wunsch, C., 1996: *The Ocean Circulation Inverse Problem.* Cambridge University Press, 442 pp.

## APPENDIX A

### The Fu et al. (1993) Approach

Here **p**(*t*) and **r**(*t*) are GCM and data errors, respectively, and **y**(*t*) is the GCM-data residual. Multiplying each expression by its transpose and taking expectations yields the corresponding covariance relations, where it is assumed that **r**(*t*) is uncorrelated from **x**_{ocean}(*t*) and from **p**(*t*). Fu et al. (1993) further assume that 〈**x**_{ocean}**p**′〉 = **0** and solve for **R** and **HPH**′. Given estimates of **R** and **HPH**′ that do not rely on 〈**x**_{ocean}**p**′〉 = **0**, it is possible to evaluate the validity of this assumption.
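As a sketch of the algebra behind this argument, the relations below are derived under assumed conventions that may differ from the original notation: the GCM state is written **x**_{GCM} = **x**_{ocean} + **p**, the data are written **η** = **Hx**_{ocean} − **r** (the symbol **η** is introduced here for illustration), so that **y** = **Hx**_{GCM} − **η** = **Hp** + **r**, and all quantities are anomalies with zero mean.

```latex
% Assumed conventions (illustrative, not taken from the source):
%   x_GCM = x_ocean + p,   eta = H x_ocean - r,   y = H x_GCM - eta = H p + r
\begin{align}
\mathbf{H}\langle\mathbf{x}_{\rm GCM}\mathbf{x}_{\rm GCM}'\rangle\mathbf{H}'
  &= \mathbf{H}\langle\mathbf{x}_{\rm ocean}\mathbf{x}_{\rm ocean}'\rangle\mathbf{H}'
   + \mathbf{HPH}'
   + \mathbf{H}\left(\langle\mathbf{x}_{\rm ocean}\mathbf{p}'\rangle
   + \langle\mathbf{p}\,\mathbf{x}_{\rm ocean}'\rangle\right)\mathbf{H}', \\
\langle\boldsymbol{\eta}\boldsymbol{\eta}'\rangle
  &= \mathbf{H}\langle\mathbf{x}_{\rm ocean}\mathbf{x}_{\rm ocean}'\rangle\mathbf{H}'
   + \mathbf{R}, \\
\mathbf{Y} \equiv \langle\mathbf{y}\mathbf{y}'\rangle
  &= \mathbf{HPH}' + \mathbf{R}.
\end{align}
% If <x_ocean p'> = 0, the first two relations give HPH' - R and the third gives HPH' + R:
\begin{align}
\mathbf{HPH}' &= \tfrac{1}{2}\left(\mathbf{Y}
   + \mathbf{H}\langle\mathbf{x}_{\rm GCM}\mathbf{x}_{\rm GCM}'\rangle\mathbf{H}'
   - \langle\boldsymbol{\eta}\boldsymbol{\eta}'\rangle\right), \\
\mathbf{R} &= \tfrac{1}{2}\left(\mathbf{Y}
   - \mathbf{H}\langle\mathbf{x}_{\rm GCM}\mathbf{x}_{\rm GCM}'\rangle\mathbf{H}'
   + \langle\boldsymbol{\eta}\boldsymbol{\eta}'\rangle\right).
\end{align}
% Without that assumption, independent estimates of HPH' and R yield the cross term:
\begin{equation}
\mathbf{H}\left(\langle\mathbf{x}_{\rm ocean}\mathbf{p}'\rangle
   + \langle\mathbf{p}\,\mathbf{x}_{\rm ocean}'\rangle\right)\mathbf{H}'
  = \mathbf{H}\langle\mathbf{x}_{\rm GCM}\mathbf{x}_{\rm GCM}'\rangle\mathbf{H}'
   - \langle\boldsymbol{\eta}\boldsymbol{\eta}'\rangle
   - \mathbf{HPH}' + \mathbf{R}.
\end{equation}
```

Under these conventions, a right-hand side of the last relation that is small compared with **HPH**′ would support the assumption 〈**x**_{ocean}**p**′〉 = **0**, whereas a comparable magnitude would indicate that GCM errors are correlated with the oceanic state.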

## APPENDIX B

### Maximum Number of Resolvable Parameters

The following restrictions apply to the estimation of error covariance matrices **Q** and **R**. First, the maximum number of parameters *α*_{k} in (6)–(7) that can be resolved is *M*(*N* + 1), where *M* ⩽ *N* is the number of independent observations at a given time step and *N* is the state dimension [the length of vector **p**(*t*)]. Second, the maximum number of parameters *α*_{k} describing **Q** that can be resolved is *M*(*N* + 1) − *M*(*M* + 1)/2. This restriction applies whether **R** is known or not. Third, the number of parameters *α*_{k} that can be resolved using *S* data covariance matrices (from the set **Y**, **D**_{1}, **D**_{2}, . . . ) is at most *SM*(*M* + 1)/2. Proofs are established for the covariance matching approach by computing the rank of the Green’s function matrix in (13) (Chechelnitsky 1999).

Consider an *N* × *N* system error covariance matrix **Q** and an *M* × *M* measurement error covariance matrix **R**. Matrices **Q** and **R** are symmetric and are therefore fully described by *N*(*N* + 1)/2 and by *M*(*M* + 1)/2 parameters, respectively. To fully resolve **Q** and **R** requires that *N* independent observations be available (i.e., *M* = *N*) and that at least two data covariance matrices (e.g., **Y** and **D**_{1}) be used.
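The three bounds are simple to tabulate. The helper below is a hypothetical convenience function, not part of the original study, that evaluates them for given *N*, *M*, and *S*, so that the requirement *M* = *N* with at least two covariance matrices can be checked numerically.

```python
def symmetric_parameter_count(n):
    """Number of free parameters in a symmetric n x n matrix."""
    return n * (n + 1) // 2


def resolvable_parameter_bounds(n_state, m_obs, s_matrices):
    """Upper bounds on the number of resolvable parameters alpha_k.

    n_state    : state dimension N
    m_obs      : independent observations per time step M (M <= N)
    s_matrices : number S of data covariance matrices used
    """
    if not 1 <= m_obs <= n_state:
        raise ValueError("require 1 <= M <= N")
    if s_matrices < 1:
        raise ValueError("require S >= 1")
    total = m_obs * (n_state + 1)                        # first bound: M(N + 1)
    q_only = total - symmetric_parameter_count(m_obs)    # second bound, for Q alone
    from_data = s_matrices * symmetric_parameter_count(m_obs)  # third bound
    return {"total": total, "q_only": q_only, "from_data": from_data}
```

For example, with *N* = *M* = 4 the unknowns number 10 for **Q** plus 10 for **R**; a single covariance matrix bounds the resolvable parameters at 10, while two matrices raise the bound to 20, matching the total.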

## APPENDIX C

### Uncertainty of Sample Covariance

In (C1), **Y**_{(i,j)}(*q*) denotes the (*i, j*)th element of the lag-*q* covariance matrix 〈[**y**(*t* + *q*) − 〈**y**〉][**y**(*t*) − 〈**y**〉]′〉, and **Ỹ**_{(i,j)}(*q*) is the corresponding sample covariance. This formula is a generalization of the univariate expression derived by Anderson [1971, section 8.2, Eq. (64)]. The formula is exact for Gaussian time series. In practice, however, lag covariances in (C1) are replaced by sample estimates, leading to approximate solutions. A useful approximation is (C2), where *p* ⩽ *T* is the number of degrees of freedom (roughly, the number of time steps *T* divided by the *e*-folding correlation period). From (C2), the variance of a sample covariance is *O*[*σ*^{2}_{1}*σ*^{2}_{2}(1 + *ρ*^{2})/*p*], where *σ*^{2}_{1} and *σ*^{2}_{2} are the variances of the two time series and *ρ* is the correlation coefficient. The probability distribution of a sample covariance is approximately normal for *p* > 30; it is chi-square for *ρ* = ±1 (e.g., Mardia et al. 1979). Uncertainty for the lag-*s* difference covariance matrix (12) can be computed by observing that **D**_{s} is itself the covariance matrix of the differenced time series **y**(*t* + *s*) − **y**(*t*).
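The order-of-magnitude expression above can be turned into a quick estimate of how many degrees of freedom a given accuracy requires. The sketch below treats the *O*[·] expression as an equality, which is an assumption made for illustration; the function names are hypothetical.

```python
from math import ceil, sqrt


def sample_covariance_variance(var1, var2, rho, dof):
    """Approximate variance of a sample covariance: var1*var2*(1 + rho^2)/dof."""
    if dof <= 0:
        raise ValueError("degrees of freedom must be positive")
    if not -1.0 <= rho <= 1.0:
        raise ValueError("correlation coefficient must lie in [-1, 1]")
    return var1 * var2 * (1.0 + rho * rho) / dof


def relative_error_of_sample_variance(dof):
    """Relative standard error of a sample variance (var1 = var2, rho = 1)."""
    # sqrt(2*sigma^4/dof) / sigma^2 = sqrt(2/dof)
    return sqrt(2.0 / dof)


def dof_for_relative_error(target):
    """Smallest integer dof with sqrt(2/dof) <= target."""
    return ceil(2.0 / target**2)
```

For a variance (*σ*^{2}_{1} = *σ*^{2}_{2} = *σ*^{2}, *ρ* = 1) the first function reduces to 2*σ*^{4}/*p*, so the relative standard error is √(2/*p*): about 20% for *p* = 50, and halving it requires four times as many degrees of freedom.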

Summary of notation.

Estimates of system error covariance matrix **Q** for **R** = **0**, obtained from **D̃**_{1}, with an average standard error of 18% as compared to 38% for **Ỹ**.

Estimates of system and measurement error variance based on 14 months of simulated acoustic tomography data with **R****I****Q**

Estimates of system error covariance matrix **Q** for **R** = **0**, obtained using **Ỹ**, **D̃**_{1}, . . ., **D̃**_{5}, simultaneously.