## 1. Introduction

In the present study, the error subspace statistical estimation (ESSE) and optimal interpolation (OI) schemes are comparatively applied to Middle Atlantic Bight shelfbreak front simulations. To assess the ESSE capabilities, the ideal perfect model situation is chosen. The benchmarks employed are identical twin experiments. The OI scheme used is the standard operational method of the Harvard Ocean Prediction System (HOPS; e.g., Lozano et al. 1996; Robinson et al. 1996), optimized to the shelfbreak front situation. The ESSE filtering scheme utilized in these comparative experiments is described in the first part of this study. Both the OI and ESSE methods improve the forecast, which exemplifies the need for data assimilation, even in the ideal perfect model situation. After several assimilations, the OI is shown to retrieve 75% of the patterns of the simulated true ocean (section 6). ESSE, in accord with the error subspace convergence criterion employed (Lermusiaux and Robinson 1999, hereafter Part I) achieves a 95% retrieval. Other ESSE improvements, like the tracking of the error subspace nonlinear evolution and the dominant error forecast, are also exemplified. Considering costs, OI is cheaper than ESSE by a factor approximately equal to the size of the error subspace (ES) divided by the number of central processing units (CPUs) used in parallel. In the present application, this factor is of the order of 20. As was argued in Part I, ESSE can be employed to refine the a priori assumptions made in other reduced schemes, in this case OI. The preexercise investigation of the ESSE error vectors and weights could yield refinements of the OI weights and, hence, lead to a more robust and efficient assimilation system for subsequent rapid and sustained assessments in very large regions of complex dynamics.

The text is organized as follows. Section 2 summarizes the physical background. Section 3 discusses the numerical and physical parameters of the dynamical model. Section 4 describes the 39-day twin experiments utilized in this study. Section 5 deals with the OI and ESSE respective parameters. The comparative assimilations are analyzed in section 6. The ES nonlinear evolution is exemplified and the cost of ESSE is compared to that of several other data assimilation (DA) methods. The conclusions, with several of the ESSE advantages and properties, are given in section 7.

## 2. GFD background

The geophysical fluid dynamics (GFD) experiments used in this study to evaluate the ESSE concept and test the corresponding schemes (Part I) are idealized shelfbreak front simulations. The basic state of the idealized tilted fronts is created using a feature model (Sloan 1996) of observed Middle Atlantic Bight (MAB) shelfbreak front properties in a summer situation (e.g., Garvine et al. 1988). For a review and extensive simulations of the phenomena occurring in the MAB, we refer to Beardsley and Boicourt (1981) and Sloan (1996). The common dominant feature in the MAB consists of a temperature and salinity front, separating the shelf and deep ocean water masses. This front is often located above the shelf break. In the present feature model, the front separates the cold fresh shelf water (8.5°C ⩽ *T* ⩽ 15.5°C) to the north and the warm salty slope water to the south (13°C ⩽ *T* ⩽ 22°C). The geometry is simplified to a periodic channel, over a zonally uniform sloping topography. Since the focus is on the shelfbreak region, the shelf and deep slope regions are idealized to flat bottom boundaries (Fig. 1). The linear tilt in topography is in the opposite direction of the tilted front and has a stabilizing effect (Sloan 1996). For representing summer conditions, the surface is exponentially stratified, down to 30 m. The basic state is in thermal wind balance, with the main flow east–west and zero flow at the bottom. It is a steady-state solution of the primitive equations (PE).

To create dynamical evolutions, two types of small perturbations are added initially to the basic steady state: one is in geostrophic balance, the other is a random white noise in space. For each PE field, the perturbation amplitude relative to the associated background field varies between 1% and 5%. Cross sections and surface horizontal maps of temperature and salinity in one such initial field are shown in Fig. 1. All scales are perturbed. An ensemble of initial conditions (IC) yields a varied spectrum of nonlinear dynamical evolutions. The specifics of each evolution are a function of the location and shape of the dominant instabilities that the perturbation has initiated. Projecting the ensemble of perturbations onto the initial optimal perturbations (OP) associated with a given time interval (e.g., Farrell and Moore 1992; Palmer 1993; Farrell and Ioannou 1996a,b), the amplitudes along each OP grow within that interval in proportion to the corresponding OP singular values. For some specific realizations, slowly growing OP can dominate the variability for a certain time if the initial perturbations projected strongly on these OP structures. On ensemble average, as the OP spectrum establishes itself, the nonlinear transfers of energy between growing perturbations and the basic initial state also modify the total field evolutions. Hence, the transfer matrix of the continued linear approximation of the nonlinear system evolves in time. The OP structures and spectrum are thus time dependent.
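As a minimal illustration of the optimal-perturbation statement above, the following sketch takes a small, hypothetical non-normal propagator matrix `M` (illustrative values, not derived from the PE model) and computes its singular value decomposition. Under this linear assumption, the right singular vectors play the role of the initial OPs, and an initial perturbation aligned with one of them is amplified over the interval by the corresponding singular value.

```python
import numpy as np

# Hypothetical linear propagator over one time interval (illustrative values only).
M = np.array([[0.9, 2.0],
              [0.0, 0.8]])

# M = U diag(s) Vt: rows of Vt act as initial optimal perturbations,
# columns of U are the evolved structures, and s holds the growth factors.
U, s, Vt = np.linalg.svd(M)

def growth(perturbation):
    """Amplification (l2 norm) of an initial perturbation over the interval."""
    p = np.asarray(perturbation, dtype=float)
    return np.linalg.norm(M @ p) / np.linalg.norm(p)

# Amplitudes along each OP grow in proportion to its singular value.
op_growth = np.array([growth(Vt[i]) for i in range(len(s))])

# A generic perturbation projects on both OPs, so its growth is bounded
# by the smallest and largest singular values.
generic_growth = growth([1.0, 0.0])
```

Note that both eigenvalues of this `M` are smaller than one while its leading singular value exceeds one, which is the transient growth that OPs are meant to capture.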

For data assimilation, an important feature of the present simulation is the across-shelf asymmetry characterized by the tilted density front, the stabilizing tilted steep topographic slope (0.16%), and the surface temperature stratification (Fig. 1). Preferred directions and locations exist and correlations among variations of variability (Part I) are thus expected to be inhomogeneous, anisotropic, and nonuniform.

## 3. Dynamical model, numerical and physical parameters

The HOPS nonlinear rigid-lid PE model, in the *f*-plane configuration, is used for dynamical forecast and the HOPS system for peripherals (Robinson et al. 1996). For confidence, several simulations of different physical (missing parameterizations, frontal width) and numerical parameters (domain sizes, horizontal and vertical grid resolutions, meridional extent) were tested with ESSE assimilation. All were successful. The parameters of the simulation discussed in this presentation are summarized in Table 1. They are similar to those of the example briefly discussed in Robinson et al. (1998).

The bottom depth linearly increases from 123.75 m on the shelf to 232.5 m in the deepest southern region (Fig. 1). The horizontal resolution of 2.5 km is chosen so as to resolve the large submesoscales. The levels are distributed vertically for optimum representation of the surface stratification, tilted front, and topography (Fig. 2). The total number of grid points is 41 × 46 × 19 = 35 834. The PE state vector **ψ** comprises the temperature (*T*), salinity (*S*), internal velocity (*û, υ̂*), and barotropic transport streamfunction (*ψ*) fields. For high wavenumber filtering and mixing, a Shapiro filter (Shapiro 1970, 1971; Lermusiaux 1997) of order 8 is applied once every time step to the internal velocity and tracer fields (8–1–1). The time rate of change of barotropic vorticity is filtered with a Shapiro filter 2–3–1 (order 2, 3 times every time step). The associated effective diffusivities and viscosities (Lermusiaux 1997) are adequate for the MAB front (Sloan 1996). The boundary conditions are cyclic in the zonal direction. On the channel walls, the condition is “no slip” for velocities, “no flux” on tracers and constant barotropic vorticity. There is no surface forcing. The bottom stress is set to *C*_{d}|**u**|**u**, where **u** is the total horizontal velocity and *C*_{d} = 2.5 × 10^{−3} is the drag coefficient.
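To make the high-wavenumber filtering concrete, here is a minimal one-dimensional sketch of a Shapiro-type filter on a cyclic grid. It is not the HOPS implementation. The function name, the periodic treatment, and the convention relating `half_order` to the "order" quoted above are assumptions made for illustration. The filter applies the second-difference operator `half_order` times, which removes the two-grid-interval wave exactly and leaves well-resolved scales almost untouched.

```python
import numpy as np

def shapiro_filter_periodic(field, half_order=4, passes=1):
    """1D Shapiro-type filter on a periodic grid (illustrative sketch).

    half_order is the number of second-difference applications; mapping it
    to the 'order' used in the text is an assumption of this sketch.
    """
    out = np.asarray(field, dtype=float).copy()
    for _ in range(passes):
        d = out.copy()
        for _ in range(half_order):
            # Centered second difference with cyclic neighbours.
            d = np.roll(d, -1) - 2.0 * d + np.roll(d, 1)
        # Spectral response is 1 - sin^(2*half_order)(k*dx/2).
        out = out + (-1.0) ** (half_order - 1) * d / 4.0 ** half_order
    return out

x = np.arange(16)
two_dx_wave = (-1.0) ** x                 # shortest resolvable wave
long_wave = np.sin(2.0 * np.pi * x / 16)  # well-resolved wave

filtered_short = shapiro_filter_periodic(two_dx_wave)
filtered_long = shapiro_filter_periodic(long_wave)
```

Increasing `passes` mimics applying the filter several times per time step, as in the 2–3–1 setting quoted above.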

A thorough discussion on the numbers relevant to the shelfbreak front dynamics is given in Sloan (1996). For the assimilation, the essential nondimensional numbers are the internal Rossby radius of deformation, *R*_{d} = (1/*f*)[*g*(Δ*ρ*/*ρ*)*D*]^{1/2}, and the Rossby number, Ro = *U*/*fL* (e.g., Pedlosky 1987), where *f* is the Coriolis frequency, *g* the gravity, Δ*ρ*/*ρ* a characteristic density difference ratio over the vertical scale of motion *D,* and *U* a characteristic horizontal velocity over the horizontal scale of motion *L.* Because of the summer temperature stratification, the perturbed feature model (Figs. 1a,b) has two dominant horizontal scales *R*_{d} of local quasigeostrophic adjustment. One is associated with the main shelfbreak front, below the stratification. The other corresponds to the portion of this front that penetrates the bottom 10 m of the prescribed stratified layer. At these depths, the *T* stratification is not yet strong enough to dominate but is important enough to locally modify the orientation, strength, and extent of the front (Figs. 1a,b). One can assign a Δ*ρ*/*ρ* and a *D* to each of these vertically limited features. In numerical values, the corresponding “frontal *R*_{d}” is about *R*^{F}_{d}*R*_{d}” is about *R*^{S}_{d}
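The two definitions above can be evaluated directly. The sketch below is only a worked example of the formulas *R*_{d} = (1/*f*)[*g*(Δ*ρ*/*ρ*)*D*]^{1/2} and Ro = *U*/*fL*. The latitude, density ratio, depth scale, velocity, and length scale are hypothetical placeholders, not the values used in this study.

```python
import math

G = 9.81            # gravity, m s^-2
OMEGA = 7.292e-5    # Earth's rotation rate, rad s^-1

def coriolis_frequency(lat_deg):
    """Coriolis frequency f = 2 * Omega * sin(latitude), in s^-1."""
    return 2.0 * OMEGA * math.sin(math.radians(lat_deg))

def internal_rossby_radius(f, drho_over_rho, depth_scale):
    """R_d = (1/f) * sqrt(g * (drho/rho) * D), in metres."""
    return math.sqrt(G * drho_over_rho * depth_scale) / f

def rossby_number(u_scale, f, length_scale):
    """Ro = U / (f * L), nondimensional."""
    return u_scale / (f * length_scale)

# Hypothetical inputs, chosen only to exercise the formulas.
f = coriolis_frequency(39.0)                              # midlatitude f-plane
rd = internal_rossby_radius(f, drho_over_rho=1.0e-3, depth_scale=100.0)
ro = rossby_number(u_scale=0.2, f=f, length_scale=rd)
```

Each vertically limited feature would get its own (Δ*ρ*/*ρ*, *D*) pair, and hence its own radius, by calling `internal_rossby_radius` with the corresponding inputs.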

## 4. Identical twin experiment

For this first assessment of ESSE, an identical twin experiment is employed (Fig. 3). A numerical model simulation during a certain time interval is chosen to be the “true ocean” or control run. A subsampled dataset is then extracted from this simulation. Starting from ICs sufficiently independent from the simulated true ones, a second simulation is evolved, using the same dynamical model as for the simulated truth. This defines the “estimated ocean,” in which the subsampled data is assimilated. For each assimilation scheme utilized, such an estimated evolution is carried out, as if one was making real ocean observations in the true ocean. The purpose of such experiments is to analyze, in ideal exact dynamical model conditions, the quality of the true ocean retrieval as a function of the DA schemes used.
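The twin-experiment logic can be sketched with a toy problem. In the example below, the "dynamical model" is a hypothetical one-point cyclic advection, the data are noise-free subsamples of the simulated truth, and the assimilation is a plain direct insertion that stands in for the OI and ESSE schemes. None of these choices come from the study; they only illustrate the truth run, the estimated run started from different ICs, and the comparison of the two.

```python
import numpy as np

def model_step(state):
    """Toy perfect model: advect a cyclic 1D field by one grid point."""
    return np.roll(state, 1)

def run_twin(n=40, n_steps=12, assim_every=3, obs_stride=4, assimilate=True):
    """Return the rms error history of the estimated run against the truth."""
    x = 2.0 * np.pi * np.arange(n) / n
    truth = np.sin(x) + 0.5 * np.sin(3.0 * x)  # simulated "true ocean" ICs
    estimate = np.sin(x)                        # ICs of the estimated ocean
    obs_idx = np.arange(0, n, obs_stride)       # fixed, subsampled data array
    rms = []
    for step in range(1, n_steps + 1):
        truth = model_step(truth)               # same model for both runs
        estimate = model_step(estimate)
        if assimilate and step % assim_every == 0:
            data = truth[obs_idx]               # noise-free data from the truth
            estimate[obs_idx] = data            # direct insertion (stand-in scheme)
        rms.append(np.sqrt(np.mean((estimate - truth) ** 2)))
    return np.array(rms)

rms_free = run_twin(assimilate=False)  # no assimilation: IC error persists
rms_da = run_twin(assimilate=True)     # repeated assimilation of the same array
```

Because the model is exact, any error reduction in `rms_da` relative to `rms_free` is due to the data alone, which is the point of the identical twin setup.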

As described in Fig. 3a, the true ocean is a 39-day PE model run starting from the initial conditions *ψ*^{t}_{0}*ψ̂*^{f}_{0}*ψ̂*^{f}_{0}*ψ*^{t}_{0}*n*^{f}_{0}**v**_{k} = 0 (Part I). This choice is justified since for both the OI and ESSE, if the data noise characteristics are correct in the estimation criterion, the assimilation will succeed. For ultimate verification, a forecast is issued for day 39 and compared with the truth. In this text, only the experiment of Fig. 3 is discussed in detail. Other successful comparisons carried out involved simulated current data, and local instead of global assimilations.

It is important to assess the need for the assimilation of both temperature and salinity data. In the present MAB simulations, the tracers are usually strongly correlated spatially. With ESSE, observing one tracer can suffice; the evolving, complex multivariate ES covariances allow the correction of one tracer from the observation of the other. On average, assimilating both tracers instead of one via ESSE only improved the rms error by 5%. If, in advance, one knew the adequate error covariances, the same could be achieved with OI. However, for the variability of the tilted front, simple water mass models or classic OI cross-correlation functions have difficulties in giving accurate estimations of salinity from temperature. It is only to eliminate this OI issue that CTD, instead of expendable bathythermograph (XBT), casts are assimilated in the following comparisons. One objective of this work is, in fact, to show that, if the use of ESSE is too expensive on a sustained basis, the prestudy of the ESSE weights can refine the water-mass models employed in the OI schemes.
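The correction of one tracer from observations of the other can be illustrated with a generic minimum-error-variance update written in terms of an error subspace, **P** ≈ **E** diag(λ) **E**^{T}. This is a sketch under assumptions: the two-variable state, the error covariance, and the function name are hypothetical, and the update is the textbook gain rather than a transcription of the Part I tables.

```python
import numpy as np

def subspace_update(x_forecast, E, lam, H, R, d):
    """Minimum-error-variance update with P = E diag(lam) E^T.

    E: error eigenvectors (columns), lam: error eigenvalues,
    H: observation matrix, R: data error covariance, d: data vector.
    """
    HE = H @ E
    innovation_cov = HE @ np.diag(lam) @ HE.T + R
    gain = E @ np.diag(lam) @ HE.T @ np.linalg.inv(innovation_cov)
    return x_forecast + gain @ (d - H @ x_forecast)

# Hypothetical normalized state [T', S'] whose errors are strongly correlated.
P = np.array([[1.0, 0.8],
              [0.8, 1.0]])
lam, E = np.linalg.eigh(P)   # eigen-decomposition of the error covariance

H = np.array([[1.0, 0.0]])   # only temperature is observed
R = np.array([[0.0]])        # noise-free data, as in a perfect twin setup

x_a = subspace_update(np.zeros(2), E, lam, H, R, np.array([1.0]))
# x_a is approximately [1.0, 0.8]: the unobserved S' is corrected through
# the T'-S' error covariance.
```

With an isotropic, univariate error model the second component would stay at zero, which is the difficulty attributed above to simple water-mass models.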

## 5. HOPS OI/ESSE parameters

The OI scheme employed is discussed in the appendix. It is that of the operational HOPS methodologies. The ESSE filtering scheme was described in Part I of this study. The parameters specific to the present comparisons are now described.

For both methods, data are assimilated at observation times only. In the Harvard OI scheme, *g*(*t* − *t*_{k}) is thus chosen equal to a numerical Dirac function (see appendix). Since the true ocean is simulated, the OI handicaps were reduced by optimizing the OI parameters. The OI involves first an objective analysis (OA) of the sensor data, in two stages (Fig. A1, appendix): the synoptic/mesoscales are computed after the mapping of a large-scale, background field. For the present OA of tracer fields, the along-/across-shelf advection and mixing properties yield anisotropic correlation length scales. In the first-stage large-scale OA, the zonal and meridional decay scales employed are, respectively, *l*^{x}_{2}*l*^{y}_{2}*l*^{x}_{1}*l*^{y}_{1}*l*^{x}_{2}*l*^{y}_{2}*l*^{x}_{1}*l*^{y}_{1}*ψ* obtained assuming a uniform, flat reference level were not accurate enough to be assimilated. In the present OI, only the analyzed tracers and internal velocities are thus blended with the forecast. The barotropic transport *ψ* is affected by the data after the assimilation, via dynamical adjustments to the melded tracers and internal velocities. A more sophisticated OI involves primitive equation balancing of the OA transport prior to the assimilation (e.g., Lermusiaux 1997).
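The role of anisotropic decay scales in an objective analysis can be sketched as follows. The Gaussian correlation function, the scalar background, the noise variance, and all numerical values are assumptions for illustration. The operational HOPS correlation function and the decay scales used in this study are not reproduced here.

```python
import numpy as np

def gaussian_corr(dx, dy, lx, ly):
    """Anisotropic Gaussian correlation with zonal/meridional decay scales."""
    return np.exp(-0.5 * ((dx / lx) ** 2 + (dy / ly) ** 2))

def objective_analysis(xo, yo, d, xg, yg, background, lx, ly, noise_var=0.01):
    """Map data d at (xo, yo) onto points (xg, yg) around a scalar background."""
    xo, yo, d = (np.asarray(a, dtype=float) for a in (xo, yo, d))
    xg, yg = np.asarray(xg, dtype=float), np.asarray(yg, dtype=float)
    # Data-data and grid-data correlation matrices.
    c_oo = gaussian_corr(xo[:, None] - xo[None, :], yo[:, None] - yo[None, :], lx, ly)
    c_go = gaussian_corr(xg[:, None] - xo[None, :], yg[:, None] - yo[None, :], lx, ly)
    # Gauss-Markov weights applied to the data-minus-background residuals.
    weights = np.linalg.solve(c_oo + noise_var * np.eye(xo.size), d - background)
    return background + c_go @ weights

# Hypothetical casts along a zonal line (km) and hypothetical decay scales (km).
xo = np.array([0.0, 10.0, 20.0])
yo = np.zeros(3)
data = np.array([1.0, 2.0, 3.0])
at_casts = objective_analysis(xo, yo, data, xo, yo, 0.0, lx=5.0, ly=2.5, noise_var=1e-10)
far_away = objective_analysis(xo, yo, data, [1000.0], [0.0], 0.0, lx=5.0, ly=2.5)
```

A two-stage analysis of the kind described above would call `objective_analysis` twice: first with large scales to build the background, then with shorter scales on the residuals.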

For the ESSE, the filtering scheme of Part I, Tables 3 and 5, is employed. The notation in this paper is as in Part I (see Part I, appendix A). For example, the symbols (−) and (+) distinguish a priori and a posteriori quantities. The stochastic model errors are set to zero (**dw** = 0). The a posteriori ensemble initial conditions are created using Eq. (39c) (Part I). The singular value decomposition (SVD) of the ensemble spread is updated and the analysis of Burgers et al. (1998) reduced to its significant subspace. Here, the size of the ensemble evolves with time, in accord with data and dynamics, as dictated by the similarity coefficient [Part I, Eq. (B3)]. If during the assimilation this coefficient *ρ* requires an a posteriori ensemble size *p*_{k}(+) larger than the a priori one *p*_{k}(−), Eq. (39a) of Part I is used for the additional ICs. Note that in the present simulations, employing Part I, Eq. (39a), for all members also led to successful assimilations. In the data–dynamics melding (Part I, Table 3), the measurement errors are assumed to be uncorrelated, time invariant, and constant in the vertical with *r*^{T}_{ii}^{2} and *r*^{S}_{ii}^{2} (the diagonal elements of **R***ψ̂*^{cf}_{k+1}(−)

## 6. HOPS OI/ESSE comparisons

The ESSE and OI are now compared using the twin experiment of section 4. The schematic of the assimilations occurring at day 18 is shown in Fig. 3b. The ocean forecast *ψ̂*^{f}_{18}**d**_{18} is subsampled from the true ocean and melded with the forecast, via the OI or ESSE scheme. The schemes have their own error models, hence, their own melding weights and melded fields, *ψ̂*^{OI}_{18}*ψ̂*^{ESSE}_{18}

### a. Day 18

#### 1) Primitive equation fields

The level-5 (10–20 m) estimate of the ocean temperature forecast for day 18 is shown in Fig. 4a. Figure 4b shows the corresponding simulated true ocean *T* field from which the day-18 data were subsampled. Looking first at Figs. 4a,b, there are several energetic scales of variability, from submesoscale to mesoscale shelfbreak wave patterns and eddies. Note also that the shape and nature of the features that develop on the cold and warm sides of the front in general differ. For example, as seen in Figs. 4a and 4b, most of the surface eddies on day 18 are on the warm side. The ESSE organization of the ensemble of such simulations allows the study of the dominant statistics of these evolving features. For instance, the cold submesoscale to mesoscale eddies in the slope water develop at all times, but they first adjust around day 5 to day 15 by a combination of frontal instabilities, meridionally sheared advection, and diffusion. When an extrusion of surface slope water is large enough (*L* ≥ *R*^{F}_{d}

The OI retrieval of the true *T* is shown by Fig. 4c, the ESSE retrieval by Fig. 4d. The differences between Figs. 4c and 4a, and Figs. 4d and 4a, show the impacts of these DA methods, respectively. Considering Fig. 4d, the low forecast temperature variability in the western part of the front (Fig. 4a, west of 71°W) is appropriately increased. The 70.9°W cold surface baroclinic-submesoscale eddy (Fig. 4a) is displaced to the west and reshaped, toward the true cold eddy (C_{2} in Fig. 4b). The easternmost intrusion (I_{2} in Fig. 4b) of cold shelf water (12.6°C ⩽ *T* ⩽ 13.6°C) is well estimated, but the subsequent intrusion downstream is overestimated (Fig. 4d). The position and variable tightness of the front (13.6°C ⩽ *T* ⩽ 15.5°C) are accurately corrected. In the eastern- and westernmost portions of the cyclic domain, the forecast front (Fig. 4a) was too wide, while in the center of the domain, its tight portion was misinclined. The rms error and pattern correlation coefficient confirm these qualitative arguments as will be shown later.

Considering Fig. 4c, the anisotropic OI scheme has improved the *T* forecast. The dominant scales of the OI wave packets agree with the true ocean. At observation points, the forecast tracers are replaced by their OA values (appendix). Yet, by definition, the OI weights interpolate the observations with spatially uniform scales, in this case as ellipsoidal structures. For instance, the double-lobe cold intrusion into the slope water (*I*_{1}, *I*_{2} in Fig. 4b) is almost estimated as a uniform cold pool by OI. The southernmost part of this true intrusion is a cold eddy (C_{1} in Fig. 4b, at 38.7°N–71.4°W), which has a surface radius of 3.5–4 km, smaller than the 10-km subsampling. Even though the corresponding ESSE eddy (Fig. 4d) is more accurate than the OI eddy (Fig. 4c), neither is a good estimate. The OI front is too wide at several locations and its position is not as good as the ESSE one. The OI weights do not address the nonhomogeneous, anisotropic, and nonuniform properties of the variations of frontal wave packets and cold eddies trapped in the surface (Fig. 4b). These properties are realization and time dependent. In the phase space, they are defined by the local shape of the variations of variability (Part I). Specifying a priori an OI scheme coping with such properties is difficult. In ESSE, the weights are naturally flow dependent. In fact, prestudies via ESSE could be employed to refine the classic operational schemes to application-specific, but flow-dependent, error models.

Figure 5 gives the same *T* panels as in Fig. 4 except that level 7 is considered. This level (20–35 m) is at the base of the stratification and is the depth at which the frontal wave packets of scales *L* ≥ *R*^{F}_{d}*R*^{S}_{d}_{1} in Fig. 5b) and the shelf water intrusion (I_{1} in Fig. 5b) is relatively well captured by ESSE even though it is at the limit of the observation array. The OI renders this interaction as the base of a dipole cold eddy (Fig. 5c). North of the front, the slight upwellings (*T* ⩽ 11°C) of deeper shelf water, parallel to bursts of slope water (*T* ∼ 12.5°C) into the shelf, are also better interpolated by ESSE (Fig. 5d) than by OI (Fig. 5c).

In conclusion, the OI improves the forecast features but some of the true physical characteristics and parameter ranges are modified (e.g., front position, eddy orientations, frontal width, and wave packet variability). ESSE conserves the statistics of the tracer and flow field variabilities while only correcting the most erroneous components of the forecast. An essential component of the ESSE scheme is thus its flow-dependent time-evolving multivariate error decomposition and error update (Part I). It is exemplified next.

#### 2) Primitive equation error subspace

The model is exact in this experiment (Fig. 3) and the day-18 error forecast is a pure predictability limit error (model error covariance **Q** = 0, measurement error covariance **R** = 0; see Part I, appendix A). At day 18, the a priori ES is the nonlinear extension of the day-18 optimal perturbation dominant spectrum. After that day, the stationary observation arrays reorder and modify this spectrum.

As explained in Part I (appendix B, section a), the forecast sample error fields are normalized by their volume and sample-averaged variance. For each PE variable, these norms *η* were for the day-18 forecast: *η*_{T} = 0.29°C, *η*_{S} = 0.086 psu, *η*_{û} = 1.04 cm s^{−1}, *η*_{υ̂}^{−1}, and *η*_{ψ} = 0.0168 Sv, respectively. These averaged variability amplitudes are characteristic of MAB shelfbreak front phenomena (Sloan 1996). As exemplified next, the normalization is numerically necessary but is also very useful physically. It determines the relative importance of variables in the error vectors. Figure 6a shows the history of the ES similarity coefficient *ρ* (Part I; appendix B, section b) during the parallel batches of perturbed forecasts for day 18. The chosen criterion limit of 97% was attained after 200 forecasts. Figure 6b shows the associated error covariance eigenvalue spectrum and Fig. 6c the cumulative spectrum. During the 39-day period, features and patterns of multiple scales unfold, grow, and dissipate, but the ES size using that same 97% convergence criterion remains within 190–250 forecasts. As the number of active scales increases, the number of energetic future scenarios tends to decrease, hence keeping a quasi-constant ES size. One reason for this is that the data array is statistically stationary: the same CTD pattern is assimilated every 3 days (section 4 and Fig. 2). The data force the ES dimension toward stationarity. Another consists of the evolution toward a quasi-turbulent regime, most likely associated with multiscale-attracting behaviors (Part I). As more scales become active, more of the dominant variability is locally constrained to specific directions of the state space. The ES then exploits this organized memory character of the nonlinear variability.
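The normalization, the SVD of the ensemble spread, and a convergence check can be sketched together. The code below is a hedged illustration only. The function names are hypothetical, the spread is scaled by the square root of the ensemble size, and the similarity measure is a variance-weighted comparison of two subspaces that is assumed here rather than copied from Part I, Eq. (B3). It equals one for identical subspaces and zero for orthogonal ones.

```python
import numpy as np

def error_subspace(forecasts, central, norms=None, rank=None):
    """Dominant error eigenvectors and eigenvalues from perturbed forecasts.

    forecasts: (n_state, n_members) array; central: (n_state,) forecast;
    norms: optional per-element normalization (e.g., one value per variable).
    """
    spread = forecasts - central[:, None]
    if norms is not None:
        spread = spread / norms[:, None]        # nondimensionalize variables
    q = forecasts.shape[1]
    U, s, _ = np.linalg.svd(spread / np.sqrt(q), full_matrices=False)
    rank = len(s) if rank is None else rank
    return U[:, :rank], s[:rank] ** 2           # E and error eigenvalues

def similarity(E_old, lam_old, E_new, lam_new):
    """Variance-weighted similarity between two error subspaces (assumed form)."""
    A = np.diag(np.sqrt(lam_old)) @ E_old.T @ E_new @ np.diag(np.sqrt(lam_new))
    return np.linalg.svd(A, compute_uv=False).sum() / lam_new.sum()

def converged(rho, limit=0.97):
    """Stop adding forecast batches once the similarity reaches the limit."""
    return rho >= limit

# Hypothetical ensemble: 30 state elements, 8 perturbed forecasts.
rng = np.random.default_rng(0)
ensemble = rng.standard_normal((30, 8))
E, lam = error_subspace(ensemble, ensemble.mean(axis=1), rank=5)
rho_same = similarity(E, lam, E, lam)
```

In practice the check would compare the subspace before and after each new batch of forecasts, and more members would be added while `converged` returns `False`.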

Figure 7 illustrates the first a priori error eigenvector, which interestingly is associated with submesoscale to mesoscale eddy fields and baroclinic frontal oscillations trapped in the 30-m surface stratification (Figs. 7a and 1a), and with bottom-trapped frontal wave patterns (Fig. 7b). The middepth fields of smaller nondimensional amplitudes are not shown; by day 18, the corresponding scales (*R*^{F}_{d}*T* and *S* fields are strongly coupled, with common spatial phase and similar amplitudes. They could compensate and keep the density field unperturbed. The surface and bottom *υ̂**û* field (Fig. 7b), and *ψ* field (Fig. 7a) indicate that the total velocity contribution of the bottom-trapped patterns is larger than that of the surface patterns, in proportion to the vertical extent of the two processes. Figure 8 shows the vertical dependence of the temperature component of this first error vector. The amplitude decreases by a factor of 3 below the stratification (e.g., level 12), and progressively increases again in the bottom 40 m. It also exemplifies a common property of the dominant eigenvectors: the most energetic variations of variability are located along the tilted front (Sloan 1996). Figures 7 and 8 demonstrate that the dominant error variance is not always within the largest scales considered. The regions and phenomena of highest uncertainty are here determined by the evolving dynamics and data (Part I; Tables 3 and 5, with **dw** = 0). In the present study, a coarse grid scheme (e.g., Fukumori and Malanotte-Rizzoli 1995) would not resolve the first error eigenvector on day 18.

The second a priori error eigenvector, which accounts for 7% of the total error variance, is shown in Fig. 9. This vector is located along the tilted front and is quasi-barotropic: the shapes vary with depth, but the scales, sign, and amplitude of the perturbations are quasi-uniform in the vertical, for all four PE volume variables. The scales, 10 km in the eastern region to 20 km in the center and western regions (periods of 60–120 km), are larger than in the first vector (Figs. 7, 8). The second vector explains mesoscale instabilities growing on the main tilted front, with scales related to *R*^{F}_{d}*T* field on level 7 indicates that surface perturbations can be forced at the bottom of the stratification by energetic patterns of scales ≥ *R*^{F}_{d}

Figure 10 illustrates the a posteriori reorganization of the ES at day 18 [the columns of **E**_{18}(+)]. The normalized temperature component of the first a posteriori error vector is shown in Fig. 10a, the second in Fig. 10b, both at levels 1, 7, 12, and 19. Comparing Fig. 10 with Figs. 8 and 9, the assimilation of the CTD array (Fig. 2) has modified the error structure and amplitude within the ES. The dominant patterns are now in the submesoscale to mesoscale fields (Fig. 10). The first temperature component (Fig. 10a) corresponds to submesoscale to mesoscale oscillations of the surface front and to submesoscale shelfbreak front patterns. The second component (Fig. 10b) explains a submesoscale eddy field within the surface stratification, again with shelfbreak front patterns below. These vectors, columns of **E**_{18}(+), are combined next using (38) and (39c) of Part I to determine the a posteriori ensemble of fields used as ICs in the error forecast to day 21 (section 5).

Figure 11 is utilized to evaluate the ESSE error estimates with respect to the actual errors on day 18. For conciseness, temperature is the only variable shown, but the following facts apply to all PE variables. Figure 11a shows the real a priori error fields, at levels 1, 7, 12, and 19. These fields are differences between the true ocean and forecast temperature states on day 18 (e.g., Figs. 4a,b and 5a,b). Structure-wise, the similarity with the dominant eigenvectors of the forecast ES on day 18 (Figs. 8, 9) is striking, at all depths. The ESSE error variance forecast [diagonal of **P**^{p}_{18}(−),**P**^{p}_{18}(+),

### b. Day 18 to day 39

From day 18 to day 39, several perturbations of different scales develop and nonlinearly interact together and with the basic state. The local Rossby number increases (section 3) and the flow tends to a turbulent state. Such phenomena were analyzed in detail by Sloan (1996). Via ESSE, the nonlinear evolution of the growing/decaying wave patterns and scales is continuously organized according to variance. The smallest 5–10-km scales (*R*^{S}_{d}*R*^{F}_{d}

#### 1) Forecast and melded PE fields for day 33

Since the start of the experiment, five OI/ESSE assimilations have occurred (Fig. 3). Focusing on the locations at which new scales of variability have unfolded most during days 18 to 33, middepth levels are illustrated [the evolution on the levels of Figs. 4 and 5 is considered in section 6b(3)]. Figure 12a shows the level-12 temperature (50–90 m, Fig. 2b) of the simulated true ocean on day 33. Below the stratification, frontal wave patterns of 40–70-km periods have grown. These middepth patterns were not present on day 18; their length scale is of *R*^{F}_{d}*T* ∼ 10°C), separated by two shelf water extrusions. A subsurface lens of slope water is detaching east of 71°W (Fig. 12a). Similar phenomena have been observed in the real ocean (e.g., Sloan 1996). Note that on day 33, most of the subsurface lenses are warm, developing in the cold side, which is another example of the frontal variability asymmetries of the present feature model [sections 2 and 6a(1)]. On the cold side, parcels that appear to be detached are often just meanders of the front, with a sheared vortex-tubelike structure. Figure 12b gives the OI forecast for day 33, issued after five OI assimilations from day 18 to day 30 (Fig. 3). Figure 12c shows the same, but for the ESSE. As will be confirmed by rms errors and pattern correlation coefficient (PCC) measures [section 6b(4)], the ESSE forecast is better than the OI one. Figures 12d and 12e show the *T* field on day 33 after OI and ESSE assimilation, respectively. The differences between these panels and the corresponding forecast panels (Figs. 12b,c) show the assimilation effects. Note again the strong correlation between the true ocean and ESSE fields: the three slope water intrusions with a detaching warm water lens around 39.15°N–70.9°W (*L*_{1} in Fig. 12a), and the tightness and meandering position of the front are better estimated by ESSE in Figs. 12c and 12e than by OI in Figs. 12b and 12d.

#### 2) Error subspace forecast for day 33

Figure 13 illustrates the dominant vectors of the ES forecast for day 33, which was used in the ESSE melding (Figs. 12c,e). Comparisons with Figs. 7–9 exemplify the nonlinear evolution of the ES. The dominant ES forecasts for day 18 and day 33 are quite different. Surface stratification and middepth levels, where the most changes have occurred, are shown. The bottom components on level 19 are similar to those of Figs. 7–9, but with more scales. The first error vector forecast for day 33 (Fig. 13a) accounts for uncertainties at the extremity of the main slope water intrusion, and hence for an eventual slope water detachment (Fig. 12). The second vector (Fig. 13b) is a perturbation of the meridionally elongated slope water intrusion at the edges of the cyclic domain. Several dominant error vectors always have a signature along these edges since the data resolution is lower there (12.5 km). The third vector (Fig. 13c) explains wave patterns and oscillations along the now-meandering shelfbreak front; these patterns are similar to those of the first vector for day 18 (Figs. 7 and 8). Yet they have distinct locations and vertical structures since they are associated with the nonlinearly evolved basic state of day 33 (Fig. 12), which is different from that of day 18 (Figs. 4 and 5). The statistics of the dominant error vectors are nonstationary. Finally, the fourth vector in Fig. 13d (de)couples the second and third vectors.

#### 3) Field forecast for day 39

After the seventh assimilation on day 36, the OI and ESSE schemes each issue a forecast for day 39. These forecasts are compared in Figs. 14–17. To verify the OI and ESSE total velocity forecasts, Fig. 14 shows the barotropic transport streamfunction *ψ* overlaid with the surface velocity vectors. The ESSE (Fig. 14c) immediately corrects total velocities in accord with the multivariate ES forecast and CTD-forecast residual profiles. The present OI (Fig. 14b) assimilates velocities assumed in geostrophic balance with the analyzed tracers and waits for the PE to dynamically adjust its fields (section 5). As was argued in Part I, section 5b, the multivariate ESSE assimilation, in agreement with the evolving dynamics, yields a better velocity forecast, at all scales. Figures 15–17 compare the temperature forecasts at the surface (Fig. 15), at the base of the stratification (Fig. 16), and at middepth within the front (Fig. 17). In contrast with Figs. 4 and 5, the simulated true ocean is now quasi-turbulent (Figs. 15a, 16a, 17a). The surface eddies and submesoscale to mesoscale patterns (Fig. 15), the intense plumes and filaments with internal upwelling at the base of the stratification (Fig. 16), the collapsing by nonlinear mixing of the day-33 slope water intrusion (Fig. 12a) into a tight quasi-zonal shelfbreak front, and the slope water patches are all better forecast by ESSE (Figs. 15c, 16c, and 17c) than by OI (Figs. 15b, 16b, and 17b).

#### 4) Pattern correlation coefficient and root-mean-square-error evolutions

Denoting a field vector by *ψ*, the dimension of which varies with the spatial extension of the field (e.g., surface, cross section, or volume field), the field PCC and rmse are, respectively, defined by (1), where *ψ*^{b} denotes a background or climatological field vector, *ψ*^{t}_{k} the true field, *ψ̂*_{k} its estimate, and ‖·‖_{2} the vector ℓ_{2} norm. In the present experiment, the PCC and rmse for all PE fields should tend toward their ideal values, respectively, one and zero, if the assimilation of the subsampled tracer array suffices to control the predictability errors. Figures 18a and 18b compare the 39-day evolution of the PCC and rmse of the ESSE and a posteriori OI estimates for the zonal velocity volume field **u**. The background in (1) was set to the basic initial state zonal velocity. On day 18, the forecast and true ocean have a PCC for **u** of only 43% (Fig. 18a). On that day, the first array of CTD casts (Figs. 2, 3) is assimilated and the OI/ESSE curves start to differ. For the ESSE, both the forecast and after-melding values are given (jagged curve). Up to day 36, only the OI-melded values are shown. At day 39, both the OI and ESSE values are forecasts. From day 27 onward, the PCC for the melded ESSE **u** stays between 93% and 96%. This is close to the 97% limit chosen for the ES similarity coefficient *ρ* (Part I; appendix B, section b). The ESSE forecasts for these dates all have a PCC higher than 90%, except for day 33 (83%). The ESSE final forecast on day 39 has a PCC of 95%. From day 27 onward, the PCC for the melded OI **u** stays between 70% and 82%. The OI forecast on day 39 has a PCC of 70%. Similar comments can be made for the rmse curves. On average during the simulation, the melded ESSE **u** is measured by the PCC and rmse to be 40% better than the OI **u**. With the rmse measure, the ESSE forecast for day 39 is 54% better than the OI one; with the PCC measure, it is 34% better. For both the OI and ESSE, whether the subsequent PCC and rmse increase or decrease depends on the state. Even though the four realizations between day 27 and day 36 of the melded ESSE and OI PCCs fluctuate around 95% and 75%, respectively, a longer integration is needed to show that, within some range, the PCCs have in fact stabilized.
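As an illustration of the two skill metrics, the following is a minimal sketch only. It assumes the PCC is the normalized inner product of the background-removed estimate and truth, and that the rmse is the ℓ_{2} norm of their difference divided by the square root of the vector length; the exact normalization used in (1) may differ. The function names and the four-point sample field are hypothetical.

```python
import math

def pcc(estimate, truth, background):
    """Pattern correlation of estimate and truth after removing the background."""
    a = [e - b for e, b in zip(estimate, background)]
    c = [t - b for t, b in zip(truth, background)]
    dot = sum(x * y for x, y in zip(a, c))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_c = math.sqrt(sum(y * y for y in c))
    return dot / (norm_a * norm_c)

def rmse(estimate, truth):
    """Root-mean-square error (assumed normalization: divide by vector length)."""
    n = len(truth)
    return math.sqrt(sum((e - t) ** 2 for e, t in zip(estimate, truth)) / n)

# Hypothetical four-point field, for illustration only.
background = [0.0, 0.0, 0.0, 0.0]
truth = [1.0, 2.0, 3.0, 4.0]
estimate = [1.0, 2.0, 3.0, 5.0]

print("PCC :", pcc(estimate, truth, background))
print("rmse:", rmse(estimate, truth))
```

A perfect estimate gives a PCC of one and an rmse of zero, the ideal values quoted in the text.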

### c. Timings

The cost of ESSE is now briefly compared to that of other methods, using a benchmark of a 3-day forecast with one assimilation (Table 2). The size of the dynamical and measurement models employed in the present study, the associated assimilation parameters, and the elapsed time for the main computations involved are first stated. The state and data vector sizes are representative of a realistic at-sea experiment. With one UNIX Sparc 20 CPU, one forecast took 20 min, one melding 20 min, and the SVD of an error sample matrix of 210 members 30 min. The HOPS OI took 40 min altogether. The dominant orders of the numerical floating point operations involved in the Kalman filter (KF), direct Monte Carlo or ensemble Kalman filter (EnKF), present ESSE, HOPS OI, Kalman smoother (KS), representers, and adjoint methods (e.g., Robinson et al. 1998) are given. Elapsed times are then computed for each scheme.

For all methods, the cost driver in a state forecast is approximately *O*(*ns*), where *n* is the state vector size and *s* the number of time steps. For the present ESSE (Part I, section 6), the number of floating point operations decomposes into (a) the ensemble forecasts, *q* × *O*(*ns*); (b) rank-*p* SVD, *O*(*npq*); (c) state melding [Part I, Eqs. (17), (18)], *O*(*npm* + *m*^{3}/6); and (d) ES update [Part I, Eqs. (27), (28)], *O*(*mp*^{2} + *p*^{3} + *np*^{2}). The term *m*^{3}/6 accounts for the Choleski predecomposition of **R** since sequential processing of observations is used (Part I; appendix B, section e). For the EnKF (Evensen and van Leeuwen 1996), one obtains similarly (a) the ensemble forecast, *q* × *O*(*ns*); (b) ensemble gain, *O*(*nqm* + *m*^{3} + *nm*^{2}); and (c) *q* back-substitutions [Part I, Eq. (20a)], *O*(*nmq*). In the EnKF, the computation of the a posteriori ES covariance, carried out in ESSE, is not counted since it is not evaluated. In all terms above, *n* is commonly several orders of magnitude larger than *p*, *q*, and *m*; while *p* ⩽ *q*, *q* and *m* can be of similar magnitudes. The leading order is thus often that of the terms containing *n*. The error forecast hence commonly dominates, and it is the only cost considered in the other schemes of Table 2. For example, assuming that an adjoint method (or any other gradient descent) converges in *n*_{i} iterations, its dominant order is *n*_{i} × *O*(*ns*).
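The leading-order terms listed above can be tallied with a short sketch. It drops all constants hidden in the *O*(·) notation, so the totals are only relative indicators. The function names and the sample sizes are hypothetical and are not the values of Table 2.

```python
def esse_ops(n, s, q, p, m):
    """Leading-order operation count for one ESSE forecast/assimilation cycle."""
    ensemble = q * n * s                     # (a) q forecasts, each O(ns)
    svd = n * p * q                          # (b) rank-p SVD
    melding = n * p * m + m**3 / 6           # (c) state melding
    es_update = m * p**2 + p**3 + n * p**2   # (d) ES update
    return ensemble + svd + melding + es_update

def enkf_ops(n, s, q, m):
    """Leading-order operation count for one EnKF cycle."""
    ensemble = q * n * s                     # (a) ensemble forecast
    gain = n * q * m + m**3 + n * m**2       # (b) ensemble gain
    backsub = n * m * q                      # (c) q back-substitutions
    return ensemble + gain + backsub

def adjoint_ops(n, s, n_iter):
    """Dominant order for an adjoint method converging in n_iter iterations."""
    return n_iter * n * s

# Hypothetical sizes chosen only to illustrate n >> p, q, m.
n, s, q, p, m = 100_000, 1_000, 200, 100, 500
total = esse_ops(n, s, q, p, m)
print("ESSE share spent in ensemble forecasts:", q * n * s / total)
print("EnKF / ESSE ratio:", enkf_ops(n, s, q, m) / total)
print("adjoint (200 iterations) / ESSE ratio:", adjoint_ops(n, s, 200) / total)
```

With *n* much larger than *p*, *q*, and *m*, the ensemble forecast term dominates both ESSE and EnKF totals, which is the argument the text uses to retain only the forecast cost for the other schemes.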

The timing comparisons are divided into two benchmark groups. The first employs a single workstation, with four CPUs; it represents a common at-sea situation of one Harvard workstation assigned to ocean field estimation. The second states the elapsed time for the CPUs that were actually used in this study: they consisted of a parallel network of 15 slower CPUs. The adjoint method was assumed to require *n*_{i} = 200 iterations. In passing, the four filters can be adjusted to the three smoothers by multiplying their cost by 2. The classic full covariance methods (e.g., KF, KS) take more than a year for the 3-day benchmark. They cannot be used. The ESSE and EnKF methods require from 12.5 h to a bit more than a day. They can be used in real-time operations (e.g., Lermusiaux 1997). For this experiment, the adjoint method and direct representer techniques are more expensive: they need 2.8–19 days for the chosen benchmark. This confirms that the representers are most advantageous when the total number of measurements is low (smaller than the size of the ES) and when an a posteriori error estimate is not required (Robinson et al. 1998). Assuming one does not need an a posteriori error estimate, the cost of the representer method can be reduced by employing a preconditioned iteration technique (e.g., Bennett et al. 1996). Another direction that could resolve both the cost and a posteriori error issues is to combine the ES ideas with the representer approach. This was suggested in the smoothing techniques developed in Part I of this study.
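The way elapsed times scale with ensemble size, CPU count, and iteration count can be sketched as follows. This is a rough illustration, not a reproduction of Table 2: it assumes perfect parallel scaling, assumes every CPU runs one forecast in the 20 min quoted for one Sparc 20 CPU, and costs each adjoint iteration as a single forecast. The function names are hypothetical.

```python
import math

def ensemble_elapsed_minutes(n_members, minutes_per_forecast, n_cpus):
    """Elapsed time for n_members forecasts run in batches of n_cpus."""
    batches = math.ceil(n_members / n_cpus)
    return batches * minutes_per_forecast

def adjoint_elapsed_minutes(n_iterations, minutes_per_forecast):
    """Elapsed time for sequential iterations, each costed as one forecast."""
    return n_iterations * minutes_per_forecast

# 210 members and 20 min per forecast are taken from the text;
# the use of 4 equally fast CPUs is an assumption of this sketch.
ensemble_hours = ensemble_elapsed_minutes(210, 20, 4) / 60
adjoint_days = adjoint_elapsed_minutes(200, 20) / (60 * 24)
print(f"ensemble forecasts: {ensemble_hours:.1f} h")
print(f"adjoint, 200 iterations: {adjoint_days:.1f} days")
```

Under these assumptions the ensemble forecasts take about 17.7 h and the 200 adjoint iterations about 2.8 days, the same order as the ranges quoted in the text.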

## 7. Conclusions

ESSE was applied to an identical twin experiment for evaluating its capabilities in an ideal perfect model situation. The 39-day nonlinear evolution of the simulated true ocean, from a state in quasi-stationary thermal wind balance to a state in a quasi-turbulent regime was well captured. Truncating the error covariances to their converged most “energetic” low-dimension subspace did not lead to field divergence. The improvements over the OI scheme were demonstrated, both qualitatively and quantitatively.

The essential components of the present scheme, the flow-dependent, time-evolving ES and the multivariate minimum error variance assimilation in that subspace, were exemplified. In this study, the subspace where the most energetic multiscale errors occurred was successfully tracked and organized. The SVD of the normalized dominant error covariance estimate was shown to facilitate the physical understanding and study of the dominant errors. For instance, the a priori and a posteriori dominant ES estimates were compared and the data influence on the dominant error covariance was analyzed. The ESSE error covariance estimates were validated against the real errors. The cost of ESSE was contrasted with that of common DA methods. It was shown suitable for real-time applications at sea with today’s computers.

In general, depending on the ocean problem of interest, the present approach can quantitatively validate other reduced methods, that is, determine the state-space location of the dominant errors in specific conditions. Analyzing the results of ESSE simulations is also useful to refine the a priori assumptions made in approximate schemes, in this case the OI. Utilizing ESSE in observation system simulation experiments is promising. Such preexercise investigation of the dominant errors can tailor the OI weights to a specific situation. Combining OI and ESSE can yield efficient assimilation systems for rapid and sustained assessments of very large, complex ocean regions.

Several properties specific to the MAB simulation are exemplified. For instance, the statistics of dominant error vectors are observed to be nonstationary. These vectors are modified in harmony with the evolution of the nonlinear dynamics and measurements. They can be anisotropic and/or inhomogeneous, barotropic and/or baroclinic. In general, these properties vary with the initial conditions, internal dynamics, and data type and coverage. Finally, the formation of surface shelf water eddies, the subsurface lenses of slope water, the asymmetries in the frontal variability, and the dominant surface–bottom coupling mechanism in the present MAB simulation are analyzed via ESSE. This demonstrates that ESSE is a powerful tool, capable of continuously organizing the multivariate 3D variabilities in accord with their relative variance.

## Acknowledgments

I am especially thankful to Dr. N. Q. Sloan III, for providing me with the shelfbreak front feature model and for his interest and numerous helpful discussions. I am grateful to Professor A. R. Robinson for his insights and critical comments. I am very indebted to Professor Donald G. Anderson, Professor Andrew F. Bennett, Professor Roger W. Brockett, and Professor Brian F. Farrell, members of my dissertation committee, for their challenging encouragements. I also benefited greatly from several members of the Harvard oceanography group, past and present. I thank Mr. Michael Landes and Mr. Todd Alcock for helping in the preparation of some figures. I am grateful to two anonymous referees for their excellent reviews. This study was supported in part by the Office of Naval Research under Grant N00014-90-J-1612 to Harvard University.

## REFERENCES

Beardsley, R. C., and W. C. Boicourt, 1981: On estuarine and continental-shelf circulation in the Middle Atlantic Bight. *Evolution of Physical Oceanography: Scientific Surveys in Honor of Henry Stommel,* B. Warren and C. Wunsch, Eds., The MIT Press, 198–233.

Bennett, A. F., 1992: Inverse methods in physical oceanography. *Cambridge Monographs on Mechanics and Applied Mathematics,* Cambridge University Press, 346 pp.

——, B. S. Chua, and L. M. Leslie, 1996: Generalized inversion of a global numerical weather prediction model. *Meteor. Atmos. Phys.,* **60,** 165–178.

Bretherton, F. P., R. E. Davis, and C. B. Fandry, 1976: A technique for objective analysis and design of oceanographic experiments applied to MODE-73. *Deep Sea Res.,* **23,** 539–582.

Burgers, G., P. J. van Leeuwen, and G. Evensen, 1998: On the analysis scheme in the ensemble Kalman filter. *Mon. Wea. Rev.,* **126,** 1719–1724.

Carter, E. F., and A. R. Robinson, 1987: Analysis models for the estimation of oceanic fields. *J. Atmos. Oceanic Technol.,* **4,** 49–74.

Evensen, G., and P. J. van Leeuwen, 1996: Assimilation of Geosat altimeter data for the Agulhas Current using the ensemble Kalman filter with a quasigeostrophic model. *Mon. Wea. Rev.,* **124,** 85–96.

Farrell, B. F., and A. M. Moore, 1992: An adjoint method for obtaining the most rapidly growing perturbation to the oceanic flows. *J. Phys. Oceanogr.,* **22,** 338–349.

——, and P. J. Ioannou, 1996a: Generalized stability theory. Part I: Autonomous operators. *J. Atmos. Sci.,* **53,** 2025–2040.

——, and ——, 1996b: Generalized stability theory. Part II: Nonautonomous operators. *J. Atmos. Sci.,* **53,** 2041–2053.

Fukumori, I., and P. Malanotte-Rizzoli, 1995: An approximate Kalman filter for ocean data assimilation: An example with an idealized Gulf Stream model. *J. Geophys. Res.,* **100,** 6777–6793.

Garvine, R. W., K.-C. Wong, G. G. Gawarkiewicz, R. K. McCarthy, R. W. Houghton, and F. Aikman III, 1988: The morphology of shelfbreak eddies. *J. Geophys. Res.,* **93,** 15 593–15 607.

Lermusiaux, P. F. J., 1997: Error subspace data assimilation methods for ocean field estimation: Theory, validation and applications. Ph.D. thesis, Harvard University, Cambridge, MA, 402 pp.

——, and A. R. Robinson, 1999: Data assimilation via error subspace statistical estimation. Part I: Theory and schemes. *Mon. Wea. Rev.,* **127,** 1385–1407.

Lozano, C. J., A. R. Robinson, H. G. Arango, A. Gangopadhyay, N. Q. Sloan, P. J. Haley, and W. G. Leslie, 1996: An interdisciplinary ocean prediction system: Assimilation strategies and structured data models. *Modern Approaches to Data Assimilation in Ocean Modeling,* P. Malanotte-Rizzoli, Ed., Elsevier Oceanography Series, Elsevier Science, 413–452.

Palmer, T. N., 1993: Extended-range atmospheric prediction and the Lorenz model. *Bull. Amer. Meteor. Soc.,* **74,** 49–65.

Parrish, D. F., and S. E. Cohn, 1985: A Kalman filter for a two-dimensional shallow-water model: Formulation and preliminary experiments. Office Note 304, NOAA/NWS/NMC, 64 pp.

Pedlosky, J., 1987: *Geophysical Fluid Dynamics.* 2d ed. Springer-Verlag, 710 pp.

Robinson, A. R., 1996: Physical processes, field estimation and an approach to interdisciplinary ocean modeling. *Earth-Sci. Rev.,* **40,** 3–54.

——, H. G. Arango, A. Warn-Varnas, W. G. Leslie, A. J. Miller, P. J. Haley, and C. J. Lozano, 1996: Real-time regional forecasting. *Modern Approaches to Data Assimilation in Ocean Modeling,* P. Malanotte-Rizzoli, Ed., Elsevier Science, 455 pp.

——, P. F. J. Lermusiaux, and N. Q. Sloan III, 1998: Data assimilation. *The Sea: The Global Coastal Ocean I, Processes and Methods,* K. H. Brink and A. R. Robinson, Eds., Vol. 10, John Wiley and Sons, 541–594.

Shapiro, R., 1970: Smoothing, filtering, and boundary effects. *Rev. Geophys. Space Phys.,* **8** (2), 359–387.

——, 1971: The use of linear filtering as a parametrization of atmospheric diffusion. *J. Atmos. Sci.,* **28,** 523–531.

Sloan, N. Q., 1996: Dynamics of a shelf/slope front: Process studies and data-driven simulations. Ph.D. thesis, Harvard University, Cambridge, MA, 230 pp.

## APPENDIX

### HOPS Optimal Interpolation (OI)

The data–forecast melding step of the HOPS OI (e.g., Lozano et al. 1996) consists of a *two-scale objective analysis* (OA) of the observations, followed by a *blending of the forecast and OA fields.* For simplicity, the index *k* is omitted.

#### Objective analysis (OA)

Denoting by **x** the vector of model gridpoint locations, by **X** the vector of measurement locations, and by *ψ* the field of interest, the *OA estimate* of a univariate and statistically uniform field, using a notation similar to that of Bennett (1992), is defined by

$$
\hat{\psi}^{\mathrm{OA}} = \bar{\psi} + \mathbf{COR}(\mathbf{x}, \mathbf{X})\,[\mathbf{COR}(\mathbf{X}, \mathbf{X}) + \mathbf{R}]^{-1}\,[\mathbf{d} - \bar{\mathbf{d}}], \tag{A1a}
$$

with the associated error covariance **P** given by (A1b). The correlation matrix at data points is **COR**(**X**, **X**), and at grid–data points, **COR**(**x**, **X**). The vector **d** is the data vector, the matrix **R** the data error correlation matrix at **X**, and **d̄** the background at **X**. For horizontal OAs (A1a)–(A1b), the error fields are assumed to have zero vertical correlations.

With **x**_{i} and **X**_{j} denoting the horizontal locations of a grid and data point, respectively, the elements *i, j* of **COR**(**x**, **X**) are given by (A2), where **L**_{1} = diag(*l*^{x}_{1}, *l*^{y}_{1})^{2} contains the zonal and meridional zero crossing length scales and **L**_{2} = diag(*l*^{x}_{2}, *l*^{y}_{2})^{2} the *e*-folding decay scales. The scalar *τ* is the decorrelation timescale and Δ*t* the interval between the time of the observation and of the estimate. The data error correlation matrix at data points is chosen diagonal, with uniform nondimensional variance *ϵ*^{2}, hence

$$
\mathbf{R} = \epsilon^{2}\,\mathbf{I}. \tag{A3}
$$

The OA is carried out in two stages. In the first stage, the largest dynamical scales are gridded using their estimated *e*-folding spatial decays, zero crossings, and time decay. The background field for this first step is the data horizontal average. In the second stage, the dominant dynamics of interest (e.g., mesoscale) is gridded using its estimated space–time decays, the background being the first-stage OA. The main assumption made in this two-scale OA is that the errors in the largest (first-stage) and most energetic (second-stage) dynamical scales are statistically independent. In practice, this can smoothly meld different data types (e.g., synoptic observations and climatology).

The quasigeostrophic streamfunction *ψ*_{QG} (dynamic height), obtained by integration of the hydrostatic equation up and down from a chosen flat level of reference *z*_{ref} (A4), is computed for each *T, S* profile and objectively analyzed in two stages prior to assimilation. In (A4), *g* is gravity, the pressure *p* is scaled to a streamfunction via *ψ*_{QG} = *p*/(*fρ*_{0}), and *ρ* is the density obtained from the equation of state. The internal mode velocities and barotropic transport streamfunction are then computed assuming geostrophic balance and a rigid lid (Lermusiaux 1997).
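The OA estimate (A1a), with **R** = *ϵ*^{2}**I**, can be sketched in a few lines. This is a minimal illustration under stated assumptions: the background is taken as a single scalar (such as the data horizontal average), and the correlation function below is a hypothetical form chosen only to have a zero crossing set by the first pair of length scales and an *e*-folding decay set by the second, so it may differ from (A2). All function and variable names are hypothetical.

```python
import math

def correlation(a, b, l1, l2, dt=0.0, tau=1.0):
    """Hypothetical correlation: zero crossings l1=(lx1, ly1), decays l2=(lx2, ly2)."""
    dx, dy = a[0] - b[0], a[1] - b[1]
    r1 = (dx / l1[0]) ** 2 + (dy / l1[1]) ** 2
    r2 = (dx / l2[0]) ** 2 + (dy / l2[1]) ** 2
    return (1.0 - r1) * math.exp(-0.5 * r2) * math.exp(-((dt / tau) ** 2))

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x

def objective_analysis(grid, obs_xy, data, background, eps2, l1, l2):
    """Return background + COR(x,X) [COR(X,X) + eps2 I]^-1 (d - background)."""
    m = len(data)
    c_dd = [[correlation(obs_xy[i], obs_xy[j], l1, l2) + (eps2 if i == j else 0.0)
             for j in range(m)] for i in range(m)]
    weights = solve(c_dd, [d - background for d in data])
    return [background + sum(correlation(g, obs_xy[j], l1, l2) * weights[j]
                             for j in range(m)) for g in grid]

# Hypothetical data: three casts, background set to their average.
obs_xy = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
data = [12.0, 14.0, 13.0]
background = sum(data) / len(data)
grid = [(5.0, 5.0), (1000.0, 1000.0)]
print(objective_analysis(grid, obs_xy, data, background, 0.01, (40.0, 40.0), (20.0, 20.0)))
```

Far from all data the correlations vanish and the estimate relaxes to the background; with *ϵ*^{2} = 0 the estimate reproduces the data at the measurement locations.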

#### Blending

The *blending* of the forecast with the OA-gridded field (Fig. A1) is defined by

$$
\hat{\psi}(+) = \boldsymbol{\Lambda}\,\hat{\psi}^{\mathrm{OA}} + (\mathbf{I} - \boldsymbol{\Lambda})\,\hat{\psi}(-), \tag{A5}
$$

where **Λ** ∈ ℝ^{n×n}, with elements in [0, 1], contains the blending coefficients. These are defined in (A6) as a function of *ϵ*^{OA}_{i}, the OA error at grid point *i*, and of *ϵ*^{OA}_{max} and *ϵ*^{OA}_{min}. The factor *g*(*t* − *t*_{k}) (e.g., a negatively skewed Gaussian centered on *t*_{k}) then empirically reduces the weight of the OA fields as a function of the lag *t* − *t*_{k}. Equations (A1)–(A6) define the HOPS OI scheme.
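The blending step can be sketched pointwise. The sketch assumes a diagonal **Λ**, so each grid point has its own coefficient. The linear ramp used to turn an OA error into a coefficient is a hypothetical choice for illustration and is not claimed to be (A6); `time_weight` stands in for *g*(*t* − *t*_{k}).

```python
def blend(forecast, oa_field, coefficients):
    """Pointwise blend: lam * OA + (1 - lam) * forecast, assuming diagonal Lambda."""
    return [lam * oa + (1.0 - lam) * fc
            for fc, oa, lam in zip(forecast, oa_field, coefficients)]

def ramp_coefficient(err, err_min, err_max, time_weight=1.0):
    """Hypothetical coefficient: 1 at err_min, 0 at err_max, linear in between."""
    frac = (err_max - err) / (err_max - err_min)
    return time_weight * min(1.0, max(0.0, frac))

# Hypothetical three-point example.
forecast = [10.0, 10.0, 10.0]
oa_field = [12.0, 12.0, 12.0]
oa_error = [0.1, 0.5, 0.9]
lams = [ramp_coefficient(e, 0.1, 0.9) for e in oa_error]
print(blend(forecast, oa_field, lams))
```

Where the OA error is small the blended field follows the OA, and where it is large the forecast is kept, which is the behavior the blending coefficients are meant to provide.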

Table 1. Central run parameters.

Table 2. Timings (elapsed time, I/O included).