## 1. Introduction

Predictability is an active area of research in the study of geophysical fluids, not only for its scientific and technological challenges but also for its great applicability to practical problems. A notable example of how research developments in this area yield social and economic benefits is the field of numerical weather prediction (NWP), where modern skillful forecasts result from the combination of many observations from different in situ and remote sensing platforms with accurate, dynamically complex atmospheric models using advanced data assimilation methods. Thanks to this synergy of methodologies, it is now possible to predict the weather several days in advance with an acceptable degree of accuracy and to employ such predictions for a host of useful applications.

Prediction of the ocean circulation and its associated weather is no less interesting and challenging. The advent of satellite technologies that reveal the ocean mesoscale and the increasing deployment of in situ instruments, along with improvements in the realism of ocean model simulations, have made it possible to envision ocean weather prediction systems that can parallel their atmospheric counterparts in realism and societal relevance. Ocean data assimilation has matured to the point that model reanalyses and forecasts are produced routinely by several oceanographic centers. An excellent review of the current state of such assimilation systems within the framework of the Global Ocean Data Assimilation Experiment (GODAE) is presented by Cummings et al. (2009). Approaches to data assimilation vary widely both in terms of sophistication of the method and the observations assimilated. They include multivariate and ensemble optimal interpolation (Cummings 2005; Oke et al. 2008), the ensemble Kalman filter (Fukumori 2002; Evensen 2006), and multivariate three-dimensional variational data assimilation (3DVAR; Fujii and Kamachi 2003; Weaver et al. 2005, hereafter W05). For a review of the operational system in the Mediterranean Sea and its applications, the reader is referred to a special issue of *Ocean Science* (2010, Vol. 6; available online at http://www.ocean-sci.net/special_issue6.html). That system assimilates satellite and in situ data with an optimal interpolation scheme that uses empirical orthogonal functions (EOFs) to project the satellite data vertically (Dobricic et al. 2005). The main observations assimilated in all these systems are sea level anomalies from altimeter data; subsurface temperature and salinity data from Argo floats, coastal gliders, and moored and drifting buoys; expendable bathythermograph (XBT) temperature; conductivity–temperature–depth (CTD) data; in situ and satellite sea surface temperature (SST); and satellite sea ice concentration. Crucial to all these systems is the way the background error covariances are specified. These covariances determine how information is spread from the observations to the model grid points at all vertical levels. Gaussian or second-order autoregressive functions are used in many systems, with anisotropic and inhomogeneous horizontal Gaussian decorrelation scales (Fujii and Kamachi 2003) and vertically coupled temperature–salinity empirical orthogonal function modes (Dobricic et al. 2005). For a real-time application in the shelf and shelf break, the reader is referred to Lermusiaux et al. (2010).

Along these lines of progress, the Regional Ocean Modeling System (ROMS) has recently been augmented with a set of tools based on variational methods that use a tangent linear version of the nonlinear ROMS and its associated adjoint operator and can be used, for example, to elucidate the most relevant patterns and properties of the associated dynamical operator^{1} as described by the generalized stability theory (Farrell and Ioannou 1996a,b), for sensitivity studies (Moore et al. 2004, 2009), and for strong- and weak-constraint data assimilation (e.g., Di Lorenzo et al. 2007; Moore et al. 2011a,b,c).

In this work, we use the incremental, strong-constraint, four-dimensional variational data assimilation (IS4DVAR) driver of ROMS. Both 3DVAR and 4DVAR are used routinely in oceanography. Incremental 3DVAR and 4DVAR methodologies are used in the Océan Parallélisé (OPA) model as described by Weaver et al. (2003), and Stammer et al. (2002) have applied a 4DVAR approach using the Massachusetts Institute of Technology general circulation model (MITgcm) in the framework of the Estimating the Circulation and Climate of the Ocean (ECCO) project. A variant of 4DVAR based on the method of representers has also been applied to ocean models (Bennett et al. 2000; Kurapov et al. 2009).

IS4DVAR finds corrections or increments to the initial conditions of each analysis window in time, to yield the solution of the model equations that best fits, in the least squares sense, all observations available within a specified spatial domain for the chosen time interval. Because the model equations are satisfied exactly over the assimilation interval the model physics are said to enter as a strong constraint on the solution. IS4DVAR is a smoothing algorithm (the model trajectory tries to fit all the observations simultaneously over the given interval of time) and the optimal hindcast represents the best linear unbiased estimate (BLUE) of the circulation. It will therefore find the maximum of the joint-over-time probability density function (PDF). Strictly speaking, this does not correspond to the optimal state for forecast in a nonlinear system (which is the marginal PDF at the end of the interval). However, as long as the tangent linear assumption is valid, both PDFs will be reasonably similar and the final state at the end of the analysis interval can be used as an initial condition for a subsequent forecast.

In this work, we explore to what extent the ROMS IS4DVAR data assimilation algorithm can improve ocean state estimation and prediction in the East Australian Current (EAC). This southward-flowing western boundary current separates from the coast of Australia at around 32°S latitude, flowing eastward into the South Pacific and creating one of the most dynamically interesting regions of the World Ocean (Godfrey et al. 1980). Extending from the Coral Sea to the Tasman Sea, the EAC system generates strong mesoscale variability as it branches into the Tasman Front (Tilburg et al. 2001). Although most of the EAC flows eastward at this separation point, as much as ⅓ of the original EAC flows southward trapped against the western boundary (Ridgway and Godfrey 1997). Tilburg et al. (2001) found that a simple 1.5-layer linear model can produce the main features of the circulation, including the location of the EAC separation, which is associated with strong gradients of the zonally integrated wind stress curl field. The origin of the strong mesoscale variability seems to be associated with several factors that include nonlinear instabilities and upper-ocean–topographic coupling (Tilburg et al. 2001), oscillation and propagation of the Tasman Front, and formation of cold-core frontal eddies and baroclinic Rossby waves (Marchesiello and Middleton 2000). Bowen et al. (2005) characterized the spatial and temporal variability of the EAC using 6 yr of altimeter-derived geostrophic streamfunction and velocities derived from tracking thermal features in satellite imagery, revealing cyclones and anticyclones propagating southward and westward with periods between 90 and 180 days. They found no evidence of mesoscale signals propagating westward from the South Pacific basin, consistent with the notion that mesoscale variability is generated by intrinsic instabilities. An alternative view is offered by Mata et al. (2006), who, also using altimeter data and an ocean model, suggest that the formation of large separation eddies is preceded by the southward propagation of sea level anomalies along the continental slope that grow via barotropic and baroclinic instability.

The EAC is an attractive region for evaluating the ROMS strong-constraint data assimilation scheme because it is characterized by strong mesoscale variability (Stammer 1997; Mata et al. 2000) that seems to be generated principally by internal instabilities (Bowen et al. 2005), making the area an ideal scenario for the study of an assimilation system where the control variables are the initial conditions. We will show that the sea surface height (SSH) and SST estimates provided by satellites, if correctly incorporated into the assimilation system, can be used to provide skillful forecasts of the three-dimensional circulation up to 2 weeks into the future using ROMS. To provide a robust evaluation of the assimilation system, we study 2 yr of mesoscale ocean circulation in the EAC (section 2). The observations used for assimilation and validation during the 2-yr period are described in section 3, and the IS4DVAR assimilation algorithm in ROMS is summarized in section 4. The assimilation and forecast experiments for the EAC are described in section 5 and discussed and summarized in section 6.

## 2. The EAC ROMS application

The numerical ocean model used here is ROMS (http://www.myroms.org), a three-dimensional numerical ocean circulation model developed principally for regional applications (Haidvogel et al. 2000, 2008). It solves the Reynolds-averaged Navier–Stokes equations using the hydrostatic and Boussinesq approximations. Momentum, scalar advection, and diffusive processes are represented using transport equations, whereas the vertical velocity is diagnosed using volume conservation. Shchepetkin and McWilliams (2003, 2005) describe in detail the algorithms that comprise the ROMS computational kernel.

The model domain extends from the Australian coast to 162°E and from 26° to 46°S (Fig. 2). The spatial resolution used in this study was ¼° (18–30 km) in the horizontal and 30 levels in terrain-following coordinates in the vertical, with increased resolution toward the surface (average resolution of 1.7 m). We consider 2 yr of simulation–assimilation (2001 and 2002) in which ROMS was forced at the boundaries with output from the data-assimilative U.S. Navy Coastal Ocean Model (NCOM). At the boundaries, temperature and salinity were clamped to the NCOM solution, and the sea level and depth-integrated velocities were imposed following the Chapman (1985) and Flather (1976) formulations, respectively, which permit surface gravity waves to radiate out of the model domain. Air–sea heat and momentum fluxes were prescribed from daily averaged U.S. Navy Operational Global Atmospheric Prediction System (NOGAPS) analyses (Rosmond 1992), and the vertical turbulent mixing closure was the K-profile parameterization scheme (Large et al. 1994). No tidal forcing was imposed.

A higher-resolution version (⅛°) of this ROMS application was used by Wilkin and Zhang (2007, hereafter WZ07) to describe the evolution of the dominant modes of mesoscale variability in the EAC retroflection area. The model reproduced the main characteristics of the mean circulation when compared with hydrography and long-term current meter mooring deployments. Although the mesoscale variability did not match the temporal evolution of the observations, they found that their model reproduced the spatial patterns and spectral characteristics of the dominant modes of mesoscale variance. Because of the high computational cost of variational data assimilation systems, in this work we use a lower-resolution version of this model and concentrate on fitting the temporal evolution of the mesoscale variability via IS4DVAR to evaluate the effect of the assimilation of satellite data on prediction skill.

To test the stability of our model and its water mass preservation, we ran the model for 30 yr, cycling the surface forcing and open boundary conditions for 2001 and 2002. We computed the mean temperature–salinity (*T*–*S*) diagrams for the same areas reported by Tomczak (1981) using hydrographic data, and we found that the forward model reproduces and preserves the overall characteristics of the observed water masses during the 30 yr of integration. As an example Fig. 1a shows the mean *T*–*S* diagram at the point of maximum mesoscale variability in the model for the last 2 decades of integration. Even though the boundary and surface forcing is repeated every 2 yr during this long integration, the time evolution of the mesoscale variability was different every year but the statistics were stationary, as reflected by the level of variance and structure of the vertical modes from one decade to the other. Figure 1b shows the time evolution of the SSH in a grid point of maximum mesoscale variability during the last 2 decades. The level of variance is the same, but the correlation between the two time series (−0.15) is statistically not different from zero (the average number of eddies per year is 2.5). The vertical EOFs of salinity and temperature for the mesoscale variability do not change from one decade to the next (Figs. 1c,d).

As reported by WZ07, the dominant mode of mesoscale variability, which is represented well in the model, originates at the boundary current separation (between 32° and 35°S) and exhibits an eddy-like structure with a southwestward propagation along the coast. Although this phenomenology is realistic, the temporal evolution of the model mesoscale variability does not correspond very well to the observed variability on an eventwise basis when the model is forced by the best estimate of surface and boundary forcing for the 2 yr of simulation. To illustrate this point, Fig. 2c shows the temporal correlation of the 7-day running mean of the model SSH anomalies with the 7-day composite of Archiving, Validation, and Interpretation of Satellite Oceanographic data (AVISO) altimeter reanalysis during the 2-yr period considered. High correlation (around 0.8) is observed at the boundaries, but this decreases toward the interior, notably over the area of maximum observed mesoscale variability (Fig. 2a). This is consistent with the finding of WZ07, where a similar run forced at the boundaries with climatology produced reasonable spatial patterns of mesoscale variability, suggesting that internal instabilities dominate the generation of mesoscale variance in the area. The decrease of correlation toward the interior in Fig. 2 therefore likely arises because of internal instabilities of the nonlinear ocean dynamics and not by contamination from the boundary forcing formulation. If this is the case, then assimilation methodologies where the control variable is the initial condition (e.g., IS4DVAR and the weak-constraint data assimilation) should be valuable in improving eventwise agreement in the evolution of modeled mesoscale features.
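The eventwise comparison described above (a 7-day running mean of model SSH anomalies correlated in time with a gridded analysis at every grid point) can be sketched as follows. This is only an illustration on synthetic data: the function names `running_mean` and `correlation_map`, the array layout `(time, y, x)`, and the toy fields are assumptions, not the processing actually used for Fig. 2c.

```python
import numpy as np

def running_mean(x, window=7):
    """Centered running mean along the time axis (axis 0).

    Time steps without a full window are set to NaN. `window` must be odd.
    """
    assert window % 2 == 1, "use an odd window so the mean is centered"
    x = np.asarray(x, dtype=float)
    half = window // 2
    zero = np.zeros((1,) + x.shape[1:])
    csum = np.cumsum(np.concatenate([zero, x], axis=0), axis=0)
    out = np.full(x.shape, np.nan)
    out[half:x.shape[0] - half] = (csum[window:] - csum[:-window]) / window
    return out

def correlation_map(model, obs):
    """Temporal correlation at every grid point of two (time, y, x) arrays."""
    valid = np.isfinite(model) & np.isfinite(obs)   # use common samples only
    m = np.where(valid, model, np.nan)
    o = np.where(valid, obs, np.nan)
    m = m - np.nanmean(m, axis=0)
    o = o - np.nanmean(o, axis=0)
    cov = np.nansum(m * o, axis=0)
    return cov / np.sqrt(np.nansum(m**2, axis=0) * np.nansum(o**2, axis=0))

# Synthetic daily SSH anomalies on a tiny 2 x 2 grid (hypothetical data).
days = np.arange(120.0)
obs_ssh = np.sin(2 * np.pi * days / 60.0)[:, None, None] * np.ones((1, 2, 2))
model_ssh = obs_ssh.copy()
model_ssh[:, 1, 1] *= -1.0   # one point where the model eddy is out of phase

r = correlation_map(running_mean(model_ssh), obs_ssh)
print(np.round(r, 2))
```

In this toy case the correlation is high where the model and observed signals are in phase and negative at the point where the modeled feature is out of phase, which is the kind of eventwise mismatch the correlation map is meant to expose.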

Fig. 2. Distribution of SSH mesoscale variability in the EAC during 2001 and 2002 from (a) the AVISO analysis and (b) the forward ROMS application. (c) The spatial distribution of correlation between the EAC ROMS SSH hindcast and the AVISO SSH analysis and (d) the RMS of the difference between the two during the 2-yr period are also presented.

Citation: Journal of Physical Oceanography 42, 9; 10.1175/JPO-D-11-0168.1


## 3. Observations

### a. SSH anomalies

The SSH anomaly observations assimilated were obtained from the AVISO analysis of multiple altimeter satellite missions, which maps along-track SSH data to a regular grid. The anomaly is computed with respect to a 7-yr mean, and up to four satellites are considered at a given time. Merging missions improves the sampling and the determination of long-wavelength errors, but the quality of the series is not homogeneous. The use of several satellites allows for crossover minimization of the orbital error (Le Traon and Ogor 1998) and at the same time significantly improves the estimation of mesoscale variability, with the best resolution obtained when four satellites are used (Pascual et al. 2006). The gridded data are produced as 7-day composites using an optimal interpolation scheme based on correlation functions. Details about the corrections applied to the raw altimeter data and about the projection onto a regular grid can be found in AVISO (1996). For our assimilation, the ¼° × ¼° AVISO mapped SSH anomaly was linearly interpolated onto the model grid points and the mean SSH from the 2-yr model run was added. This is a reasonable procedure because it is not our intent to forecast long-term, large-scale changes in the SSH.
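The last preprocessing step above (linear interpolation of the gridded anomaly onto the model grid, then adding the model mean SSH) might look like the following sketch. It assumes, for simplicity, that both grids are rectilinear in longitude and latitude; the grid definitions, the synthetic anomaly field, and the constant stand-in for the 2-yr model mean are all hypothetical.

```python
import numpy as np

def bilinear_to_grid(src_lon, src_lat, field, dst_lon, dst_lat):
    """Bilinear interpolation of field[lat, lon] onto a rectilinear target grid.

    Coordinates must be increasing. Values outside the source range are
    clamped to the edge values by np.interp.
    """
    # Interpolate along longitude for every source latitude row...
    tmp = np.array([np.interp(dst_lon, src_lon, row) for row in field])
    # ...then along latitude for every target longitude column.
    return np.array([np.interp(dst_lat, src_lat, col) for col in tmp.T]).T

# Hypothetical 1/4-degree source grid covering the model domain.
src_lon = np.arange(150.0, 162.25, 0.25)
src_lat = np.arange(-46.0, -25.75, 0.25)
lon2d, lat2d = np.meshgrid(src_lon, src_lat)
ssh_anom = 0.01 * (lon2d - 156.0) + 0.02 * (lat2d + 36.0)   # synthetic anomaly (m)

# Hypothetical model grid, offset from the source grid.
mod_lon = np.arange(150.1, 161.9, 0.2)
mod_lat = np.arange(-45.9, -26.1, 0.2)
model_mean_ssh = np.full((mod_lat.size, mod_lon.size), 0.35)  # stand-in for the 2-yr model mean (m)

# Observation presented to the assimilation: anomaly plus model mean SSH.
ssh_obs = bilinear_to_grid(src_lon, src_lat, ssh_anom, mod_lon, mod_lat) + model_mean_ssh
print(ssh_obs.shape)
```

Adding the model's own mean, rather than an observed mean dynamic topography, keeps the assimilated SSH consistent with the model's time-mean circulation, in line with the stated intent not to correct long-term, large-scale SSH.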

In a more sophisticated system, one would assimilate along-track SSH anomalies and let the system act as a dynamical interpolator in space and time. However, an analysis of the forward run revealed that for the EAC the SSH variability is quite isotropic, except very close to the coast and shelf break. This is consistent with an analysis of along-track altimeter data in this region (Wilkin et al. 2002) and the AVISO optimal interpolation methodology. It will be shown in section 5 that our assimilation system performs successfully using gridded spatial maps of SSH, suggesting that using this gridded product is an acceptable choice for formulating a prediction system for the area. The extent to which the system performance might further improve using along-track data directly is the subject of ongoing work.

### b. SST

For SST, we used 1-day composites of the Advanced Very High Resolution Radiometer (AVHRR). We applied a spatial median filter of ¼° in the zonal and meridional directions to map the data to the model grid, which had the effect of removing some small cloud gaps. Regions of extensive cloud simply leave gaps in the assimilation data where SST does not contribute to the IS4DVAR cost function. We preferred to use 1-day composites for SST because the 4DVAR cost function explicitly considers the time at which the observations were made during the analysis interval. This preserves the time variability resolved by the data; that is, it makes no assumptions about the temporal decorrelation time scale of SST, as is done in the case of the AVISO gridded SSH data. The time scale of SST variability in the EAC region is quite short, on the order of 12 days (Walker and Wilkin 1998), so it provides valuable extra information about near-surface changes in the dynamics at shorter space–time scales than the SSH (e.g., submesoscale surface open-ocean currents). Figure 3d shows the percentage of coverage of ocean points in the daily SST maps after applying the spatial median filter for the 2 yr considered. On average, 60% of the ocean points are observed by the satellite on any given day, with a few days with coverage as low as 10%. In the examples in Fig. 4, we show how the assimilation analysis effectively de-clouds the SST while preserving day-to-day variations present in the AVHRR signal.
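A minimal sketch of the median-filter mapping and of the coverage statistic reported in Fig. 3d is given below. It assumes that cloudy pixels are stored as NaN and that each model cell spans an integer number of fine satellite pixels (`factor`); the function name `block_median` and the tiny synthetic field are illustrative only.

```python
import warnings
import numpy as np

def block_median(sst, factor):
    """Median of each factor x factor block of pixels, ignoring NaN (cloud)."""
    ny, nx = sst.shape
    ny_c, nx_c = ny // factor, nx // factor
    blocks = sst[:ny_c * factor, :nx_c * factor].reshape(ny_c, factor, nx_c, factor)
    blocks = blocks.transpose(0, 2, 1, 3).reshape(ny_c, nx_c, factor * factor)
    with warnings.catch_warnings():
        # Fully cloudy blocks raise an "All-NaN slice" warning and return NaN.
        warnings.simplefilter("ignore", RuntimeWarning)
        return np.nanmedian(blocks, axis=-1)

# Synthetic 8 x 8 fine-resolution SST field (hypothetical values).
sst = np.arange(64.0).reshape(8, 8)
sst[0:4, 0:4] = np.nan      # extensive cloud: a whole coarse cell is missing
sst[5, 6] = np.nan          # small cloud gap inside an otherwise clear cell

coarse = block_median(sst, factor=4)          # 2 x 2 "model grid"
ocean = np.ones(coarse.shape, dtype=bool)     # assume every coarse cell is ocean
coverage = 100.0 * np.isfinite(coarse[ocean]).mean()
print(coarse)
print(f"coverage: {coverage:.0f}% of ocean points")
```

The small gap is filled by the median of the surrounding valid pixels, whereas the fully cloudy cell stays missing and would simply not contribute to the cost function.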

Fig. 3. Temporal distribution of the observations used in this study during (a) 2001 and (b) 2002. The thin gray lines in (a) and (b) indicate the edges of each 7-day AW. (c) The spatial distribution of AVISO SSH RMS (shading) and XBT transects (white dots). (d) The distribution of spatial coverage of the 1-day AVHRR SST composites during the same 2-yr period.


Fig. 4. A typical example of the ROMS fit using IS4DVAR after assimilation of AVISO SSH (contours) and AVHRR SST (color). (a)–(c) The observations on three different days over the 7-day AW. AVISO SSH observations are provided (a) at the beginning and (c) at the end of the AW, whereas SST (with some gaps due to clouds and some seemingly erroneous data where cloud-contaminated pixels are misidentified as ocean) is available every day. (d)–(f) The model counterpart (analysis) at the observation times.


### c. XBT

The only in situ observations considered in this study were temperature profiles from high-resolution XBTs. The XBT transects considered come from the Scripps Institution of Oceanography Ship of Opportunity (SOOP) XBT network site (line PX34 section from Sydney, Australia, to Wellington, New Zealand), supplemented by XBT data from some additional research cruises kindly provided by Australia’s Commonwealth Scientific and Industrial Research Organisation (CSIRO). The position of all the XBT profiles is shown in Fig. 3c (white dots), and the time when the observations were taken is shown in Figs. 3a,b. The transects are well distributed in time, with an average time separation of 1–3 months, and they sample the area of maximum observed mesoscale variability, as evidenced by the AVISO RMS in Fig. 3c (shading). The vertical resolution of the XBT data used was 10 m.

### d. SynCTD

We will show that a pseudo-observation, which we refer to as synthetic CTD (SynCTD), is useful for better projecting the satellite information to the subsurface. The SynCTD exploits the observed tight empirical relationship between SSH and SST anomalies and contemporaneous in situ profiles of temperature and salinity anomalies. Historical Argo profiles and SSH and SST fields were used as a training set to derive a quantitative empirical vertical relationship between surface properties and subsurface *T* and *S*. Two methods are used, both of which depend heavily on the CSIRO Atlas of Regional Seas (CARS), an accurate high-resolution temperature and salinity climatology (Ridgway et al. 2002; Dunn and Ridgway 2002). North of 50°S, a set of multiple linear regression relations is obtained between the surface properties and the temperature and salinity at each standard depth level. A space–time version of locally weighted least squares or “loess mapping” is used to generate a smooth 2D field of regression coefficients at each level from the nonuniform spatial distribution of ocean casts. The data are smoothed in space by projecting onto spatial quadratic functions and are simultaneously fitted by annual and semiannual harmonic components. Fitting the spatial and temporal components in a single step minimizes the temporal bias in the mean (Ridgway et al. 2002).
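The regression step at the core of this approach can be illustrated at a single location with ordinary least squares. The sketch below is a simplification under stated assumptions: it omits the loess mapping and the harmonic fit, uses a synthetic noise-free training set, and the names `fit_surface_to_depth`, `synthetic_profile`, `true_coef`, and the climatological profile values are all hypothetical.

```python
import numpy as np

def fit_surface_to_depth(ssh_anom, sst_anom, t_anom):
    """Least-squares coefficients mapping (SSH', SST') to T' at each depth.

    t_anom has shape (n_casts, n_depth); the result has shape (2, n_depth).
    No intercept is fitted because all inputs are anomalies.
    """
    X = np.column_stack([ssh_anom, sst_anom])
    coef, *_ = np.linalg.lstsq(X, t_anom, rcond=None)
    return coef

def synthetic_profile(coef, ssh_anom, sst_anom, clim_profile):
    """Climatology plus the regressed anomaly for one surface observation."""
    return clim_profile + ssh_anom * coef[0] + sst_anom * coef[1]

# Synthetic training set standing in for historical casts at one location.
rng = np.random.default_rng(0)
n_casts = 200
depths = np.array([0.0, 100.0, 500.0, 1000.0])            # standard levels (m)
true_coef = np.vstack([8.0 * np.exp(-depths / 800.0),     # degC per m of SSH anomaly
                       np.exp(-depths / 50.0)])           # degC per degC of SST anomaly
ssh = 0.2 * rng.standard_normal(n_casts)
sst = 1.0 * rng.standard_normal(n_casts)
t_anom = ssh[:, None] * true_coef[0] + sst[:, None] * true_coef[1]

coef = fit_surface_to_depth(ssh, sst, t_anom)
clim = np.array([22.0, 19.0, 9.0, 4.5])                   # hypothetical climatology (degC)
profile = synthetic_profile(coef, ssh_anom=0.3, sst_anom=1.0, clim_profile=clim)
print(np.round(profile, 2))
```

In this toy setup the SST anomaly mainly controls the near-surface levels while the SSH anomaly carries the signal to depth, which is the qualitative behavior the synthetic profiles rely on.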

The relationships are derived as anomalies with respect to the seasonal CARS climatology and the SSH and SST anomalies. South of about 45°S, a variant of the gravest empirical mode (GEM) method is applied (Sun and Watts 2001); this area, however, is outside of our model domain. A similar approach is that of feature-oriented strategies (Gangopadhyay and Robinson 2002), in which the interface region of two water masses can be modeled by melding available synoptic datasets (SSH and SST in this case) with parametric vertical functions for the hydrographic fields, with the function parameters derived from historical in situ data.

The SynCTD system has been run in hindcast mode for the period 1993–2004 using delayed mode gridded altimetry fields and composite satellite SST, and the final product merges Argo profiles with the synthetic fields at each depth using optimal interpolation into a regular grid with a spatial resolution of

## 4. IS4DVAR

This section briefly summarizes the general aspects of IS4DVAR and its implementation in ROMS. The variational data assimilation procedure operates by finding the minimum of a cost function representing the misfit between the model solution and observations subject to the condition that the solution obtained must satisfy exactly the equations defining the numerical model.

Let $\mathbf{x}(t) = \{\mathbf{u}, \mathbf{v}, \mathbf{T}, \mathbf{S}, \boldsymbol{\zeta}\}$ be the column state vector of ROMS composed of the concatenation of velocities ($\mathbf{u}$, $\mathbf{v}$), temperature ($\mathbf{T}$), salinity ($\mathbf{S}$), and SSH ($\boldsymbol{\zeta}$) vectors at all model grid points at a given time. We wish to combine the state vector $\mathbf{x}$ with all the observations available in a given period of time $t \in (t_{i-1}, t_i)$ to produce an analysis $\mathbf{x}^a(t)$ given a prior or background estimate $\mathbf{x}^b(t)$ of $\mathbf{x}$. In the experiments here $\mathbf{x}^b$ is obtained from a forward ROMS run initialized from a previous analysis. Because $\mathbf{x}^b(t)$ will be close in some sense to the optimum state we are trying to estimate, it is convenient to formulate the problem in terms of increments or perturbations $\delta\mathbf{x}$ to the background field, $\mathbf{x}^a = \mathbf{x}^b + \delta\mathbf{x}$.

The time evolution of the state vector is given by $\mathbf{x}(t) = M(t, t_{i-1})[\mathbf{x}(t_{i-1})]$, where $M$ is the nonlinear propagator that advances the initial condition $\mathbf{x}(t_{i-1})$ in time. A first-order expansion of the analysis $\mathbf{x}^a(t) = M(t, t_{i-1})[\mathbf{x}^b(t_{i-1}) + \delta\mathbf{x}(t_{i-1})]$ is given by $\mathbf{x}^a(t) = \mathbf{x}^b(t) + \mathbf{M}(t, t_{i-1})\,\delta\mathbf{x}(t_{i-1})$, where $\mathbf{M}$ is the linearization of $M$ about the nonlinear trajectory $\mathbf{x}^b(t)$ or the tangent linear model of ROMS. Therefore, $\delta\mathbf{x}(t)$ is uniquely determined by $\delta\mathbf{x}(t_{i-1})$. The incremental formulation of Courtier et al. (1994) minimizes the quadratic cost function,

$$
J = J_b + J_o = \frac{1}{2}\,\delta\mathbf{x}(t_{i-1})^{\mathrm{T}}\,\mathbf{B}^{-1}\,\delta\mathbf{x}(t_{i-1}) + \frac{1}{2}\sum_{i=1}^{n}\left[\mathbf{H}_i\,\delta\mathbf{x}(t_i) - \mathbf{d}_i\right]^{\mathrm{T}}\mathbf{O}_i^{-1}\left[\mathbf{H}_i\,\delta\mathbf{x}(t_i) - \mathbf{d}_i\right], \qquad (1)
$$

where $n$ is the number of observations, and the information provided by the observations $\mathbf{y}_i$ at time $t_i$ is introduced through an innovation vector $\mathbf{d}_i = \mathbf{y}_i - H_i[\mathbf{x}^b(t_i)]$. Here $H_i$ is a (possibly nonlinear) observation operator and $\mathbf{H}_i$ is its linearization. In this work, $H$ is simply a spatial interpolation because the observed variables are elements of the state vector. However, $H$ could represent a more complicated formulation, such as that extracting the projection of model surface velocities along the radial paths observed by high-frequency coastal radars or the transfer function between an observable biological variable and the state vector. The first term $J_b$ in (1) measures the total variation of the initial condition, which is weighted in the dot product by the inverse of the background error covariance matrix $\mathbf{B}$. The second term $J_o$ in (1) measures the difference between the observations and model predictions in its incremental form, which is weighted by the inverse of the observation error covariance $\mathbf{O}_i$. This matrix is assumed here to be diagonal (i.e., observation errors are assumed to be spatially uncorrelated) and time independent, with a nominal error of 3 cm for the SSH and 1°C for the SST.

The numerical model is assumed to be adequate to describe the observed ocean circulation physics given suitable initial conditions. This is often referred to as strong-constraint data assimilation (Sasaki 1970) because it does not formally acknowledge inadequacies in the model (which would be a weak-constraint formulation). In practice, IS4DVAR accommodates shortcomings in the modeled dynamics through the relative magnitude of the background and observational error covariances. Imposing these strong constraints on the minimization using undetermined Lagrange multipliers results in an additional set of equations governing the multipliers (the adjoint variables) that must be solved simultaneously with the model equations. This new set of equations is the discrete adjoint of the numerical model (Le Dimet and Talagrand 1986; Talagrand and Courtier 1987). In practice, this coupled set of equations is solved in an iterative procedure that exploits the fact that the solution of the adjoint model at time $t_{i-1}$ gives the gradient of $J_o$ with respect to changes in the initial conditions when forced by the weighted misfit $\mathbf{H}_i^{\mathrm{T}}\mathbf{O}_i^{-1}[\mathbf{H}_i\,\delta\mathbf{x}(t_i) - \mathbf{d}_i]$. This gradient is then used to update $\delta\mathbf{x}$ in the direction that minimizes the total cost function $J$.
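To make the role of the adjoint concrete, the following sketch evaluates a quadratic cost function of the kind described above (a background term plus a sum of weighted observation misfits) for a small linear toy system and checks the adjoint-derived gradient against finite differences. Everything here is an assumption made for illustration: the 4-variable propagator `M`, the selection operator `H`, the error variances, and the random innovations bear no relation to ROMS itself.

```python
import numpy as np

rng = np.random.default_rng(0)
n, n_obs, n_steps = 4, 2, 5
M = np.eye(n) + 0.1 * rng.standard_normal((n, n))  # toy one-step tangent linear propagator
H = np.eye(n)[:n_obs]                              # observe the first two state elements
B_inv = np.eye(n) / 0.5**2                         # inverse background error covariance
O_inv = np.eye(n_obs) / 0.1**2                     # inverse observation error covariance
d = rng.standard_normal((n_steps, n_obs))          # innovation vectors, one per time step

def cost(dx0):
    """Background term plus weighted misfits accumulated along the trajectory."""
    J = 0.5 * dx0 @ B_inv @ dx0
    dx = dx0.copy()
    for k in range(n_steps):
        dx = M @ dx                                # tangent linear step
        r = H @ dx - d[k]
        J += 0.5 * r @ O_inv @ r
    return J

def gradient(dx0):
    """Gradient of the cost with respect to the initial increment."""
    dx, forcing = dx0.copy(), []
    for k in range(n_steps):                       # forward tangent linear sweep
        dx = M @ dx
        forcing.append(H.T @ O_inv @ (H @ dx - d[k]))
    lam = np.zeros(n)
    for k in reversed(range(n_steps)):             # backward adjoint sweep
        lam = M.T @ (lam + forcing[k])             # adjoint forced by weighted misfit
    return B_inv @ dx0 + lam

x = rng.standard_normal(n)
g_adj = gradient(x)
eps = 1e-5
g_fd = np.array([(cost(x + eps * e) - cost(x - eps * e)) / (2 * eps) for e in np.eye(n)])
print(np.max(np.abs(g_adj - g_fd)))
```

A single backward sweep yields the full gradient regardless of the state dimension, which is what makes the approach affordable for a model with many grid points.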

The advantage of defining the minimization problem in terms of increments is that the resulting $J$ is a quadratic function of the control variable $\delta\mathbf{x}$, and therefore we deal with a simplified problem that is guaranteed to have a unique minimum, as opposed to the nonlinear formulation with potentially many minima. As suggested by Courtier et al. (1994), a practical approach to account for nonlinearities while retaining the practical advantages of the quadratic minimization problem is to allow the basic state trajectory used in the tangent linear and adjoint models to be regularly updated with the most recent state estimate from the minimization of the quadratic $J$. These nonlinear updates are performed in the so-called outer loops, whereas the iterative quadratic minimizations are performed within the so-called inner loops. When data assimilation is applied repeatedly in a sequence of consecutive analysis intervals, as will be described in section 5, the first guess is obtained from the previous assimilation cycle and is therefore reasonably close to the optimal estimate. Experimentation suggests an optimal value of three outer and five inner loops for the present application.
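The outer and inner loop structure can be sketched on a low-dimensional nonlinear system. The example below uses the Lorenz (1963) equations with a forward-Euler step as a stand-in for the nonlinear model, observes the full state at every step, and uses a conjugate gradient solver for the inner loops; all of these choices, as well as the error variances and the function names, are assumptions made for illustration and do not reproduce the ROMS implementation.

```python
import numpy as np

DT, N_STEPS = 0.01, 25
SIGMA, RHO, BETA = 10.0, 28.0, 8.0 / 3.0

def step(x):
    """One forward-Euler step of the Lorenz (1963) model (toy nonlinear model)."""
    f = np.array([SIGMA * (x[1] - x[0]),
                  x[0] * (RHO - x[2]) - x[1],
                  x[0] * x[1] - BETA * x[2]])
    return x + DT * f

def jacobian(x):
    """Tangent linear operator of `step`, linearized about the state x."""
    A = np.array([[-SIGMA, SIGMA, 0.0],
                  [RHO - x[2], -1.0, -x[0]],
                  [x[1], x[0], -BETA]])
    return np.eye(3) + DT * A

def trajectory(x0):
    traj = [x0]
    for _ in range(N_STEPS):
        traj.append(step(traj[-1]))
    return np.array(traj)

def tangent(p, jacs):
    """Propagate an initial perturbation along the linearized trajectory."""
    out, dx = [], p
    for Jk in jacs:
        dx = Jk @ dx
        out.append(dx)
    return out

def adjoint(forcings, jacs):
    """Backward sweep of the adjoint model forced at each observation time."""
    lam = np.zeros(3)
    for Jk, f in zip(reversed(jacs), reversed(forcings)):
        lam = Jk.T @ (lam + f)
    return lam

def is4dvar(xb0, obs, b_var, o_var, n_outer=3, n_inner=5):
    x0 = xb0.copy()
    for _ in range(n_outer):                       # outer loop: relinearize
        traj = trajectory(x0)                      # nonlinear run from current guess
        jacs = [jacobian(x) for x in traj[:-1]]
        innov = obs - traj[1:]                     # innovations w.r.t. current trajectory
        hess = lambda p: p / b_var + adjoint([t / o_var for t in tangent(p, jacs)], jacs)
        rhs = (xb0 - x0) / b_var + adjoint(list(innov / o_var), jacs)
        dx, r = np.zeros(3), rhs.copy()
        p = r.copy()
        for _ in range(n_inner):                   # inner loop: conjugate gradient
            if r @ r <= 1e-24 * (rhs @ rhs):
                break
            Ap = hess(p)
            alpha = (r @ r) / (p @ Ap)
            dx = dx + alpha * p
            r_new = r - alpha * Ap
            p = r_new + (r_new @ r_new) / (r @ r) * p
            r = r_new
        x0 = x0 + dx                               # update the initial condition
    return x0

truth0 = np.array([1.0, 2.0, 20.0])
obs = trajectory(truth0)[1:]                       # noise-free synthetic observations
background0 = truth0 + np.array([0.5, -0.5, 0.5])
analysis0 = is4dvar(background0, obs, b_var=1.0, o_var=0.01)
err_b = np.linalg.norm(background0 - truth0)
err_a = np.linalg.norm(analysis0 - truth0)
print(f"background error {err_b:.3f}, analysis error {err_a:.5f}")
```

Each outer loop reruns the nonlinear model and relinearizes about the updated trajectory, while each inner loop only needs one tangent linear and one adjoint sweep, which mirrors the cost structure discussed above.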

An important component in the minimization problem is the specification of the background covariance matrix $\mathbf{B}$, whose role is to constrain how far the initial conditions may depart at time $t_{i-1}$ from the original background and to act as a regularization term by spatially smoothing the updated initial conditions. For a detailed description of the formulation as implemented in ROMS, the reader is referred to Powell et al. (2008) and Moore et al. (2011a). The initial increments are scaled by the specification of a standard deviation file of all state variables reflecting the statistics of physically realizable increments. As noted above, the EAC ROMS application produces reliable means and seasonal variability but fails to reproduce observed mesoscale events. To establish bounds on plausible mesoscale increments, the standard deviations were computed from the anomalies around the mean seasonal cycle from a 30-yr forward run. The length scales chosen for the background covariance were 100 km in the horizontal and 150 m in the vertical directions, being representative spatial decorrelation scales of the mesoscale anomalies from the same forward run. The horizontal scales used are also in agreement with the decorrelation scales estimated from along-track satellite data for the area by Wilkin et al. (2002). A more detailed discussion about the limitations of using a univariate block diagonal matrix

Figures 4d,e show an example of the IS4DVAR ROMS solution during the analysis interval after minimization of *J* for the SSH and SST data in Figs. 4a–c. The example corresponds to an assimilation cycle of experiment 1 (Exp1) described in the next section.

## 5. Assimilation and forecast experiments

The previous section showed an example of how IS4DVAR can be used to provide an estimate of the state of the ocean over an interval of time where some observations are available. This improved estimate, known as a reanalysis, can be used in retrospective diagnostic studies of ocean circulation and dynamics and can aid analysis of coupled systems such as ocean ecology. Such reanalysis time series are widely used in meteorology and climate analysis because the imposition of dynamical constraints from a model has been recognized as an improvement upon interpolation of observations based solely on statistical methods. In this section, we explore how predictability of the EAC mesoscale improves given IS4DVAR-based reanalysis initial conditions.

The IS4DVAR algorithm can be applied successively to produce consecutive estimates and forecasts at regular intervals during the 2-yr period considered. In this mode, one assimilates the observations during a given interval of time, known as the assimilation window (AW), starting from a first guess (e.g., a climatology or a previous forecast). Provided the tangent linear assumption remains valid, the model solution at the end of the AW represents an improved estimate of the ocean state given the observations from the recent past subject to the constraint of modeled physics. It is therefore used as a first guess of initial conditions for the next assimilation cycle, during which more observations become available. It is hoped then that by repeatedly applying this procedure the information assimilated before the beginning of the next AW will be carried forth in time to subsequent assimilation cycles.

The length of the AW has to be chosen based on several considerations. First, it cannot be larger than the interval over which the tangent linear assumption is valid. Second, it cannot be so short that insufficient observations are available to influence the adjoint solution or, more importantly, that there is insufficient time for the adjoint model to propagate information from observed variables to unobserved ones.

The time over which the tangent linear assumption is valid was estimated by comparing tangent linear model integrations with the difference between nonlinear integrations for an ensemble of slightly different initial conditions. The initial perturbations in all cases were chosen to maximize the energy norm growth following the formalism of Farrell and Ioannou (1996a). It was found that on average the tangent linear assumption was valid for about 10 days. To estimate the minimum window required by the adjoint model, we performed several IS4DVAR experiments in which a single synthetic observation of SSH (located within the zone of maximum mesoscale variability) was assimilated for different AW lengths, and we examined the subsurface signal (**v**, **T**, **S**) of the increment after minimization. It was found that the adjoint model in this application produces geostrophically balanced increments in subsurface fields for AWs as short as 1 day.
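The tangent linear validity test can be sketched on a low-order system. The example below is only an analogy and rests on several assumptions: the Lorenz (1963) equations replace the ocean model, a forward Euler scheme is used for both the nonlinear and tangent linear integrations, the perturbation direction is arbitrary rather than an energy-norm optimal perturbation, and the 50% relative-error threshold `tol` is a hypothetical choice. The function returns the first time at which the tangent linear perturbation departs from the difference of two nonlinear runs by more than that threshold.

```python
import numpy as np

SIGMA, RHO, BETA = 10.0, 28.0, 8.0 / 3.0  # standard Lorenz (1963) parameters

def rhs(x):
    """Nonlinear tendency of the Lorenz (1963) system."""
    return np.array([SIGMA * (x[1] - x[0]),
                     x[0] * (RHO - x[2]) - x[1],
                     x[0] * x[1] - BETA * x[2]])

def jacobian(x):
    """Jacobian of rhs(), i.e., the tangent linear operator about state x."""
    return np.array([[-SIGMA, SIGMA, 0.0],
                     [RHO - x[2], -1.0, -x[0]],
                     [x[1], x[0], -BETA]])

def spin_up(x, dt=0.001, n_steps=5000):
    """Integrate forward so the basic state lies near the attractor."""
    for _ in range(n_steps):
        x = x + dt * rhs(x)
    return x

def tl_validity_time(x0, dx0, dt=0.001, n_steps=20000, tol=0.5):
    """First time at which |(xp - x) - dx| exceeds tol * |xp - x|."""
    x, xp, dx = x0.copy(), x0 + dx0, dx0.copy()
    for n in range(1, n_steps + 1):
        dx = dx + dt * (jacobian(x) @ dx)  # tangent linear step about x
        xp = xp + dt * rhs(xp)             # perturbed nonlinear run
        x = x + dt * rhs(x)                # unperturbed nonlinear run
        diff = xp - x
        if np.linalg.norm(diff - dx) > tol * np.linalg.norm(diff):
            return n * dt
    return n_steps * dt                    # threshold never exceeded

x0 = spin_up(np.array([1.0, 1.0, 1.0]))
direction = np.array([1.0, 1.0, 1.0]) / np.sqrt(3.0)
t_small = tl_validity_time(x0, 1e-6 * direction)
t_large = tl_validity_time(x0, 1.0 * direction)
print(t_small, t_large)
```

As expected for a chaotic system, the larger initial perturbation leaves the linear regime sooner, which is why the amplitude of the perturbations should reflect realistic analysis errors when this kind of test is used to bound the AW.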

In the experiments reported here, we used an AW of 7 days, during which the available observations are (see Fig. 3) two maps of AVISO SSH (at the beginning and at the end of the AW), seven maps of SST (but with cloud gaps), and possibly in situ subsurface temperature from some XBTs. In the cases where pseudo-observations from synthetic CTD are considered (see section 3d), in addition there are two estimates of temperature and salinity.
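As a quick check on the number of assimilation cycles implied by a 7-day AW over two years, the following sketch tiles the period with consecutive windows, the end of each window being the start of the next. The start and end dates (1 January 2001 to 1 January 2003) and the function name `assimilation_windows` are assumptions for illustration; the text specifies only that 2001 and 2002 were considered.

```python
from datetime import date, timedelta

def assimilation_windows(start, end, aw_days=7):
    """Return consecutive (window_start, window_end) pairs of full windows in [start, end)."""
    windows = []
    step = timedelta(days=aw_days)
    t0 = start
    while t0 + step <= end:
        windows.append((t0, t0 + step))
        t0 += step  # the analysis at the end of one window starts the next
    return windows

windows = assimilation_windows(date(2001, 1, 1), date(2003, 1, 1))
print(len(windows))  # 730 days // 7 days = 104 full windows
```

Under these assumed dates the cycling yields 104 windows, consistent with the roughly 100 realizations quoted in the summary.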

### a. Exp1 (SSH + SST)

The first experimental 2-yr reanalysis assimilated SSH and SST; the XBT data were used to evaluate the quality of the adjusted subsurface field. Figure 5 shows the correlation and RMS difference between the AVISO SSH and the model state upon convergence of the IS4DVAR (called here the analysis). Comparing Fig. 2 (model skill without assimilation) with Fig. 5 (model skill with assimilation), the improvement is clear. The spatial distribution and size of the mesoscale variability are now comparable to those of AVISO (Fig. 5a versus Fig. 5b), and the correlations in the interior have been substantially increased from zero in the forward model without assimilation (Fig. 2c) to ~0.9 in the analysis, notably over the region of maximum mesoscale variability (Fig. 5a). The RMS error has decreased in the same regions (cf. Fig. 2d with Fig. 5d).

As in Fig. 2, but for ROMS after sequential IS4DVAR assimilation of AVISO SSH and 1-day composites of SST.

Citation: Journal of Physical Oceanography 42, 9; 10.1175/JPO-D-11-0168.1

It should be noted that this is only a consistency check and does not fully validate the assimilation. Common approaches to validation of mesoscale assimilation systems are to withhold subsets of the assimilated observations (SSH and SST in this case) or to compare with independent observations of the state variable being assimilated. Here, for example, we might seek temperature from surface drifters that we have not endeavored to include in the assimilation dataset. Although these approaches are valid, they do not provide a measure of the skill of the system in projecting surface satellite information into regions that are unobserved (i.e., the subsurface ocean). We therefore use the XBT observations as a validation set for this aspect of our experimental system. To derive statistically robust measures of skill, all the available XBT data were binned by depth using a bin width of 100 m. This led to enough observations per bin that robust statistics could be computed (the average number of observations per bin over the 2-yr period was ~2600, with a maximum of ~3500 for depths between 0 and 100 m and a minimum of ~2000 in the deepest bins). Then, the model counterpart of the observed temperature within each bin was determined by interpolating daily averages of the corresponding ROMS analysis in space and time to the position and time of the observations.

The model skill is evaluated in terms of the correlation, RMS difference, and mean bias between observation and model within each bin. Figure 6 compares the skill in nowcasting the subsurface XBT data for the forward model without assimilation (dashed–dotted lines) and for the analysis (dashed lines). Although there is a clear improvement in the top 100 m (correlation increases from 0.7 to 0.85 and RMS is reduced from 1.7° to 1.2°C), the skill improvement below that depth is very modest, with the analysis almost comparable to the forward model. Therefore, this assimilation system configuration is doing a poor job of projecting the surface information to the subsurface. This will be discussed in section 5c.
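The three skill measures can be computed per depth bin as in the sketch below. It assumes the model values have already been interpolated to the observation positions and times, and it uses synthetic data in place of the XBT observations; the function name `binned_skill`, the 800-m maximum depth, and the sign convention for the bias (model minus observation) are assumptions made here for illustration.

```python
import numpy as np

def binned_skill(depth, obs, model, bin_width=100.0, max_depth=800.0):
    """Correlation, RMS difference, and mean bias (model - obs) per depth bin."""
    edges = np.arange(0.0, max_depth + bin_width, bin_width)
    rows = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (depth >= lo) & (depth < hi)
        if mask.sum() < 2:        # not enough data for a correlation
            continue
        o, m = obs[mask], model[mask]
        corr = np.corrcoef(o, m)[0, 1]
        rms = np.sqrt(np.mean((m - o) ** 2))
        bias = np.mean(m - o)
        rows.append((lo, hi, int(mask.sum()), corr, rms, bias))
    return rows

# Synthetic stand-in for XBT temperatures and collocated model values.
rng = np.random.default_rng(0)
depth = rng.uniform(0.0, 800.0, 2000)
obs = 25.0 - 0.02 * depth + rng.normal(0.0, 1.0, depth.size)
model = obs + 0.5 + rng.normal(0.0, 0.3, depth.size)

for row in binned_skill(depth, obs, model):
    print("%4.0f-%4.0f m  n=%4d  r=%.2f  rms=%.2f  bias=%+.2f" % row)
```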

Skill of ROMS in nowcasting all the available XBT data. Skill is presented in terms of (a) correlation between temperature observed by the XBT and that predicted by the model, (b) RMS of the difference between the two, and (c) mean bias within depth bins of 100-m width (thin horizontal dashed lines). Four different ROMS estimates are presented: forward model (no assimilation; dotted–dashed lines), Exp1 (assimilation of SSH and SST; dashed lines), Exp2 (assimilation of SSH, SST, and XBT; dotted lines), and Exp3 (assimilation of SSH, SST, and SynCTD; solid lines).

### b. Exp2 (SST + SSH + XBT)

To have a benchmark against which to evaluate further improvements to the system, another experiment was performed in which the XBT data were also assimilated. The error assigned to the XBT observations was 1°C, which is at the high end of the expected range of the XBT instrumental errors. However, this value includes an inflation factor that accounts for scales and dynamical processes that appear in the observations but cannot be represented by the model. Figure 6 also shows the skill measures for this experiment (dotted lines). Although strictly speaking this comparison is not a validation of the assimilation system because the XBT data used to quantify the skill were also assimilated, the results represent a subsurface analysis that an improved projection of the satellite data should approach. For the IS4DVAR configuration we have adopted, the model dynamics can fit the XBT temperature data very well, with correlations larger than 0.8 in the upper 700 m, RMS within 1°C, and zero bias at all depths. Decreasing the assumed error in the XBT observations will reduce the RMS but adversely affect the quality of the fit to the surface satellite data; we therefore consider this the best combination of errors for Exp2.

### c. Exp3 (SSH + SST + SynCTD)

The deficiency of the assimilation system Exp1 in nowcasting the subsurface temperature is likely related to incomplete modeling of the background error covariance

Lermusiaux et al. (2000) and Lermusiaux (2002) describe an interesting extension to the SynCTD approach used here for building multiscale multivariate covariances. In this approach, the already computed vertical EOFs of historical temperature and salinity are merged with horizontal modes using Kronecker product properties to build 3D hydrographic fields. Then geostrophy and other reduced dynamic balances are used to populate the corresponding velocity covariances, similar to W05. The nonobserved scales are then found using an ensemble of adjustment dynamical integrations, building the nonobserved covariances in statistical accord with the observed ones. The multiscale character of the scheme makes it attractive for areas where mesoscale variability coexists with variability at other scales, such as the EAC.
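The Kronecker product step can be illustrated with a minimal sketch, which is not the full scheme of Lermusiaux (2002). It assumes a separable covariance in which a small set of orthonormal vertical EOFs and horizontal modes, each with an associated variance, are combined into a 3D covariance; all array sizes, variances, and the function name `separable_covariance` are hypothetical.

```python
import numpy as np

def separable_covariance(vertical_eofs, vertical_var, horizontal_modes, horizontal_var):
    """Build a 3D covariance as the Kronecker product of horizontal and vertical parts.

    The state vector is ordered with the horizontal index outermost and the
    vertical index innermost.
    """
    cov_v = vertical_eofs @ np.diag(vertical_var) @ vertical_eofs.T         # (nz, nz)
    cov_h = horizontal_modes @ np.diag(horizontal_var) @ horizontal_modes.T  # (nh, nh)
    return np.kron(cov_h, cov_v)                                             # (nh*nz, nh*nz)

rng = np.random.default_rng(1)
# Orthonormal columns from a QR factorization stand in for EOFs and modes.
vertical_eofs = np.linalg.qr(rng.standard_normal((6, 2)))[0]     # 6 levels, 2 EOFs
horizontal_modes = np.linalg.qr(rng.standard_normal((5, 3)))[0]  # 5 points, 3 modes
vertical_var = np.array([2.0, 0.5])
horizontal_var = np.array([1.0, 0.3, 0.1])

cov_3d = separable_covariance(vertical_eofs, vertical_var,
                              horizontal_modes, horizontal_var)
print(cov_3d.shape)
```

By the mixed-product property of the Kronecker product, the 3D modes are simply the Kronecker products of the horizontal and vertical modes, so the full matrix never needs to be formed explicitly in a large application.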

W05 described a method to incorporate multivariate constraints in IS4DVAR by designing a coordinate transformation from model space **x** to a new control space where variables can be considered mutually uncorrelated (and therefore the use of a block-diagonal background covariance is justified). The transformation separates out the part of the increment that accounts for high cross-correlations among variables, that is, the balanced component of **x**. In the W05 formulation, the balanced component is identified as increments of temperature and salinity that are in geostrophic balance with the velocity and pressure fields, which is generally an excellent approximation in most deep-ocean regions beyond the continental shelf where quasigeostrophic dynamics dominate the mesoscale variability, as is the case for the EAC eddy field.

The balance approach of W05 is a capability currently being tested in ROMS that was not available to us at the time this work was conducted. However, we can take advantage of the already existing synthetic CTD relationship described in section 3d to improve the inversion and quantify its impact on the predictability. We treat the SynCTD as data and use them to augment the set of observations that enter in the ROMS IS4DVAR cost function. Because the coefficients of the SynCTD subsurface projection were estimated from historical in situ observations and do not require updating, it is feasible to consider using these data in a real-time operational forecast system for the EAC.

Figure 7 shows a typical example of how different the vertical projections can be between Exp1 and Exp3 and compares these ROMS predictions with the contemporaneous nonassimilated XBT transect (the model values were interpolated in space and time to the XBT observation points). Although both inversions are associated with the same SSH structure (Figs. 7c,e) and both compare well to the AVISO SSH (Fig. 7a), the subsurface projections are very different, with Exp3 (Fig. 7f) producing a subsurface estimate that remarkably resembles the vertical structure of temperature observed by the XBT (Figs. 7b,d,f).

Comparison of (a),(b) temperature observed by an XBT transect and that predicted by (c),(d) Exp1 and (e),(f) Exp3. In both ROMS predictions, the SSH observations shown in (a) are part of the data stream in the assimilation system and the fit to SSH [(c) and (e)] is similar. However, the vertical structure is different. The position of the different XBT soundings is shown in (a) with white dots.

The skill of Exp3 is almost equivalent to the skill of Exp2, which represents our benchmark for improvement, having explicitly enforced a fit to the subsurface XBT data that are used in the skill assessment. Correlations are high (~0.8) and comparable to Exp2 at all depths, with an RMS error slightly higher than Exp2 between 200 and 400 m (1.3°C versus 1°C) and a comparable mean bias (less than 0.5°C at all depths).

## 6. Quantifying predictability of EAC mesoscale variability

We have shown in the previous section that correct subsurface projection of surface satellite data (SSH and SST) can produce subsurface estimates of unassimilated XBT data almost as good as those of the experiment in which the XBT data themselves are assimilated. We note also that the subsurface temperature and salinity estimates were derived from empirical relations between surface and subsurface variables based on historical data that do not include the 2001–02 XBT data used here for assimilation or validation. A great advantage of using the SynCTD data is that they have the same spatial coverage as the satellite data, which is clearly much more extensive than can be achieved with routine XBT transects. We therefore analyze in this section the extent to which the improved three-dimensional methodology of Exp3 (i.e., incorporating SynCTD data) can result in improved forecasts of the mesoscale variability.

Figure 8 shows the temporal variability in nowcasting–forecasting the AVISO SSH over the two years considered. The abscissa is the time at which any given analysis interval commences, and the ordinate represents the duration into the future for the forecast of that date initiated from previous analysis intervals. The evolution through time of the skill of any given forecast is therefore along a line with unit slope. The AVISO SSH data are used to evaluate the forecast skill, which is valid because, being in forecast mode, the SSH observations have not yet been assimilated. The skill is quantified in terms of pattern correlation (Fig. 8a) and RMS error (Fig. 8b) between the model at a given forecast time and the corresponding AVISO SSH. Within the analysis or assimilation window (negative lags) pattern correlations are ~1 and RMS is very small most of the time, as we would expect, because this is when the model is fit to the observations. For increasing forecast times, the correlations start to drop and RMS error starts to increase, but acceptable correlations (>0.6) and relatively small RMS errors are observed to persist for at least 10 days.
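The two SSH skill measures can be sketched as follows for a single forecast lead time. The sketch assumes a centered pattern correlation (spatial means removed over valid points), which may differ from the anomaly definition used for Fig. 8, and the test field is synthetic; the function name `pattern_skill` is hypothetical.

```python
import numpy as np

def pattern_skill(model_ssh, obs_ssh):
    """Centered spatial pattern correlation and RMS difference of two 2D fields.

    Points where either field is NaN (e.g., land) are ignored.
    """
    valid = np.isfinite(model_ssh) & np.isfinite(obs_ssh)
    m = model_ssh[valid]
    o = obs_ssh[valid]
    m_anom = m - m.mean()
    o_anom = o - o.mean()
    corr = (m_anom @ o_anom) / np.sqrt((m_anom @ m_anom) * (o_anom @ o_anom))
    rms = np.sqrt(np.mean((m - o) ** 2))
    return corr, rms

# Synthetic eddy-like SSH field (m) and a noisy "forecast" of it.
yy, xx = np.mgrid[0:20, 0:30]
obs_ssh = 0.2 * np.sin(xx / 5.0) * np.cos(yy / 4.0)
rng = np.random.default_rng(2)
forecast_ssh = obs_ssh + rng.normal(0.0, 0.03, obs_ssh.shape)

corr, rms = pattern_skill(forecast_ssh, obs_ssh)
print(round(corr, 2), round(rms, 3))
```

Evaluating this function for every analysis start date and every lead time would fill a lag diagram of the kind described above.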

Temporal variability in nowcasting–forecasting the AVISO SSH during 2001 and 2002. The abscissa shows the time of the year when ROMS is initialized and the ordinate shows the forecast lag (lags < 0 are within the AW, and lags > 0 are in the forecast window). The skill is summarized here in terms of (top) spatial pattern correlation between AVISO and ROMS SSH anomalies and (bottom) the RMS of the difference between the two SSH patterns.

There are a few time periods when the IS4DVAR algorithm seems to have problems in fitting the SSH: namely, around days 150, 210, and 380. At these times, the correlation is relatively low and RMS error is high at all forecast lags, irrespective of whether the initial conditions were fit well or poorly during the assimilation window. The reason for this is unclear. To our knowledge, these times are not associated with any anomaly in the quality of the AVISO SSH data, which are derived from an observing system with a fixed sampling interval and stable error characteristics. We have no means by which to assess whether the surface forcing or boundary conditions might be in error, and by using a 7-day assimilation window we believe the discrepancy between the marginal versus joint-over-time PDF is minor. The fact that a 30-yr integration cycling the same 2 yr of boundary and surface forcing produces different mesoscale variability (see Fig. 1b) suggests that some model states are very sensitive to small errors in the initial conditions. It is conceivable that these times represent intrinsically unpredictable mesoscale states or conditions that this model configuration has difficulty predicting because of issues of resolution or neglected dynamical processes. This is a topic for future analysis.

Figure 9 shows Exp3 forecast skill for subsurface temperature, also in terms of correlation and RMS error, computed with respect to the unassimilated XBT data. The volume of XBT data is not sufficient to resolve variation in the model performance over time, so the skill is presented as ensemble statistics for the entire 2-yr period. The solid line is the analysis skill and is the same as shown in Fig. 6 for Exp3. As time advances into the forecast period, the correlation falls and RMS error grows, as was the case for the comparison to SSH. The relative change in skill over time is independent of depth, with acceptable correlations (>0.6) persisting for about 2 weeks. The magnitude of the RMS error has a subsurface maximum at 300-m depth because temperature variance itself is greatest there, centered on the permanent thermocline.

Skill of ROMS Exp3 in forecasting all the available XBT data. Skill is presented in terms of (a) correlation between temperature observed by the XBT and that predicted by the model, (b) RMS of the difference between the two and (c) mean bias within depth bins of 100-m width (thin horizontal dashed lines). The different lags of the forecasts (in weeks) for (a)–(c) are indicated in the legend in the bottom of (b).

## 7. Discussion and summary

In this work we have documented a variational data assimilation system for the East Australian Current (EAC). The system uses the ROMS ocean model and its strong-constraint data assimilation algorithm (IS4DVAR) to find the optimal initial conditions that best fit, in the least squares sense, all the available observations within a time period subject to the constraint that the model dynamics is satisfied exactly in the analysis interval. Several studies (including this one) suggest that mesoscale variability in the EAC is predominantly the result of local flow instabilities rather than remote forcing, which justifies using the initial conditions as the control variables in the assimilation system.

The observations considered are gridded AVISO SSH from satellite altimetry, 1-day AVHRR satellite SST, and subsurface temperature from XBTs when available. We applied IS4DVAR in a sequence of consecutive analysis intervals for two years (2001 and 2002) assimilating observations within 7-day windows yielding approximately 100 realizations from which robust estimates of the system performance could be derived. Three experiments were conducted to compare different configurations of the system.

In the first configuration (Exp1), we assimilated SSH and SST and used the XBT data to validate the resulting nowcasts. The resulting inversion has modest skill (almost comparable to the ROMS model without assimilation) in nowcasting the subsurface temperatures observed by the XBTs, except in the top 100 m, where the assimilated SST has a direct influence. The second experiment (Exp2) constitutes a benchmark against which improvements to the assimilation system can be evaluated. In this experiment, we assimilated SSH, SST, and the XBTs, but by assimilating the XBTs we are left without a large independent dataset for skill assessment. The ocean state estimation improves considerably in the vicinity of the XBT transects, but the results of Exp1 would suggest that away from the XBT observation locations the subsurface skill may be poor. One would prefer to have a system that could have similar improvements in subsurface skill in places not sampled by the XBTs.

The incomplete subsurface projection of the surface satellite information (notably the SSH) in the first experiment has been noted previously in ocean data assimilation and has been addressed using several approaches. These include improved modeling of the background covariance matrix used in the assimilation (W05); considerations of water mass properties and potential vorticity conservation (Cooper and Haines 1996); empirical linear models that exploit the strong coupling of the SSH with the subsurface temperature and salinity (Dobricic et al. 2005); and methods that estimate salinity based on temperature observations, SSH, and temperature–salinity relations (Vossepoel et al. 2002).

In this study, we use a variant of the latter method by including synthetic pseudo-observations of temperature and salinity. These pseudo-observations, provided by CSIRO, are derived by linear regressions at different depths of observed hydrographic data and contemporaneous satellite data. The resulting empirical model then produces three-dimensional fields of temperature and salinity, given gridded composite maps of SSH and SST.

Our third experiment (Exp3) consisted of assimilating SSH, SST, and the empirically determined SynCTD. We found that the skill in nowcasting the temperatures observed by the XBTs is similar to the case when the XBTs themselves are assimilated, but with the advantage that subsurface corrections are applied everywhere and not just along the XBT transects. When used in conjunction with the ROMS IS4DVAR system, the methodology expands the SynCTD information by finding the full state vector that is consistent with the model dynamics and the prescribed surface and boundary forcing, therefore providing a dynamically consistent analysis that can be used for diagnostic studies. Being constrained by the model dynamics, the analysis at the end of the assimilation period can then be used to launch a forecast of the three-dimensional circulation. We performed a forecast from each of the approximately 100 initial condition analyses in Exp3 and compared these to the surface (SSH) and subsurface (XBT) data. Both comparisons demonstrated that reliable forecasts of the three-dimensional circulation can be obtained for 10 days on average. The horizon of utility of the forecast, however, varied from window to window, with cases where predictability extends up to 2 weeks and a few cases where it is only 1–3 days. Why certain time periods or mesoscale states should prove difficult to predict regardless of the success in fitting the model initial conditions to data will be explored in future work.

Another interesting aspect of our methodology is its impact on ensemble forecasting. As an example, we computed a 10-member ensemble forecast by adding to the initial conditions perturbations chosen to maximize the energy growth following the formalism of Farrell and Ioannou (1996a), scaling the perturbations to have a realistic SSH error of 3 cm. Figure 10 compares the mean of the ensemble forecast (filled contours) with the AVISO SSH at different lead times (1, 8, and 15 days in advance). Two cases are considered: in the first case (Fig. 10, top) the background trajectory is taken from Exp1 and in the second case (Fig. 10, bottom) the background trajectory is taken from Exp3. Although both ensembles start very close to the AVISO SSH, the erroneous vertical projection in Exp1 causes great spread in the ensemble forecast (not shown), resulting in an ensemble mean that quickly departs from the AVISO SSH. On the other hand, the more realistic vertical projection of the SSH and SST information in Exp3 produced an ensemble with smaller spread (not shown) and an ensemble mean forecast that is in better agreement with the AVISO SSH. Finding the optimal initial perturbations is an expensive calculation, and therefore presenting a more detailed analysis over all the assimilation cases is beyond the objectives of this work.
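The perturbation scaling and the ensemble statistics can be sketched as below. Several assumptions are made: the 3-cm SSH error is interpreted as a domain RMS amplitude, random fields stand in for the energy-optimal perturbations, and the members shown are perturbed initial SSH fields rather than forecasts. The function names are hypothetical.

```python
import numpy as np

def scale_to_rms(perturbation, target_rms=0.03):
    """Rescale a perturbation field so that its RMS amplitude equals target_rms (m)."""
    return perturbation * (target_rms / np.sqrt(np.mean(perturbation ** 2)))

def ensemble_mean_and_spread(members):
    """Mean and spread (sample standard deviation) over the leading member axis."""
    return members.mean(axis=0), members.std(axis=0, ddof=1)

rng = np.random.default_rng(3)
n_members, ny, nx = 10, 20, 30
analysis_ssh = 0.2 * np.sin(np.arange(nx) / 5.0)[None, :] * np.ones((ny, 1))

# Random fields stand in for the optimal perturbations of the text.
perturbations = np.stack([scale_to_rms(rng.standard_normal((ny, nx)))
                          for _ in range(n_members)])
members = analysis_ssh[None, :, :] + perturbations

ens_mean, ens_spread = ensemble_mean_and_spread(members)
print(ens_mean.shape, float(ens_spread.mean()))
```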

Example of an ensemble forecast at (left to right) 1, 8, and 15 days for (top) Exp1 and (bottom) Exp3. The ensemble members consist of slightly perturbed initial conditions that maximize the energy growth, and the ensemble mean (filled contours) is compared with the AVISO SSH (thick contours) in each panel.

It is noted that the structure of these perturbations is optimal for energy growth but is not necessarily consistent with the analysis error, which is unknown. Indeed, one limitation of 4DVAR is that it does not provide an estimate of the analysis error covariance, which is computationally prohibitive to calculate. One possibility is to estimate a reduced-rank approximation based on an expansion in Lanczos vectors of the Hessian. These Lanczos vectors are computed as part of the inner loops to improve convergence of the conjugate-gradient algorithm in IS4DVAR. Therefore, it is feasible to compute the diagonal of the analysis covariance and its leading eigenvectors, which might provide useful information about relative levels of uncertainty in each state variable and the dominant patterns of error (Moore et al. 2011a). Another possibility, which requires only the forward version of the model, is the multiscale multivariate approach of Lermusiaux (2002) to estimate perturbations dynamically consistent with the background error covariance. These are feasible improvements to the less rigorous approach presented here.
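The reduced-rank idea can be illustrated on a small symmetric positive definite matrix standing in for the Hessian. The sketch assumes that the analysis error covariance is approximated by the inverse Hessian, ignores the preconditioning and control-space transformations used in practice, and uses a plain Lanczos iteration with full reorthogonalization; it is not the ROMS algorithm, and the function names are hypothetical. With *k* Lanczos vectors **V** and the tridiagonal matrix **T** = **V**ᵀ**HV**, the approximation is **H**⁻¹ ≈ **VT**⁻¹**V**ᵀ.

```python
import numpy as np

def lanczos(hessian_vec, start, k):
    """Run k Lanczos steps; return orthonormal vectors V and tridiagonal T."""
    n = start.size
    V = np.zeros((n, k))
    alpha = np.zeros(k)
    beta = np.zeros(k - 1)
    v = start / np.linalg.norm(start)
    for j in range(k):
        V[:, j] = v
        w = hessian_vec(v)
        alpha[j] = v @ w
        # Full reorthogonalization against all vectors found so far.
        w = w - V[:, :j + 1] @ (V[:, :j + 1].T @ w)
        if j < k - 1:
            beta[j] = np.linalg.norm(w)
            v = w / beta[j]
    T = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)
    return V, T

def reduced_rank_inverse(hessian_vec, start, k):
    """Rank-k approximation of the inverse Hessian, V T^{-1} V^T."""
    V, T = lanczos(hessian_vec, start, k)
    return V @ np.linalg.solve(T, V.T)

rng = np.random.default_rng(0)
n = 8
A = rng.standard_normal((n, n))
H = A @ A.T + n * np.eye(n)          # toy symmetric positive definite "Hessian"
start = rng.standard_normal(n)

approx = reduced_rank_inverse(lambda v: H @ v, start, k=3)
exact = np.linalg.inv(H)
print(np.diag(approx))               # underestimates the exact variances
print(np.diag(exact))
```

Because the rank-*k* estimate is a projection of the full inverse, its diagonal cannot exceed the exact variances, so the estimated uncertainty increases toward the full value as more Lanczos vectors are retained.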

## Acknowledgments

We thank two anonymous reviewers for comments and suggestions that greatly improved this manuscript. This work was funded by the Office of Naval Research. We thank P. Posey and R. Preller at NRL for providing the NOGAPS and NCOM data. We thank D. Griffin at CSIRO for providing the SST and SynCTD data and for helpful discussions throughout the work.

## REFERENCES

AVISO, 1996: Merged TOPEX-Poseidon products (GDR-Ms). 3rd ed. AVISO User Handbook AVI-NT-02-101-CN, 201 pp.

Bennett, A. F., B. S. Chua, D. E. Harrison, and M. J. McPhaden, 2000: Generalized inversion of Tropical Atmosphere–Ocean (TAO) data and a coupled model of the tropical Pacific. Part II: The 1995–96 La Niña and 1997–98 El Niño. *J. Climate*, **13**, 2770–2785.

Bowen, M. M., J. L. Wilkin, and W. J. Emery, 2005: Variability and forcing of the East Australian Current. *J. Geophys. Res.*, **110**, C03019, doi:10.1029/2004JC002533.

Chapman, D. C., 1985: Numerical treatment of cross-shelf open boundaries in a barotropic coastal ocean model. *J. Phys. Oceanogr.*, **15**, 1060–1075.

Cooper, M. C., and K. Haines, 1996: Data assimilation with water property conservation. *J. Geophys. Res.*, **101**, 1059–1077.

Courtier, P., J.-N. Thépaut, and A. Hollingsworth, 1994: A strategy for operational implementation of 4D-Var, using an incremental approach. *Quart. J. Roy. Meteor. Soc.*, **120**, 1367–1387.

Cummings, J. A., 2005: Operational multivariate ocean data assimilation. *Quart. J. Roy. Meteor. Soc.*, **131**, 3583–3604.

Cummings, J. A., and Coauthors, 2009: Ocean data assimilation systems for GODAE. *Oceanography*, **22**, 97–109.

Di Lorenzo, E., A. M. Moore, H. G. Arango, B. D. Cornuelle, A. J. Miller, B. Powell, B. S. Chua, and A. F. Bennett, 2007: Weak and strong constraint data assimilation in the inverse Regional Ocean Modeling System (ROMS): Development and application for a baroclinic coastal upwelling system. *Ocean Modell.*, **16**(3–4), 160–187.

Dobricic, S., N. Pinardi, M. Adani, A. Bonazzi, C. Fratianni, and M. Tonani, 2005: Mediterranean Forecasting System: An improved assimilation scheme for sea-level anomaly and its validation. *Quart. J. Roy. Meteor. Soc.*, **131**, 3627–3642.

Dunn, J., and K. R. Ridgway, 2002: Mapping ocean properties in regions of complex topography. *Deep-Sea Res. I*, **49**, 591–604.

Evensen, G., 2006: *Data Assimilation: The Ensemble Kalman Filter*. Springer, 307 pp.

Farrell, B. F., and P. J. Ioannou, 1996a: Generalized stability theory. Part I: Autonomous operators. *J. Atmos. Sci.*, **53**, 2025–2040.

Farrell, B. F., and P. J. Ioannou, 1996b: Generalized stability theory. Part II: Nonautonomous operators. *J. Atmos. Sci.*, **53**, 2041–2053.

Flather, R. A., 1976: A tidal model of the northwest European continental shelf. *Mem. Soc. Roy. Sci. Liege*, **6**, 141–164.

Fujii, Y., and M. Kamachi, 2003: Three-dimensional analysis of temperature and salinity in the equatorial Pacific using a variational method with vertical coupled temperature-salinity empirical orthogonal function modes. *J. Geophys. Res.*, **108**, 3297, doi:10.1029/2002JC001745.

Fukumori, I., 2002: A partitioned Kalman filter and smoother. *Mon. Wea. Rev.*, **130**, 1370–1383.

Gangopadhyay, A., and A. R. Robinson, 2002: Feature-oriented regional modeling of oceanic fronts. *Dyn. Atmos. Oceans*, **36**, 201–232.

Godfrey, J. S., G. R. Cresswell, T. J. Golding, A. F. Pearce, and R. Boyd, 1980: The separation of the East Australian Current. *J. Phys. Oceanogr.*, **10**, 430–440.

Haidvogel, D. B., H. Arango, K. Hedstrom, A. Beckmann, P. Malanotte-Rizzoli, and A. F. Shchepetkin, 2000: Model evaluation experiments in the North Atlantic basin: Simulations in nonlinear terrain-following coordinates. *Dyn. Atmos. Oceans*, **32**(3–4), 239–281.

Haidvogel, D. B., and Coauthors, 2008: Ocean forecasting in terrain-following coordinates: Formulation and skill assessment of the Regional Ocean Modeling System. *J. Comput. Phys.*, **227**, 3595–3624.

Kurapov, A. L., G. D. Egbert, J. S. Allen, and R. N. Miller, 2009: Representer-based analyses in the coastal upwelling system. *Dyn. Atmos. Oceans*, **48**, 198–218.

Large, W. G., J. C. McWilliams, and S. C. Doney, 1994: Oceanic vertical mixing: A review and a model with a nonlocal boundary layer parameterization. *Rev. Geophys.*, **32**, 363–403.

Le Dimet, F.-X., and O. Talagrand, 1986: Variational algorithms for analysis and assimilation of meteorological observations: Theoretical aspects. *Tellus*, **38A**, 97–110.

Lermusiaux, P. F. J., 2002: On the mapping of multivariate geostrophic fields: Sensitivities to size, scales, and dynamics. *J. Atmos. Oceanic Technol.*, **19**, 1602–1636.

Lermusiaux, P. F. J., D. G. M. Anderson, and C. J. Lozano, 2000: On the mapping of multivariate geostrophic fields: Error and variability subspace estimates. *Quart. J. Roy. Meteor. Soc.*, **126**, 1387–1429.

Lermusiaux, P. F. J., J. Xu, C.-F. Chen, S. Jan, L. Y. Chiu, and Y.-J. Yang, 2010: Coupled ocean-acoustic prediction of transmission loss in a continental shelfbreak region: Predictive skill, uncertainty quantification and dynamical sensitivities. *J. Oceanic Eng.*, **35**, 895–916.

Le Traon, P.-Y., and F. Ogor, 1998: ERS-1/2 orbit improvement using TOPEX-POSEIDON: The 2 cm challenge. *J. Geophys. Res.*, **103**, 8045–8057.

Marchesiello, P., and J. H. Middleton, 2000: Modeling the East Australian Current in the western Tasman Sea. *J. Phys. Oceanogr.*, **30**, 2956–2971.

Mata, M. M., M. Tomczak, S. Wijffels, and J. A. Church, 2000: East Australian Current volume transports at 30°S: Estimates from the World Ocean Circulation Experiment hydrographic sections PR11/P6 and the PCM3 current meter array. *J. Geophys. Res.*, **105**, 28 509–28 526.

Mata, M. M., S. E. Wijffels, J. A. Church, and M. Tomczak, 2006: Eddy shedding and energy conversions in the East Australian Current. *J. Geophys. Res.*, **111**, C09034, doi:10.1029/2006JC003592.

Moore, A. M., H. G. Arango, E. Di Lorenzo, B. D. Cornuelle, A. J. Miller, and D. J. Neilson, 2004: A comprehensive ocean prediction and analysis system based on the tangent linear and adjoint components of a regional ocean model. *Ocean Modell.*, **7**(1–2), 227–258.

Moore, A. M., H. G. Arango, E. Di Lorenzo, A. J. Miller, and B. D. Cornuelle, 2009: An adjoint sensitivity analysis of the Southern California Current circulation and ecosystem. *J. Phys. Oceanogr.*, **39**, 702–720.

Moore, A. M., H. G. Arango, G. Broquet, B. S. Powell, A. T. Weaver, and J. Zavala-Garay, 2011a: The Regional Ocean Modeling System (ROMS) 4-dimensional variational data assimilation systems. Part I: System overview and formulation. *Prog. Oceanogr.*, **91**, 34–49.

Moore, A. M., and Coauthors, 2011b: The Regional Ocean Modeling System (ROMS) 4-dimensional variational data assimilation systems. Part II: Performance and application to the California Current system. *Prog. Oceanogr.*, **91**, 50–73.

Moore, A. M., and Coauthors, 2011c: The Regional Ocean Modeling System (ROMS) 4-dimensional variational data assimilation systems. Part III: Observation impact and observation sensitivity in the California Current system. *Prog. Oceanogr.*, **91**, 74–94.

Oke, P., G. B. Brassington, D. A. Griffin, and A. Schiller, 2008: The Bluelink ocean data assimilation system (BODAS). *Ocean Modell.*, **21**, 46–70.

Pascual, A., Y. Faugere, G. Larnicol, and P.-Y. Le Traon, 2006: Improved description of the ocean mesoscale variability by combining four satellite altimeters. *Geophys. Res. Lett.*, **33**, L02611, doi:10.1029/2005GL024633.

Powell, B. S., H. G. Arango, A. M. Moore, E. Di Lorenzo, R. F. Milliff, and D. Foley, 2008: 4DVAR data assimilation in the Intra-Americas Sea with the Regional Ocean Modeling System (ROMS). *Ocean Modell.*, **23**(3–4), 130–145, doi:10.1016/j.ocemod.2008.04.008.

Ridgway, K. R., and J. S. Godfrey, 1997: Seasonal cycle of the East Australia Current. *J. Geophys. Res.*, **102**, 22 921–22 936.

Ridgway, K. R., J. R. Dunn, and J. L. Wilkin, 2002: Ocean interpolation by four-dimensional weighted least squares: Application to the waters around Australia. *J. Atmos. Oceanic Technol.*, **19**, 1357–1375.

Rosmond, T. E., 1992: The design and testing of the Navy Operational Global Atmospheric Prediction System. *Wea. Forecasting*, **7**, 262–272.

Sasaki, Y., 1970: Some basic formalisms on numerical variational analysis. *Mon. Wea. Rev.*, **98**, 875–883.

Shchepetkin, A. F., and J. C. McWilliams, 2003: A method for computing horizontal pressure-gradient force in an oceanic model with a non-aligned vertical coordinate. *J. Geophys. Res.*, **108**, 3090, doi:10.1029/2001JC001047.

Shchepetkin, A. F., and J. C. McWilliams, 2005: The Regional Ocean Modeling System (ROMS): A split-explicit, free-surface, topography-following coordinates ocean model. *Ocean Modell.*, **9**, 347–404.

Stammer, D., 1997: Global characteristics of ocean variability estimated from regional TOPEX/Poseidon altimeter measurements. *J. Phys. Oceanogr.*, **27**, 1743–1769.

Stammer, D., and Coauthors, 2002: The global ocean circulation during 1992–1997 estimated from ocean observations and a general circulation model. *J. Geophys. Res.*, **107**, 3118, doi:10.1029/2001JC000888.

Sun, C., and D. R. Watts, 2001: A circumpolar gravest empirical mode for the Southern Ocean hydrography. *J. Geophys. Res.*, **106**, 2833–2855.

Talagrand, O., and P. Courtier, 1987: Variational assimilation of meteorological observations with the adjoint vorticity equation. I: Theory. *Quart. J. Roy. Meteor. Soc.*, **113**, 1311–1328.

Tilburg, C. E., H. E. Hurlburt, J. J. O’Brien, and J. F. Shriver, 2001: The dynamics of the East Australia Current system: The Tasman Front, the East Auckland Current, and the East Cape Current. *J. Phys. Oceanogr.*, **31**, 2917–2943.

Tomczak, M. J., 1981: Bass Strait water intrusions in the Tasman Sea and mean temperature-salinity curves. *Aust. J. Mar. Freshwater Res.*, **32**, 699–708.

Vossepoel, F. C., G. Burgers, and P. J. van Leeuwen, 2002: Effects of correcting salinity with altimeter measurements in an equatorial Pacific Ocean model. *J. Geophys. Res.*, **107**, 8001, doi:10.1029/2001JC000816.

Walker, A., and J. L. Wilkin, 1998: Optimal averaging of NOAA/NASA Pathfinder satellite sea surface temperature data. *J. Geophys. Res.*, **103**, 12 869–12 883.

Weaver, A. T., and P. Courtier, 2001: Correlation modelling on the sphere using a generalized diffusion equation. *Quart. J. Roy. Meteor. Soc.*, **127**, 1815–1842.

Weaver, A. T., J. Vialard, and D. L. T. Anderson, 2003: Three- and four-dimensional variational assimilation with an ocean general circulation model of the tropical Pacific Ocean. Part I: Formulation, internal diagnostics, and consistency checks. *Mon. Wea. Rev.*, **131**, 1360–1378.

Weaver, A. T., C. Deltel, E. Machu, S. Ricci, and N. Daget, 2005: A multivariate balance operator for variational ocean data assimilation. *Quart. J. Roy. Meteor. Soc.*, **131**, 3605–3625.

Wilkin, J. L., and W. G. Zhang, 2007: Modes of mesoscale sea surface height and temperature variability in the East Australian Current. *J. Geophys. Res.*, **112**, C01013, doi:10.1029/2006JC003590.

Wilkin, J. L., M. M. Bowen, and W. J. Emery, 2002: Mapping mesoscale currents by optimal interpolation of satellite radiometer and altimeter data. *Ocean Dyn.*, **52**, 95–103.

^{1} Here, we refer to a dynamical operator as both the model physics and the application-specific ROMS configuration for a geographical area, including its surface and boundary conditions. The properties of the dynamical operator differ from one application to another.