## 1. Introduction

A knowledge of the heat and freshwater redistribution by the ocean in midlatitudes is crucial to our understanding of climate change and ocean–atmosphere interaction on global scales. In the subtropical North Pacific (SNP), very large, climatically significant variations of the heat and freshwater fluxes occur on seasonal and longer timescales. The mean seasonal cycle in SNP is characterized by a pronounced annual periodicity, especially in the region of the Kuroshio Extension (KE) where huge amounts of heat are released into the atmosphere in winter (e.g., Yan et al. 1995). South of the KE strong cooling results in convection, which is the primary source of Subtropical Mode Water (STMW) and Central Mode Water (CMW). The mode water is a large reservoir that contains the integrated temperature and potential vorticity signatures of the previous winters and may strongly affect long-term climate variations (e.g., Gu and Philander 1997; Schneider et al. 1999).

To better understand decadal climate variability in the Pacific basin, it is necessary to quantify its timescales, possibly associated with the renewal times of the North Pacific mode waters. Quantification of the local processes responsible for the heat and salt redistribution in the SNP is equally important for a more accurate assessment of the air–sea interaction mechanisms at these larger scales.

In the past decade a number of empirical studies have addressed the upper-ocean heat balance and mode water formation in the SNP (Suga and Hanawa 1990; Bingham 1992; Suga et al. 1997; Taneda et al. 2000; Qiu 2000). These analyses were mostly based either on hydrographic in situ data or on long-term observations of the ocean surface. Recent modeling studies of water mass formation (e.g., Nurser et al. 1999; Marsh et al. 2000) were either global and did not focus on the SNP, or were too coarse (Ladd and Thompson 2001) to resolve the structure of mode water formation sites. Besides, all of these models employed a nudging technique that distorted model physics in order to prevent surface values from drifting too far from reality. Direct modeling of the upper ocean thermodynamics is a challenging task primarily because it is very difficult to simulate realistically the four-dimensional variability of the mixed layer without properly taking into account the possible errors in the forcing fields.

A promising approach in that respect is the variational data assimilation (VDA) technique, which has been extensively used in recent years to study large-scale circulations in the World Ocean (e.g., Lee and Marotzke 1998; Yu and Malanotte-Rizzoli 1998; Wenzel et al. 2001). In application to the upper ocean VDA has an important advantage over more traditional methods: on the one hand it optimally fits the available data and, on the other hand, it provides “the best” estimate of the oceanic mixed layer whose evolution exactly obeys certain dynamical laws.

These properties of the VDA technique are used in the present work to assess seasonal variability of the thermodynamical balance components of the mixed layer and of the upper ocean as a whole. In contrast to previous studies (e.g., Qiu and Kelly 1993; Qiu 2000) where the mixed layer depth (MLD) was diagnosed solely by temperature, we employ the dynamical model of Large et al. (1994), which includes salinity and velocity as well. In combination with the VDA algorithm of Yaremchuk et al. (2001), that approach provides more accurate estimates of the upper-layer transports and fluxes, which may improve our understanding of the ocean–atmosphere interaction in the region on larger temporal scales.

Another motivation for the present study is the need to obtain a direct quantitative description of the mode water formation in the SNP, including a more accurate location of the mode-water formation sites. In recent years this problem has been extensively studied by indirect methods that could give only qualitative results. As an example, Suga et al. (1997) analyzed the World Oceans Data Center (WODC) temperature data and concluded that the CMW formation area is “likely situated … at 175°E–160°W extending as far south as about 37°N at 170°W.” This conclusion was drawn from a qualitative comparison of the wintertime mixed layer depth and SST maps. We believe that more accurate determination of the intensity and mean positions of the mode water formation sites is important because these estimates may serve as a reference for the analysis of mode water formation anomalies, whose dynamics plays a significant role in long-term climate variability.

In the present study we attempt to synthesize the existing climatological data with TOPEX/Poseidon altimetry and drifter velocities in the framework of a numerical model of intermediate complexity. We focus on retrieving a seasonal cycle from the data and therefore consider the upper 1000 m of the ocean. Refined resolution in the vertical (40 levels per kilometer) and K-profile parameterization (KPP: Large et al. 1994) of the mixed layer thermodynamics provide a reasonable degree of realism in simulating the air–sea interaction. Poorly known parameters of the model, which include initial conditions and boundary conditions at the free surface and at the open lateral and lower interfaces of the model domain (Fig. 1), are optimized to fit the data. We focus on the direct assessment of the mode water formation rates, the horizontal positioning of the mode water formation sites, and the relative role of the various physical processes that govern the upper-layer dynamics in the SNP.

The paper is organized as follows. In the next section we describe the data and discuss the methodology of their interpolation in space and time. In section 3 we report the results of sensitivity experiments with the interpolation algorithm. These are primarily aimed at assessment of the uncertainties of the optimized pattern of the SNP circulation. Our main results are contained in section 4 in which the seasonal cycles of the heat and salt balances are analyzed and formation rates of the major water masses are assessed. In section 5 the main results are summarized and we present our conclusions.

## 2. Methodology

To investigate SNP seasonal variability we examine the structure of the ocean state using variational assimilation of climatological data into a numerical model. This approach can be viewed as a 4D interpolation of the data constrained by the dynamical equations. The major advantage of the method is its ability to take into account uncertainties of all the data types in producing an optimized pattern of ocean circulation. That property of the interpolation algorithm becomes especially important in the regions of western boundary currents, where atmospheric forcing errors are relatively high and may induce oceanic variability inconsistent with the observed seasonal cycle of the thermohaline structure.

### a. Data

#### 1) Temperature and salinity

The climatological monthly mean temperatures and salinities (*S*) used in this study are those provided in the *World Ocean Atlas* (Levitus and Boyer 1994; Levitus et al. 1994a,b). The grid-point temperatures were converted to potential temperatures (*θ*), which will be referred to as temperatures throughout the following discussion.

To obtain a better representation of the upper mixed layer and ocean–atmosphere interaction, we use a finer grid in the vertical compared to the Levitus climatology (40 against 18 levels; Fig. 2). The Levitus data were linearly interpolated onto the model levels. In the horizontal, the grid resolution of the model is the same as that of the Levitus climatology (1°).

Uncertainties in the monthly mean values of temperature and salinity (Fig. 2) were chosen in a consistent way using interpolation errors of the Levitus data for the North Pacific Ocean. In estimating the vertical distribution of the annual mean errors, we assumed that the errors of the monthly means are uncorrelated with each other.

#### 2) Air–sea fluxes

Air–sea fluxes of heat and freshwater were extracted from the National Centers for Environmental Prediction–National Center for Atmospheric Research (NCEP–NCAR) reanalysis made available through a Web site (http://www.cdc.noaa.gov). Hellerman and Rosenstein (1983) wind stresses were used as the basic wind forcing dataset. Alternative datasets used in the present study are the Comprehensive Ocean–Atmosphere Dataset (COADS) heat and freshwater fluxes (Da Silva et al. 1995) and the NCEP–NCAR winds. They were primarily used for sensitivity studies and error estimates.

The surface fluxes of heat (*Q*_{θ}) and salt (*Q*_{S}) were obtained from the monthly mean data using […], where *ρ*_{0} = 1025.5 kg m^{−3} is the mean density of seawater; *C*_{p} is the seawater heat capacity computed for the monthly mean distributions of temperature *T** and salinity *S** at the ocean surface; and *Q**, *P**, and *E** are the net surface heat flux, precipitation, and evaporation rates, respectively.
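As a rough illustration of how such surface data are commonly turned into kinematic fluxes, the sketch below assumes the standard conversions *Q*_{θ} = *Q**/(*ρ*_{0}*C*_{p}) and *Q*_{S} = *S**(*E** − *P**). These forms, the constant heat capacity, and the function names are assumptions made for the example and are not taken from the text, which computes *C*_{p} from the surface *T** and *S**.

```python
RHO0 = 1025.5        # mean seawater density, kg m^-3 (value quoted in the text)
CP_ASSUMED = 3990.0  # assumed constant heat capacity, J kg^-1 K^-1 (hypothetical simplification)
SECONDS_PER_YEAR = 365.25 * 86400.0


def kinematic_heat_flux(q_net, cp=CP_ASSUMED, rho0=RHO0):
    """Convert a net surface heat flux (W m^-2) to a temperature flux (K m s^-1)."""
    return q_net / (rho0 * cp)


def kinematic_salt_flux(s_surf, evap, precip):
    """Convert surface salinity (psu) and E, P rates (m s^-1) to a salt flux (psu m s^-1)."""
    return s_surf * (evap - precip)


# Prior flux errors quoted in the text (35 W m^-2 and 42 cm yr^-1),
# expressed in kinematic units under the assumptions above.
heat_err = kinematic_heat_flux(35.0)
salt_err = kinematic_salt_flux(35.0, 0.42 / SECONDS_PER_YEAR, 0.0)
```

Under these assumptions both error levels can be compared directly with the tendencies of temperature and salinity in the uppermost model layer.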

Taking into account a relatively low accuracy of the climatological estimates of *Q*_{θ} and *Q*_{S}, we assumed horizontal homogeneity of the corresponding errors. The latter were derived from the direct comparison of the COADS and NCEP monthly mean climatologies. The estimated errors are 35 W m^{−2} for the heat and 42 cm yr^{−1} for the freshwater fluxes. As noted by Marotzke and Wunsch (1993), the errors in the air–sea flux estimates could be even larger. For instance, the original COADS heat flux is biased by as much as 30 W m^{−2} with respect to the “fine-tuned” estimate, which is additionally constrained by requiring the global mean *Q** to be zero. It is necessary to note, however, that, despite its large absolute value, the heat flux error is relatively small compared to the seasonal signal, which has a typical amplitude of 400 W m^{−2} in the KE region (e.g., Yan et al. 1995).

Uncertainties in wind stress *τ* arise from errors in the wind observations and uncertainties in the nonlinear drag law used to convert wind into stress. Our estimate of the standard error (*σ*_{τ}) of the monthly means is 0.29 dyn cm^{−2}. That value is only half the intra-annual amplitude of the wind stress observed in the region (Fig. 3). At the same time it is consistent with the typical scatter of the wind stress climatologies over the Pacific (Rienecker et al. 1996; Lebedev and Yaremchuk 2000) and the wind stress errors used in other inversions of climatological data (e.g., Lee and Marotzke 1997).

#### 3) Sea level

Sea level measured by the TOPEX/Poseidon altimeter was obtained from the University of Texas at Austin Center for Space Research through the Web site ftp://ftp.csr.utexas.edu. The TOPEX/Poseidon database includes a mean dynamic topography and 10-day sea-level anomaly fields about this mean, available for the 7-yr period (1993–99). We computed monthly mean sea-level fields by averaging the 10-day fields for a given month and then adding the mean component, which was obtained by assimilating climatological data (including TOPEX/Poseidon altimetry) into a steady-state model of the North Pacific Ocean (Yaremchuk 2001).

Sea level changes at a fixed location in SNP are dominated by the seasonal cycle, which can be explained in large part by changes in dynamic height relative to 1000 m. If we compute seasonal changes in density over the top 1000 m from the Levitus data, the seasonal signal of the associated dynamic height is very close to that (11 cm) observed by the TOPEX/Poseidon altimeter (Fig. 3).
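The comparison described above can be sketched with the usual steric approximation, in which the height anomaly is η′ = −(1/*ρ*_{0}) ∫ *ρ*′ d*z* over the top 1000 m. This formula, the uniform test profile, and the function name are assumptions for illustration; the text does not state how the dynamic height was computed.

```python
import numpy as np


def steric_height_anomaly(rho_anom, z, rho0=1025.5):
    """Steric height anomaly (m) from a density anomaly profile (kg m^-3).

    z is the vertical coordinate in metres, positive upward (0 at the surface,
    -1000 at the reference level). Trapezoidal integration is used.
    """
    z = np.asarray(z, dtype=float)
    rho_anom = np.asarray(rho_anom, dtype=float)
    order = np.argsort(z)                      # integrate from the deepest level upward
    z, rho_anom = z[order], rho_anom[order]
    integral = np.sum(0.5 * (rho_anom[1:] + rho_anom[:-1]) * np.diff(z))
    return -integral / rho0


# Hypothetical example: a uniform lightening of 0.1 kg m^-3 over the top 1000 m.
z_levels = np.linspace(-1000.0, 0.0, 41)
eta = steric_height_anomaly(np.full(z_levels.shape, -0.1), z_levels)
```

A density decrease of this size raises the sea level by roughly 10 cm, which is the order of magnitude of the seasonal signal quoted in the text.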

We have treated separately the standard errors of the mean surface topography and the variability about this mean. The major uncertainty in the mean topography is the JGM-3 geoid model, which introduces errors ranging from 17 to 22 cm. Standard errors of the annual cycle and higher harmonics are estimated to be 4 cm based on information gathered from the available documentation (see also Tapley et al. 1994).

#### 4) Drifter trajectories

The trajectories of 1183 surface drifters that moved through the model domain between January 1987 and December 1998 were obtained from the Web site http://meds-sdmm.dfo-mpo.gc.ca. Most of the drifters were drogued between 5 and 15 m and therefore sample the velocity in the surface Ekman layer. The total number of position fixes was 746 802, the number of drifter-years was 511.2, and the average time spent by drifters in a typical model grid cell was 81 days. Differences in consecutive positions were used to estimate velocity. Outliers were defined by speeds exceeding 200 cm s^{−1} and they were discarded prior to the calculation of monthly mean velocities for each model grid cell.
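A minimal sketch of the velocity estimate described above: velocities are taken as differences of consecutive position fixes, and fixes implying speeds above 200 cm s^{−1} are discarded. The spherical-Earth conversion, the input layout, and the function names are assumptions for the example rather than a description of the processing actually used.

```python
import numpy as np

EARTH_RADIUS_CM = 6.371e8   # assumed spherical Earth radius, cm
MAX_SPEED = 200.0           # outlier threshold from the text, cm s^-1


def drifter_velocities(lon, lat, t):
    """Velocities (cm s^-1) between consecutive fixes.

    lon, lat are in degrees and t is in seconds for a single drifter.
    """
    lon, lat, t = (np.asarray(a, dtype=float) for a in (lon, lat, t))
    dt = np.diff(t)
    lat_mid = np.deg2rad(0.5 * (lat[1:] + lat[:-1]))
    dx = EARTH_RADIUS_CM * np.cos(lat_mid) * np.deg2rad(np.diff(lon))
    dy = EARTH_RADIUS_CM * np.deg2rad(np.diff(lat))
    return dx / dt, dy / dt


def remove_outliers(u, v, max_speed=MAX_SPEED):
    """Keep only velocity estimates whose speed does not exceed max_speed."""
    keep = np.hypot(u, v) <= max_speed
    return u[keep], v[keep]


# Hypothetical track: one realistic daily displacement, then a spurious jump.
u, v = drifter_velocities([150.0, 150.1, 150.1], [35.0, 35.0, 35.1], [0.0, 86400.0, 86500.0])
u_clean, v_clean = remove_outliers(u, v)
```

Monthly means for each grid cell would then be formed from the retained estimates only.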

To estimate standard error of the monthly mean drifter velocities for each cell we assumed (i) individual velocities of different drifters were statistically independent and (ii) their standard errors scaled with the time between the pair of fixes used to calculate the velocity. The standard error for a given cell and month was found to vary with the number of drifter days, *t*, according to 10 cm s^{−1} × *t*_{0}/*t*, where *t*_{0} ∼ 75 days.

### b. Dynamically constrained data analysis

An updated version of the VDA scheme of Yaremchuk et al. (2001) was used for the analysis. The principle of VDA is to minimize model–data misfit by varying the set of free model parameters (control variables) so that the optimal model solution passes as close to the data as possible. Consequently, the three major components of a VDA algorithm are (i) the numerical model, (ii) the control variables, and (iii) the cost function, which measures the magnitude of the model–data misfit.

#### 1) Model

The numerical model is based on the conventional set of primitive equations under the Boussinesq and hydrostatic approximations. Because the timescales present in data are larger than 20 days, a simplified horizontal momentum balance (geostrophy plus friction) is used. Compared to the formulation of Yaremchuk et al. (2001), the two major changes of the model are (i) splitting of the buoyancy conservation into two separate equations for potential temperature *θ* and salinity *S,* constrained by the seawater equation of state in the form proposed by Ishizaki (1994), and (ii) KPP parameterization of the vertical mixing processes. Appropriate changes in the adjoint code have been made. Because of the limited differentiability of the KPP scheme, various approximations to the KPP tangent linear and adjoint codes have been tested. This point is detailed in the appendix.

#### 2) Control variables

The control variables are the initial conditions *θ*^{0}(*x*, *y*, *z*), *S*^{0}(*x*, *y*, *z*); their boundary values *θ*_{l}, *S*_{l}, *θ*_{b}, *S*_{b} at the lateral and lower sides of the domain, respectively; the sea surface fluxes *Q*_{θ}(*x*, *y*, *t*), *Q*_{S}(*x*, *y*, *t*), *τ*(*x*, *y*, *t*); and the sea surface height *ζ*(*x*, *y*, *t*) at every time step. Given the grid dimensions (Figs. 1, 2) with 164 model time steps per year, the total number *D* of the parameters to be adjusted is 3 684 180. That greatly exceeds the number of data points *N* ∼ 10^{6}. Redundancy in the temporal resolution (compared to the data) of the boundary control fields is removed through their Fourier expansion in time,

$$\phi(t) = \sum_{k=1}^{L} a^{k}_{\phi}\, q_{k}(t), \qquad (1)$$

where *ϕ* stands for either *θ*_{l}, *S*_{l}, *θ*_{b}, *S*_{b}, *ζ*, *τ*, *Q*_{θ}, or *Q*_{S}. The first Fourier component, *q*_{1}, is constant through time and its amplitude, *a*^{1}_{ϕ}, defines the annual mean. The annual harmonic is described by *q*_{2}(*t*) = sin(2*πt*/*T*) and *q*_{3}(*t*) = cos(2*πt*/*T*) (*T* = 1 yr) and their amplitudes *a*^{2}_{ϕ}, *a*^{3}_{ϕ}; components with *k* > 3 define higher-order harmonics of the annual cycle. With that transformation of the control space the new boundary control variables are the Fourier amplitudes *a*^{k}_{ϕ}. Most experiments were performed with *L* = 3 harmonics (*D* = 183 235), although certain experiments were done at higher temporal resolutions.
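To make the truncated Fourier representation concrete, the sketch below builds the basis *q*_{k}(*t*), fits the amplitudes to twelve monthly means by least squares, and measures the rms truncation error. The ordering of the harmonics beyond *q*_{3}, the least-squares fit, and the synthetic monthly series are assumptions for the example.

```python
import numpy as np


def fourier_basis(t, n_harm, period=1.0):
    """Columns q_1 = 1, q_2 = sin(2*pi*t/T), q_3 = cos(2*pi*t/T), q_4 = sin(4*pi*t/T), ...

    The ordering beyond q_3 is an assumption; the text only specifies q_1..q_3.
    """
    t = np.asarray(t, dtype=float)
    cols = [np.ones_like(t)]
    m = 1
    while len(cols) < n_harm:
        cols.append(np.sin(2.0 * np.pi * m * t / period))
        if len(cols) < n_harm:
            cols.append(np.cos(2.0 * np.pi * m * t / period))
        m += 1
    return np.column_stack(cols)


def fit_amplitudes(t, values, n_harm=3):
    """Least-squares amplitudes a^k of the truncated expansion."""
    amplitudes, *_ = np.linalg.lstsq(fourier_basis(t, n_harm), values, rcond=None)
    return amplitudes


def reconstruct(t, amplitudes):
    """Evaluate the truncated expansion at times t (in years)."""
    return fourier_basis(t, len(amplitudes)) @ amplitudes


# Synthetic monthly means (mid-month times in years); purely illustrative.
t_month = (np.arange(12) + 0.5) / 12.0
series = 2.0 + 3.0 * np.sin(2 * np.pi * t_month) - np.cos(2 * np.pi * t_month)
a = fit_amplitudes(t_month, series, n_harm=3)
rms_error = np.sqrt(np.mean((series - reconstruct(t_month, a)) ** 2))
```

With *L* = 3 each boundary control field is described by three numbers per grid point instead of one value per time step, which is where the reduction of the control space comes from.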

#### 3) Cost function

The quadratic cost function *J*_{d} penalizes squared differences between the above-described data and their model counterparts. Weighting of the temperature and salinity misfits is horizontally homogeneous and depends upon depth as shown in Fig. 2. Surface flux weights are inversely proportional to the squared errors of the corresponding data. Sea surface topography contributes to the cost function through two terms, as it is separated into the annual mean *ζ̄* and time-varying *ζ*′ fields, which have different prior errors [section 2a(3)]. Seasonality of the solution is ensured by penalizing the differences between the initial conditions and the model temperature and salinity fields after one year of integration. The second group of terms (the so-called regularization part of the cost function) […] *a*^{k}_{ζ}, *a*^{k}_{τ}, *a*^{k}_{θ}, *a*^{k}_{S} […].

The cost function was subject to minimization under the constraints imposed by the model equations. The adjoint of the tangent linear dynamical system was used to compute the cost function gradient for the quasi-Newtonian descent algorithm of Gilbert and Lemarechal (1989). A typical optimization run required several hundred iterations to reduce the gradient norm by a factor of 1000.
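The role of the adjoint can be illustrated on a toy problem. The sketch below uses a scalar linear model *x*_{n+1} = *a x*_{n} with the initial condition as the only control variable, computes the cost-function gradient by a backward (adjoint) sweep, and minimizes with a secant iteration on the gradient, which is the one-dimensional analogue of a quasi-Newton step. The toy model, the variable names, and the minimizer are assumptions for illustration and are not the scheme of Gilbert and Lemarechal (1989) or the model used in this study.

```python
import numpy as np


def forward(x0, a, n_steps):
    """Integrate the toy model x_{n+1} = a * x_n from the control x0."""
    x = np.empty(n_steps + 1)
    x[0] = x0
    for n in range(n_steps):
        x[n + 1] = a * x[n]
    return x


def cost_and_gradient(x0, a, data, sigma):
    """Quadratic model-data misfit and its gradient with respect to x0."""
    n_steps = len(data) - 1
    x = forward(x0, a, n_steps)
    misfit = (x - data) / sigma**2
    cost = 0.5 * np.sum((x - data) ** 2) / sigma**2
    lam = misfit[-1]                       # adjoint variable at the final time
    for n in range(n_steps - 1, -1, -1):   # backward sweep of the adjoint model
        lam = a * lam + misfit[n]
    return cost, lam                       # lam at n = 0 equals dJ/dx0


def minimize_secant(x_start, a, data, sigma, tol=1e-10, max_iter=50):
    """Secant iteration on the gradient (a one-dimensional quasi-Newton method)."""
    x_prev, x_cur = x_start, x_start + 1.0
    _, g_prev = cost_and_gradient(x_prev, a, data, sigma)
    for _ in range(max_iter):
        _, g_cur = cost_and_gradient(x_cur, a, data, sigma)
        if abs(g_cur) < tol or g_cur == g_prev:
            break
        x_next = x_cur - g_cur * (x_cur - x_prev) / (g_cur - g_prev)
        x_prev, g_prev, x_cur = x_cur, g_cur, x_next
    return x_cur


# Synthetic, error-free "observations" generated from a known initial state.
obs = forward(2.0, 0.9, 10)
x_opt = minimize_secant(0.0, 0.9, obs, sigma=0.5)
```

One backward sweep yields the full gradient at roughly the cost of one forward run, which is what makes gradient-based minimization feasible when the control vector has 10^{5} to 10^{6} components.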

## 3. Optimization experiments

Since the optimization problem considered is nonlinear, the cost function […] *L*, horizontal diffusion coefficients, regularization weights, and mixing parameterization.

The first guess solution is basically defined by the initial values of the control fields *θ*^{0}, *S*^{0}, *a*^{k}_{θl}, *a*^{k}_{Sl}, *a*^{k}_{θb}, *a*^{k}_{Sb}, *a*^{k}_{τ}, *a*^{k}_{ζ}, *a*^{k}_{Qθ}, *a*^{k}_{QS}. These are obtained from the data either directly (*θ*^{0}, *S*^{0}) or through the expansions (1) of the data. These expansions are characterized by the errors *ϵ*, which have the meaning of rms deviation of the Fourier approximation from the monthly mean data. As seen in Table 1, temporal variability is dominated by the seasonal harmonic (*L* = 3), which contributes 60%–80% to the overall intra-annual variability exhibited by the monthly means. Taking larger *L* does not result in significant improvement of the model–data misfit after optimization (last column in Table 1). The optimized states with *L* > 3 demonstrate much higher spatial and temporal variability and do not have any objective advantages compared to the solutions with *L* = 3. Therefore we took *L* = 3 for further experiments.

Another essential parameter of the optimization algorithm is the starting time of integration. Since the major portion of the control space is occupied by initial conditions, the problem's controllability strongly depends on the structure of

Horizontal diffusion coefficients did not vary in space and time and were not controlled by the data. In a series of experiments we changed these coefficients in the range of 200–5000 m^{2} s^{−1}. Computations have shown low sensitivity of the optimal state to variations within the range of 200–1000 m^{2} s^{−1}. At larger values the optimized model–data misfit started to increase. In subsequent experiments we set the values of the diffusion coefficients to 1000 m^{2} s^{−1}.

Regarding the regularization terms, the optimal state was found to be weakly sensitive to variations of their weights within an order of magnitude. This can be explained by the high degree of smoothness already present in the data. The only exception was the weight of the total-transport smoothness term. Its magnitude was tuned to obtain a compromise between unrealistic grid-scale features in the optimal transport pattern and its oversmoothing.

Finally we made several additional experiments with numerical mixing instead of KPP parameterization. Our observation was that the mixed layer depth was slightly shallower for the numerical mixing scheme, especially in winter with the maximum values of 200–220 m (against 230–250 m for KPP). This can be partly explained by the fact that wind forcing directly affects the MLD value in the KPP scheme and winds in winter are generally stronger. KPP also provided a better fit to the data; therefore, it was chosen as a parameterization scheme for the main optimization run.

To validate the overall structure of the VDA scheme we performed a series of experiments by varying the data fields within their error bars. In an ideal situation the result should not depend upon these variations if the prior statistics are correct. As an example, we show similarity of the VDA results obtained with NCEP and COADS surface fluxes (Fig. 4). Similar patterns were observed with other datasets and showed relatively weak dependence of the assimilated solution on the utilized forcing fields. In total, more than a hundred optimization experiments were performed. Results of these computations were also used for error estimation. Such an approach may seem more subjective than the one employing the inversion of the Hessian matrix (Yaremchuk et al. 2001). On the other hand, the cost function and the numerical model are also of subjective nature. By averaging over an ensemble of the “optimized states” we can go beyond the restrictions of Gaussianity and certain model dynamics, providing the reader with an independent view of the stability of the results. For error estimation we have used the ensemble of 2 × 2 × 3 × 3 = 36 optimal states generated by variation of the mixing scheme (the two versions mentioned above), surface forcing (NCEP versus COADS climatologies), the number of harmonics (*L* = 3, 5, 7), and diffusion coefficients (500, 1000, 2000 m^{2} s^{−1}). The optimal state analyzed below was obtained with NCEP climatology, KPP mixing, *L* = 3, and the horizontal diffusion of 1000 m^{2} s^{−1}.
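The ensemble layout described above can be enumerated directly. The sketch below lists the 36 configurations and shows one way an ensemble mean and spread could be formed from a diagnostic evaluated on each optimized state; the labels, the use of the sample standard deviation, and the helper names are assumptions for the example.

```python
from itertools import product

import numpy as np

MIXING = ("KPP", "numerical")
FORCING = ("NCEP", "COADS")
HARMONICS = (3, 5, 7)
DIFFUSION = (500.0, 1000.0, 2000.0)   # m^2 s^-1


def ensemble_configs():
    """All combinations of mixing scheme, forcing, number of harmonics, and diffusion."""
    return list(product(MIXING, FORCING, HARMONICS, DIFFUSION))


def ensemble_stats(values):
    """Ensemble mean and sample standard deviation along the member axis."""
    values = np.asarray(values, dtype=float)
    return values.mean(axis=0), values.std(axis=0, ddof=1)


configs = ensemble_configs()
reference = ("KPP", "NCEP", 3, 1000.0)   # the state analyzed in the text
```

Each configuration would correspond to one full optimization run; a quantity quoted as value ± error would then be the reference-state value together with the ensemble spread.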

## 4. The optimized state

### a. Consistency with observations

The result of the basic optimization demonstrates a significant decrease of the model–data misfit *J*_{d} with respect to all data components. Figure 2 shows rms deviations of the potential temperature and salinity from Levitus climatology as a function of depth. Solid curves provide the prior error level of Levitus climatology. It can be seen that the optimized *θ*/*S* fields lie well within the prior error bars. The largest model–data misfits occur at the depth range of 40–60 m, where the lower boundary of the mixed layer is located on average. A secondary maximum is near the depth of 170 m, which represents the average MLD during winter convection in the northwestern part of the domain. These features of the error field can be explained by the fact that the optimal model solution is characterized by a much sharper and spatially variable seasonal thermocline throughout the year than that present in the Levitus dataset. Nevertheless, even at these “critical” depths the error level is still acceptable in the sense of prior statistics.

Improvement of the ocean state evolution after optimization is clearly seen if we compare the results of the free model run initialized by climatological data (first guess solution) with the result of optimization (Fig. 5). Heat content in the upper layers of the ocean tends to diverge from the Levitus data as a result of inconsistency between the surface forcing and hydrology within the framework of model dynamics. Optimization of the free model parameters demonstrates, however, that this inconsistency can be removed at the expense of slight (in the sense of data statistics) variations of the control fields. Errors in the surface data specified in the cost function and rms deviations of the corresponding optimized surface controls from these data are shown in Table 2. The results indicate that the model is consistent with observations.

### b. Heat and salt budget

#### 1) Integral balances

Here (**u**, *w*) is the 3D velocity vector, *A*_{θ} and *K*_{θ} are the horizontal and vertical diffusivity coefficients, ∂Ω stands for the lateral boundary of Ω, and **s** is the horizontal unit vector normal to that boundary. Similar relationships also hold for salinity. Volume fluxes are balanced through the continuity constraint adv(1) = 0.

The seasonal cycle of the advective and diffusive fluxes of heat and salt is shown in Fig. 6. The diffusive flux divergence is dominated by atmospheric forcing [second term in the rhs of (4)]: horizontal divergence of the diffusive fluxes is approximately 10 times smaller than the vertical divergence, which is entirely defined by the atmospheric flux, whereas the diffusive flux at 1000 m is negligible. On the annual mean, horizontal diffusion counterbalances only about 15%–20% of the atmospheric fluxes (0.06 PW of 0.38 PW for heat and 0.9 kton s^{−1} of 4.4 kton s^{−1} for salt). The major balance of heat and salt holds between the temporal derivative, atmospheric forcing and advection. As seen from Fig. 6, advective response to atmospheric forcing is shifted by 4–4.5 months for both temperature and salinity. While surface cooling and *E* − *P* culminate in December, attaining the values of 1.75 PW and 12 kton s^{−1}, divergence of the advective fluxes reaches the extrema in April–May. This phenomenon can be qualitatively explained in terms of the advective timescale associated with the transport of seasonal anomalies by the KE flow field.

In late spring the optimal state exhibits an intensification of the subtropical gyre, as seen in the mass balance (Fig. 7). The inflow from the west reaches its maximum of 36.6 Sv (Sv ≡ 10^{6} m^{3} s^{−1}) in April–May and minimum in late November. Note that the zonal velocity integral over the western boundary has contributions both from the KE and the westward flowing southern rim of the recirculation gyre, whose center is identified by the zero depth-integrated velocity located at 28°–29°N at 145°E. When integrated north of 30°N, the total KE transport is 68 ± 7 Sv with a seasonal variation of 11 ± 4 Sv. The corresponding sea surface height (SSH) difference across the KE varies seasonally from 141 to 158 cm with the maximum in May. This is consistent with the estimates of Hwang (1996) who analyzed altimetry and found indications of a larger SSH difference at 140°E in spring and lower in winter with an amplitude of approximately 15 cm. In their ⅓° model of the Kuroshio region Kagimoto and Yamagata (1997) found a seasonal variation of 8 Sv with a maximum in March. This estimate was made, however, for a total depth integrated transport at 134°E.

The seasonal cycle of the volume transport for the model domain can be characterized as follows. In August–September the western inflow of 30–33 Sv is practically in balance with outflow through the northern boundary. In February–April the outflow to the north decreases to 10–15 Sv owing to the reduction of the zonal pressure difference associated with the intense cooling in the west. At the same time the western inflow gains considerable compensation from the outflows through the eastern (12–15 Sv) and southern (5–6 Sv) boundaries.

#### 2) Mixed layer balances

The mixed layer depth *h* is defined by the KPP algorithm (see appendix) and its evolution is a part of the numerical model equations. Using angular brackets for the MLD-averaging operator, we can rewrite the integral balances for a tracer *ϕ* in the following symbolic form:

$$\partial_{t}\langle\phi\rangle + \mathrm{hadv}_{\phi} + \mathrm{hdif}_{\phi} + \mathrm{vdif}_{\phi} + \mathrm{entr}_{\phi} = 0.$$

In the analysis below we consider contributions from these five constituents of the integral balance, namely temporal derivative *d*/*dt* = ∂_{t}〈*ϕ*〉, horizontal advection hadv_{ϕ} = *h*〈**u**〉 · ∇〈*ϕ*〉, horizontal diffusion hdif_{ϕ} = −*A*_{ϕ}Δ〈*ϕ*〉, atmospheric forcing/vertical diffusion vdif_{ϕ} = −*h*^{−1}(*Q*_{ϕ} − *K*_{ϕ}∂_{z}*ϕ*|_{h}), and the entrainment term entr_{ϕ} = 〈*w*∂_{z}*ϕ*〉 + *D̂*〈*ϕ*〉 − 〈*D̂ϕ*〉, which accounts for the effects caused by MLD variability in space/time and vertical advection. In the above expressions *ϕ* stands for either potential temperature or salinity, *D̂* ≡ ∂_{t} + **u** · ∇ − *A*_{ϕ}Δ, where ∇ and Δ are the horizontal gradient and Laplacian operators. Note that the integral of the KPP nonlocal transport term ∂_{z}*G*_{ϕ} (see appendix) vanishes because *G*_{ϕ}(0) = *G*_{ϕ}(*h*) = 0 through its definition given by the KPP scheme.

Results of the ML temperature and salinity budget computations are summarized in Table 3. We have found that the major balance in the annual mean is between horizontal advection, vertical entrainment, and atmospheric forcing (vertical diffusion at the bottom of the mixed layer is much smaller than the diffusive flux through the ocean surface, so that vdif_{ϕ} = −*Q*_{ϕ}/*h* within the accuracy of a few percent). Horizontal diffusion of temperature and salinity also plays a minor role in the budget (rows 9 and 10). It is interesting that the net effect of atmospheric forcing on the ML temperature is negative (line 9, column 6); that is, on the annual average the atmosphere tends to warm the ML. That “contradiction” with the annual mean cooling of the domain arises due to the changes in *h* standing in the denominator of the MLD-averaging operator. In late spring [April–June (AMJ)] and summer [July–September (JAS)] when major warming occurs, the magnitude of *h* is small, and the spring/summer values of vdif_{θ} outweigh their fall/winter [October–December (OND)] values in the annual average. Qiu and Kelly (1993) observed this effect in their analysis of European Centre for Medium-Range Weather Forecasts (ECMWF) and *Geosat* data. On the annual mean, horizontal advection of temperature tends to cool the ML at the average rate of 0.3 ± 0.2°C yr^{−1} (column 3, line 9). Entrainment processes dominate over horizontal advection on the intra-annual scales. They tend to cool and freshen the upper mixed layer seasonally except for spring, when the ML experiences a sharp retreat accompanied by quick warming.
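The apparent contradiction between net surface cooling and net mixed-layer warming can be reproduced with a few numbers. The sketch below uses the approximation vdif_{θ} = −*Q*_{θ}/*h* from the text with hypothetical seasonal values of the net heat flux and mixed layer depth; the numbers, the constant heat capacity, and the function name are illustrative assumptions, not results of the study.

```python
import numpy as np

RHO0 = 1025.5        # kg m^-3, value quoted in the text
CP_ASSUMED = 3990.0  # J kg^-1 K^-1, assumed constant


def vdif_theta(q_net, h):
    """Atmospheric forcing term -Q/(rho0 * Cp * h) in K s^-1 (negative means ML warming)."""
    return -np.asarray(q_net, dtype=float) / (RHO0 * CP_ASSUMED * np.asarray(h, dtype=float))


# Hypothetical seasonal means for JFM, AMJ, JAS, OND.
q_net = np.array([-200.0, 100.0, 120.0, -150.0])   # W m^-2, positive into the ocean
mld = np.array([200.0, 40.0, 25.0, 80.0])          # m

annual_mean_flux = q_net.mean()                    # negative: the ocean loses heat
annual_mean_vdif = vdif_theta(q_net, mld).mean()   # negative: the ML is warmed on average
```

The summer heating acts on a layer several times thinner than the winter one, so it dominates the annual average of *Q*/*h* even though the annual mean of *Q* itself is a heat loss.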

It is also instructive to analyze temperature and salinity balances of the upper mixed layer in the thermodynamically active region of the Kuroshio Extension. Figure 8 shows their evolution with time in the subdomain located within 30°–40°N, 145°–175°E. Here we see again that horizontal divergence of advective fluxes, vertical entrainment, and atmospheric forcing play the major role. Relative magnitudes of these terms in the annual mean temperature balance are distributed in an approximate proportion of 1:3:4. The atmosphere tends to warm the mixed layer in the KE region at the rate of 5.2 ± 3.2°C yr^{−1}, which is counterbalanced by entrainment and horizontal advection (−3.9 ± 2.2 and −1.4 ± 1.2°C yr^{−1}, respectively). Horizontal diffusion of temperature is negligible even on the annual mean and tends to warm the mixed layer at the rate of 0.04 ± 0.06°C yr^{−1}. Qiu and Kelly obtained a somewhat different result for approximately the same region. Their computations based upon a direct run of an empirical model of the upper mixed layer have shown that the mixed layer warming is compensated by diffusion, advection, and entrainment in approximately equal proportion. We attribute this difference to the much larger empirical eddy diffusivity coefficient used by Qiu and Kelly (8000 m^{2} s^{−1} at 1° × 0.5° resolution), which could also have a considerable effect on the magnitude of the entrainment term. Optimization experiments have shown, however, that such a large value of diffusion is rejected by the data. Its magnitude also appears to be 4–8 times higher than the typical values used in 1° simulations by ocean general circulation models.

The salinity budget (Fig. 8b) shows qualitatively the same pattern of intra-annual variation as the temperature budget. Quantitatively, in contrast to the temperature budget, horizontal advection of salinity hadv_{S} plays an equally important role with vertical entrainment entr_{S} in counterbalancing the atmospheric forcing. Relative magnitudes of these terms are distributed in an approximate proportion of hadv_{S} : entr_{S} : vdif_{S} = 1:1:2. Horizontal diffusion is much smaller throughout the entire annual cycle, but, being persistent in sign, it contributes approximately 9% to the annual mean salinity budget.

### c. Water masses

High vertical resolution and consistency of the model with both atmospheric and hydrographic data enabled us to make a more thorough analysis of the nonadiabatic processes occurring within and below the mixed layer. These processes govern, in particular, the formation and evolution of the major SNP water masses. The concept of water masses emerged from descriptive oceanography, and it is often difficult to define strictly their formation rates and evolution in terms of formal thermodynamics. In recent modeling studies (e.g., Marshall et al. 1999), water mass transformations are classified solely by density, whereas observational oceanographers in addition often use *θ,* *S,* potential vorticity, and various tracers for classification. We will use an intermediate method by combining a multivariate approach to water mass definitions with thermodynamical estimates derived from the optimal solution.

The optimized intra-annual evolution of the mixed layer depth is shown in Fig. 9. Two distinct centers of winter convection are identified at 33°N, 150°E and 38°N, 170°E in January–March (JFM). Their positions are consistent with the MLD maxima heuristically diagnosed from the Levitus climatology, but the maximum depths are somewhat larger, especially for the eastern site (218 m compared to 180 m obtained by Huang and Qiu 1994). Locations of the MLD maxima approximately correspond to formation sites of the STMW in the western part of the domain and CMW in the east. The optimized ocean state allows us to make a direct analysis of the formation and distribution processes which govern the dynamics of these water masses.

#### 1) Central Mode Water

CMW is defined as the water mass of low potential vorticity *q* < 2 × 10^{−10} m^{−1} s^{−1} around *σ*_{θ} = 26.2 with the core temperature ranging within 9°–13°C (e.g., Kawabe and Taira 1995; Hanawa and Talley 2001). Nakamura (1996) defined CMW by the temperature range of 8.5°–11.5°C, salinity range 34.1–34.35 psu, and *q* < 1.5 × 10^{−10} m^{−1} s^{−1}. In our study we define CMW as *q* < 1.9 × 10^{−10} m^{−1} s^{−1}, 9.7° < *θ* < 13°C, and 26.0 < *σ*_{θ} < 26.4. To better capture this water mass we displaced the computational grid (Fig. 1) 13° west and 5° north.

In the annual mean, CMW occupies a broad region between 160°E and 155°W (Fig. 10a). Its southern boundary reaches 25°N. From the north CMW is bounded by the Kuroshio bifurcation front at 41°N. The depth of CMW increases southward from 90–220 m at 41°N to 350–450 m at 27°N in correspondence with the deepening of *σ*_{θ} = 26.2. The annual mean volume of this water mass deduced from the optimized solution is estimated as *V*_{CMW} = 7.3 × 10^{5} ± 1.2 × 10^{5} km^{3}.

The major part of CMW is formed in February–March within the region bounded by 37°–40°N, 160°E–170°W. In February the MLD reaches its maximum value of 215 m near 36.5°N, 175°E. In March the maximum MLD increases to 240 m and the region of the deepest convection displaces westward to 171°E. Left panels in Fig. 9 demonstrate the formation process of CMW at 39.5°N. In winter the region of maximum MLD values is located almost 30° east of the region of the maximum cooling rate of the ocean surface. This is due to the strong advection of the warm Kuroshio waters from the west that prevents the development of deep convection. East of 160°E horizontal advection decreases and the MLD distribution is governed by the vertical position of CMW, which has been formed a year before. Around the date line the upper boundary of this “old” CMW reaches its shallowest value of 90 m (Fig. 9 and 10a), triggering intense convection that reaches depths of 220–240 m in February–March.

The horizontal structure of the CMW formation site (Figs. 11a,b) shows that the “center of action” is located at 38°N, 173°E. This is almost 20° west of its position diagnosed by Suga et al. (1997) who derived their estimates by comparing surface temperature with MLD in winter. We attribute this inconsistency to possible errors in the definition of MLD solely by the temperature difference. It is noteworthy that our optimized states have also shown the maximum MLD values at 170°–175°W, if the latter were defined as depth where the temperature decreases 1°C from the surface. In that respect our interpolation patterns do not contradict the results of Suga et al. (1997). The optimized MLD distribution as diagnosed by the KPP scheme has a CMW related maximum at 38°N, 172°E. This is the point where convection reaches the CMW body in January (right panels in Fig. 9). In February–March the contact area of convection with the deep CMW expands zonally, occupying the region between 160°E and the date line (yellow colors in Figs. 11a,b). In March–April it spreads eastward to 170°W and then CMW formation is stopped by spring warming.
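The temperature-difference definition of MLD mentioned above (the depth at which temperature falls 1°C below its surface value) can be sketched in Python as follows. This is an illustration only, not the diagnostic used in the paper: the function name, the linear interpolation between levels, and the sample profile are all assumptions.

```python
def mld_temperature_criterion(depths, temps, delta_t=1.0):
    """Depth (m) at which temperature first falls delta_t below the surface value.

    depths: increasing depths in metres, with depths[0] taken as the surface.
    temps:  temperatures (degC) at those depths.
    Returns None if the threshold is never reached within the profile.
    """
    target = temps[0] - delta_t
    for k in range(1, len(depths)):
        if temps[k] <= target:
            z0, z1 = depths[k - 1], depths[k]
            t0, t1 = temps[k - 1], temps[k]
            # Linear interpolation between the two bracketing levels.
            return z0 + (t0 - target) * (z1 - z0) / (t0 - t1)
    return None


# Hypothetical winter profile (not taken from the paper).
depths = [0.0, 50.0, 100.0, 150.0, 200.0, 300.0]
temps = [12.0, 12.0, 11.8, 10.8, 9.5, 8.0]
mld = mld_temperature_criterion(depths, temps)
print(f"MLD (1 degC criterion) = {mld:.0f} m")
```

A criterion of this kind uses temperature alone, so it can differ from a depth diagnosed by a mixing scheme such as KPP wherever salinity or shear contributes to the stratification.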

We estimated the total amount of CMW production in winter *δV* by taking the difference between the volume of CMW immediately after the downturn of winter convection and establishment of the seasonal thermocline at the beginning of April and CMW volume just before the first convective column touched the main body of CMW at the beginning of January. The optimized ocean state demonstrates that between April and January the main CMW body displaces 4°–5° eastward and 1.5°–2° southward, while its total volume contracts by *δV* = 9.7 ± 1.8 (×10^{4} km^{3})(3.1 ± 0.5 Sv) with the renewal time estimate of *V*/*δV* ∼ 7.5 ± 1.9 years. Another way to estimate the life span of the CMW water is to assess the mean travel time of the water parcels released at the formation site (Figs. 11a,b) until they leave the CMW body. These computations give the renewal time estimate of 6.9 ± 3.6 yr, which is consistent with the previous one.
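The renewal-time and production-rate figures above follow from simple arithmetic, which the Python sketch below reproduces for the CMW values quoted in the text. The function names are illustrative. The error propagation assumes independent errors in *V* and *δV*, and the conversion to sverdrups assumes a 365.25-day year. Neither assumption is stated in the paper.

```python
import math

SECONDS_PER_YEAR = 365.25 * 86400.0  # ASSUMPTION: Julian year
M3_PER_KM3 = 1.0e9


def production_rate_sv(delta_v_km3):
    """Convert an annual volume gain (km^3 per year) to sverdrups (1 Sv = 1e6 m^3/s)."""
    return delta_v_km3 * M3_PER_KM3 / SECONDS_PER_YEAR / 1.0e6


def renewal_time(volume, volume_err, gain, gain_err):
    """Renewal time V / dV (years) with first-order error propagation.

    ASSUMPTION: errors in V and dV are independent, so their relative
    errors add in quadrature.
    """
    tau = volume / gain
    relative_err = math.hypot(volume_err / volume, gain_err / gain)
    return tau, tau * relative_err


# CMW values quoted in the text (km^3).
tau, tau_err = renewal_time(7.3e5, 1.2e5, 9.7e4, 1.8e4)
rate = production_rate_sv(9.7e4)
print(f"CMW renewal time: {tau:.1f} +/- {tau_err:.1f} yr, production: {rate:.1f} Sv")
```

Under these assumptions the sketch gives about 7.5 ± 1.9 yr and 3.1 Sv, matching the values quoted in the text.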

#### 2) Subtropical Mode Water

STMW is defined as the water mass with the core temperature 15°–18°C, salinity 34.6–34.8 psu, and densities of 25.1–25.7 (e.g., Kawabe and Taira 1998). Taking into account definitions given by other authors, we specified STMW by the same temperature and density ranges with the potential vorticity constraint *q* < 1.8 × 10^{−10} m^{−1} s^{−1}.
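The CMW and STMW definitions adopted in this study amount to simple range tests on potential vorticity, potential temperature, and potential density. A minimal Python sketch of such a classification, and of a volume estimate obtained by summing grid-cell volumes, is given below. The class and function names and the sample cells are hypothetical, and treating the STMW bounds as strict inequalities is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WaterMass:
    name: str
    q_max: float        # upper bound on potential vorticity (m^-1 s^-1)
    theta_range: tuple  # potential temperature bounds (degC)
    sigma_range: tuple  # potential density (sigma_theta) bounds

    def contains(self, q, theta, sigma):
        """True if a parcel with these properties falls inside the definition."""
        return (q < self.q_max
                and self.theta_range[0] < theta < self.theta_range[1]
                and self.sigma_range[0] < sigma < self.sigma_range[1])


# Thresholds as stated in the text.
CMW = WaterMass("CMW", 1.9e-10, (9.7, 13.0), (26.0, 26.4))
STMW = WaterMass("STMW", 1.8e-10, (15.0, 18.0), (25.1, 25.7))


def water_mass_volume(mass, cells):
    """Sum cell volumes (km^3) over the cells that satisfy the definition.

    cells: iterable of (q, theta, sigma, volume_km3) tuples.
    """
    return sum(vol for q, theta, sigma, vol in cells
               if mass.contains(q, theta, sigma))


# Hypothetical grid cells, for illustration only.
cells = [
    (1.5e-10, 11.0, 26.2, 100.0),  # satisfies the CMW definition
    (2.5e-10, 11.0, 26.2, 100.0),  # potential vorticity too high
    (1.2e-10, 16.5, 25.4, 80.0),   # satisfies the STMW definition
]
print(water_mass_volume(CMW, cells), water_mass_volume(STMW, cells))
```

Because the volume is a sum over cells passing threshold tests, small changes in the thresholds can move many cells in or out of the water mass.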

The annual mean position of the STMW is shown in Fig. 10b. North of 30°N STMW consists of two distinct bodies separated by a layer of higher potential vorticity (1.8 × 10^{−10} < *q* < 2.0 × 10^{−10}) at the depth of approximately 200–220 m. South of 30°N these bodies join at 250 m and gradually descend to depths of 300–350 m in correspondence with the general flow of water at *σ*_{θ} = 25.4 (Figs. 11c,d). As seen from the same figure STMW formation occurs in February–April within the region of 31°–35°N, 146°–175°E. This result is consistent with an estimate of Suga and Hanawa (1990) who analyzed MLD distribution south of KE and placed the formation area of 15°–18°C STMW waters east of 145°E and north of 30°N, but did not define its eastern extent. Kawabe and Taira (1998), on the other hand, give some experimental evidence that at 165°E STMW continues to form even in May. Convective columns reach 190–220 m (Fig. 12), refreshing the “upper” body of STMW and leaving the lower body practically unaffected. Therefore we interpret the upper body as a source of STMW water whose volume at 31.5°N is subject to large seasonal variability (Fig. 12). To estimate the annual mean volume of STMW we used three sources of data: (i) the original Levitus climatology, (ii) a steady-state inversion of the WOCE climatology (Yaremchuk 2001), and (iii) the results of our optimization. The first two sources were necessary to assess the STMW volume west of 142°E, which is beyond the limits of our computational domain. Comparison of these estimates has shown their reasonable stability with a mean volume of 1.1 ± 0.2 (×10^{6} km^{3}). The estimates were also found to be fairly consistent with the diagnostic computations of Huang and Qiu (1994), who obtained the volume of 1.05 × 10^{6} km^{3} for the water mass in the same density range west of 180°. The seasonal gain *δV*_{STMW}, assessed from the optimal solution in the same manner as for CMW, is 1.08 ± 0.2 (×10^{5} km^{3}) (3.8 ± 0.6 Sv) and yields an estimate for the renewal time of 10 ± 3 yr.

The mean travel time of the water parcels within the STMW body, estimated from the steady state inversion of Yaremchuk (2001), is 10.7 ± 4.2 yr, confirming the above mentioned “direct” estimate. The estimated advection rate of the Lagrangian particles was also found to be consistent with the recent results of STMW observations at the Ogasawara site (27.5°N, 142°E) presented by Taneda et al. (2000).

Examination of the horizontal structure of the STMW formation site reveals an area of the most intense formation around 34°N, 148°E (Figs. 11c,d). The process starts in January around that location and then gradually spreads eastward with a small southern component. By the end of February the STMW formation site occupies an area of 0.5 × 10^{6} km^{2} in the region of 32°–35°N, 146°–166°E (red and yellow colors in Fig. 11). In March and April the STMW formation area expands by a factor of almost 2, reaching 175°E at 34°N and 30°N around 160°E. Analysis of the optimized states (for both NCEP and COADS data inversions) has shown that evolution of the STMW formation area is correlated with the kinematics of the surface flow. In January, accumulation of the negative buoyancy in surface layers becomes large enough to overcome the supply of warm water from the west and enables convection to reach the 18°C STMW thermostad at 34°N, 148°E. After that the process starts getting a positive feedback from the westerly advection, and the STMW formation area rapidly expands downstream (Figs. 11c,d) until atmospheric forcing gets weaker in March–April.

#### 3) Intermediate water

The North Pacific Intermediate Water (NPIW) is defined as the water mass with a distinct salinity minimum residing at the density range of 26.7 < *σ*_{θ} < 27.0 with the density of its core layer increasing northward from 26.65 at 20°N to 26.85 at 40°N (e.g., Kawabe and Taira 1998). Isopycnals that deep do not outcrop within the model domain, and therefore NPIW formation cannot be studied directly. We can, however, estimate the mean flow rate of NPIW across the computational domain. Analysis of the acceleration potential maps for 26.65 < *σ*_{θ} < 26.85 and Lagrangian particle computations show that the mean southward velocity component of the NPIW water parcels is 0.52 ± 0.16 cm s^{−1}. This is consistent with the result of Watanabe et al. (1994), who estimated the southward spreading rate as 0.42 ± 0.08 cm s^{−1} from chlorofluorocarbon observations.
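One simple way to quantify the agreement between the two spreading-rate estimates is to compare their difference with their combined uncertainty, as in the Python sketch below. The function names are illustrative. Treating the two error bars as independent one-sigma uncertainties is an assumption, as is the 365.25-day year used for the unit conversion.

```python
import math

SECONDS_PER_YEAR = 365.25 * 86400.0  # ASSUMPTION: Julian year


def agree_within_errors(a, a_err, b, b_err, n_sigma=1.0):
    """Compare two estimates against their combined uncertainty.

    ASSUMPTION: the errors are independent, so they combine in quadrature.
    Returns (agrees, difference, combined_error).
    """
    difference = abs(a - b)
    combined = math.hypot(a_err, b_err)
    return difference <= n_sigma * combined, difference, combined


def cm_per_s_to_km_per_yr(speed_cm_s):
    """Convert a speed from cm/s to km/yr."""
    return speed_cm_s * 1.0e-5 * SECONDS_PER_YEAR


# Southward NPIW spreading rates (cm/s): this study and Watanabe et al. (1994).
agrees, difference, combined = agree_within_errors(0.52, 0.16, 0.42, 0.08)
print(f"difference = {difference:.2f} cm/s, combined error = {combined:.2f} cm/s, "
      f"agree within 1 sigma: {agrees}")
print(f"0.52 cm/s is about {cm_per_s_to_km_per_yr(0.52):.0f} km/yr")
```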

## 5. Discussion and conclusions

This study presents an attempt to estimate intra-annual variability of the North Pacific subtropical gyre by means of synthesis of climatological data with satellite altimetry and drifters in the framework of a numerical model with controlled boundaries. Special attention has been paid to the accurate representation of the upper mixed layer, which made it possible to analyze its dynamics and the processes of formation and evolution of the NP mode waters.

Analysis of the heat and salt budgets in the upper 1000 m of the ocean has shown that the major integral balance within the model domain is between atmospheric forcing and horizontal advection of these properties. In the annual mean, horizontal diffusion contributes less than 10% to the net heat and salt budgets. This type of balance also holds for the upper mixed layer. We attribute the relatively small role of horizontal diffusion to two factors. First, it is due to smoothed hydrological data, which were used to constrain the model. Second, we have used a relatively low diffusion coefficient (1000 m^{2} s^{−1}) in order to avoid dependence of the result on the eddy transport, which is not properly represented either in the model or in the data. Our results also suggest that assessment of the seasonal cycle of the ML temperature (salinity) budgets cannot be regarded as a robust procedure, because it is extremely sensitive to the MLD dynamics. Using an empirical ML model forced by ECMWF winds and *Geosat* altimetry, Qiu and Kelly (1993) also obtained a solution consistent with Levitus climatology but with a different ML temperature balance. To cope with this sensitivity problem we have conducted an ensemble of optimization runs by varying the parameters of the assimilation scheme. The errors (Table 3) show statistical consistency of the estimates with respect to the ensemble but do not rule out alternative scenarios of the ML heat and salt balances.

Assimilation experiments gave an indication of the seasonal changes in the structure of the upper 1000 m of the NP subtropical gyre. Its transport is diagnosed to have a maximum (79 ± 8 Sv) in June–July and minimum in December–January (56 ± 6 Sv). The annual mean estimate of 68 ± 7 Sv appears to be in good agreement with experimental evidence of Hall (1994), Clifford and Horton (1992), and Teague et al. (1994) who respectively analyzed long-term velocity records at 35°N, 152°E, Generalized Digital Environmental Model climatology, and a hydrographic section at 143°E. Intra-annual variation of the KE transport was a subject of direct modeling studies (e.g., Kagimoto and Yamagata 1997) but has never been analyzed through synthesis of a large variety of data as is done in the present work. Regarding the available observational results, Kawabe (1988) assessed the KE transport variability from tide gauge records and found some evidence of a larger KE transport in summer, while Hwang (1996) analyzed altimetry data and obtained an indication of a seasonal cycle at 140°E whose phase and amplitude are in qualitative agreement with our results. It is also noteworthy that the annual cycle of KE transport agrees in phase with the variation of the Gulf Stream extension in the Atlantic (Yaremchuk et al. 2001).

This study is the first attempt to estimate quantitatively the formation processes of the NP mode waters by combining a model with climatological data in a statistically consistent manner. We note, however, that the estimates remain somewhat uncertain partly because of the relative uncertainty in the definition of water masses. For instance, STMW would have been presented as a single body north of 31°N if we used a higher threshold value for potential vorticity (*q* < 2 × 10^{−10} m^{−1} s^{−1} instead of 1.8 × 10^{−10} m^{−1} s^{−1}) in its definition. Accordingly the STMW renewal time would have a higher value (12 yr instead of 10 yr). The geographical positions of the STMW and CMW formation sites (Fig. 11) were fairly stable throughout the assimilation experiments. Horizontal excursions of the corresponding “centers of action” did not exceed 1°–2°. Therefore, we assume that our computations provide a reasonable approximation of their positions as defined by the mean seasonal cycle in climatological data. In previous data analyses (e.g., Nakamura 1996; Suga et al. 1997) the places of formation of STMW and CMW were determined indirectly by comparing hydrological and atmospheric data, and the accuracy of these determinations did not exceed 5°–20° in longitude. A recent modeling study by Ladd and Thompson (2001) did not address that question possibly because of coarse resolution.

Results of optimization indicate that STMW and CMW are formed in January–April at the average rates of 3.8 ± 0.6 and 3.1 ± 0.5 Sv and have the typical renewal times of 10 ± 3 and 7.5 ± 2 yr, respectively. The estimates were obtained by two methods, indicating their consistency. Our values of ventilation rates are somewhat higher than those given by other authors, who relied on purely diagnostic analyses of hydrological data. For instance, Huang and Qiu (1994) analyzed Levitus hydrology and obtained 3.2 Sv for the ventilation rate of STMW, which is 20% lower than our value. That can be partly explained by certain underestimation of the MLD diagnosed from smoothed climatology, which in turn may cause underestimation of the fluxes in the ventilated thermocline. Our inversion may also suffer from underestimation of the mesoscale eddy activity, caused by smooth data. In that respect it is likely that the real mode water renewal times are somewhat lower than our estimates, especially for STMW, whose intra-annual evolution is much more affected by the erosive influence of the KE eddies.

In general, estimates of the renewal time are subject to even larger uncertainties than the estimates of production rates because variations in the water mass definitions often cause significant scatter in the assessment of their volume. As an example, Ladd and Thompson (2001) obtained a value of 10–14 yr for the CMW renewal time. The discrepancy with our estimate is caused by a large difference in the definition of the CMW, which resulted in a much larger volume (3.3 × 10^{6} km^{3}) leading to a longer renewal time. Estimates of the STMW volume obtained by Huang and Qiu (1994) and Ladd and Thompson (2001) are somewhat more stable [1.05 and 0.7 (×10^{6} km^{3})] and result in the renewal times of 10.1 and 5.5 yr, respectively.

The optimal values of the air–sea heat, freshwater, and momentum fluxes were found to be within the error bars of the corresponding data. In the annual mean the largest bias from climatology is observed for the heat flux to the atmosphere, which was found to be 10 W m^{−2} smaller than that of NCEP climatology. That was mostly due to the decrease of the optimal estimate of the heat flux north of 35°N in winter. Biases in the freshwater and momentum fluxes were not so significant and had the values of *δQ*_{S} = −1.3 cm yr^{−1} and {*δτ*_{x}, *δτ*_{y}} = {0.09, 0.01} dyn cm^{−2}, respectively. We have also observed that utilization of the COADS surface fluxes resulted in a slightly worse fit of the solution to the data.

The major restriction of the present analysis is the assumption that the NP subtropical gyre is in a quasi-steady cyclical state. Observations indicate, however, that in recent decades the KE system has undergone a number of dramatic changes, reflecting its strong variability at the ENSO and decadal time scales (e.g., Qiu 2000; Hanawa and Yoritaka 2001). Interannual variations of the currents and surface forcing may cause significant migration of the mode water formation sites and subsequently alter the pathways of its propagation. A more adequate approach to the problem of NP variability must involve assimilation of the real long-term measurements and integration periods of at least 10–15 years without the periodicity constraints. This technique does not separate seasonal and interannual scales, which may be strongly coupled via the water mass formation processes. On the other hand, results of such an assimilation might be much less reliable statistically because of inadequacy of the data density for controlling the ocean state via primitive equation dynamics. In that sense the presented inversion is in a more favorable position, at least from the viewpoint of formal statistics.

## Acknowledgments

This study was supported by NASA Grant NAG5-7485, NSF Grant OCE-9525986, and the Frontier Research System for Global Change through its funding of the International Pacific Research Center. The Center for Space Research of the University of Texas is acknowledged for providing the preprocessed TOPEX/Poseidon data.

## REFERENCES

Bingham, F. M., 1992: Formation and spreading of subtropical mode water in the North Pacific. *J. Geophys. Res.*, **97**, 11177–11189.

Clifford, M. A., and C. W. Horton, 1992: Baroclinic transport of the Kuroshio Extension between 141°E and 165°E. *J. Geophys. Res.*, **97**, 14445–14459.

Da Silva, A., C. C. Young, and S. Levitus, 1995: *Atlas of Surface Marine Data*. NOAA Atlas NESDIS 6, National Oceanographic Data Center, 299 pp.

Gilbert, J. Ch, and C. Lemarechal, 1989: Some numerical experiments with variable-storage quasi-Newton algorithms. *Math. Progr.*, **45**, 407–435.

Gu, D., and S. G. H. Philander, 1997: Interdecadal climate fluctuations that depend on exchange between Tropics and extratropics. *Science*, **275**, 805–807.

Hall, M. M., 1994: Velocity and transport structure of the Kuroshio Extension at 35°N, 143°E. *J. Geophys. Res.*, **99**, 7639–7650.

Hanawa, K., and H. Yoritaka, 2001: North Pacific Subtropical Mode Waters observed in long XBT sections along 32.5° line. *J. Oceanogr.*, **57**, 679–692.

Hanawa, K., and L. Talley, 2001: Mode waters. *Ocean Circulation and Climate*, G. Siedler et al., Eds., Academic Press, 373–386.

Hellerman, S., and M. Rosenstein, 1983: Normal monthly wind stress over the World Ocean with error estimates. *J. Phys. Oceanogr.*, **13**, 1093–1104.

Huang, R. X., and B. Qiu, 1994: Three-dimensional structure of the wind-driven circulation in the subtropical North Pacific. *J. Phys. Oceanogr.*, **24**, 1608–1622.

Hwang, C., 1996: A study of Kuroshio's seasonal variabilities using altimetric–gravimetric geoid and TOPEX-Poseidon altimeter data. *J. Geophys. Res.*, **101** (C3), 6313–6335.

Ishizaki, H., 1994: A simulation of the abyssal circulation in the North Pacific Ocean. *J. Phys. Oceanogr.*, **24**, 1941–1954.

Kagimoto, T., and T. Yamagata, 1997: Seasonal transport variations of the Kuroshio: An OGCM simulation. *J. Phys. Oceanogr.*, **27**, 403–418.

Kawabe, M., 1988: Variability of Kuroshio velocity assessed from the sea level difference between Naze and Nishinoomote. *J. Oceanogr. Soc. Japan*, **44**, 293–304.

Kawabe, M., and K. Taira, 1995: Flow distribution at 165°E in the Pacific Ocean. *Biogeochemical Processes and Ocean Flux in the Western Pacific*, H. Sakai and Y. Nozaki, Eds., Terra Scientific, 629–649.

Kawabe, M., and K. Taira, 1998: Water masses and properties at 165°E in the western Pacific. *J. Geophys. Res.*, **103**, 12941–12958.

Ladd, C., and L. Thompson, 2001: Water mass formation in an isopycnal layer model of the North Pacific. *J. Phys. Oceanogr.*, **31**, 1517–1537.

Large, W. G., J. C. McWilliams, and S. C. Doney, 1994: Oceanic vertical mixing: A review and a model with a nonlocal boundary layer parameterization. *Rev. Geophys.*, **32**, 363–403.

Lebedev, K., and M. Yaremchuk, 2000: A diagnostic study of the Indonesian throughflow. *J. Geophys. Res.*, **105**, 11243–11258.

Lee, T., and J. Marotzke, 1997: Inferring meridional mass and heat transports of the Indian Ocean by fitting a general circulation model to climatological data. *J. Geophys. Res.*, **102**, 10585–10602.

Lee, T., and J. Marotzke, 1998: Seasonal cycles of meridional overturning and heat transport of the Indian Ocean. *J. Phys. Oceanogr.*, **28**, 923–943.

Levitus, S., and T. P. Boyer, 1994: *Temperature*. Vol. 4, *World Ocean Atlas 1994*, NOAA Atlas NESDIS 4, 117 pp.

Levitus, S., and J. Antonov, 1994a: *World Ocean Atlas 1995*. NOAA Atlas NESDIS 5, 176 pp.

Levitus, S., R. Burgett, and T. P. Boyer, 1994b: *Salinity*. Vol. 3, *World Ocean Atlas 1994*, NOAA Atlas NESDIS 3, 99 pp.

Marotzke, J., and C. Wunsch, 1993: Finding a steady state of general circulation model through data assimilation: Application to the North Atlantic Ocean. *J. Geophys. Res.*, **98**, 20149–20167.

Marsh, R., A. J. G. Nurser, A. P. Megann, and A. New, 2000: Water mass transformation in a Southern Ocean of a global isopycnal coordinate GCM. *J. Phys. Oceanogr.*, **30**, 1013–1045.

Marshall, J., D. Jamous, and J. Nilsson, 1999: Reconciling thermodynamic and dynamic methods for computation of water-mass transformation rates. *Deep-Sea Res.*, **46A**, 545–572.

Nakamura, H., 1996: A pycnostad on the bottom of the ventilated portion of the central subtropical North Pacific: Its distribution and formation. *J. Oceanogr.*, **13**, 1847–1867.

Nurser, A. J. G., R. Marsh, and R. G. Williams, 1999: Diagnosing water mass formation from air–sea fluxes and surface mixing. *J. Phys. Oceanogr.*, **29**, 1468–1487.

Qiu, B., 2000: Interannual variability of the Kuroshio Extension system and its impact on the wintertime SST field. *J. Phys. Oceanogr.*, **30**, 1486–1502.

Qiu, B., and K. A. Kelly, 1993: Upper-ocean heat balance in the Kuroshio Extension region. *J. Phys. Oceanogr.*, **23**, 2027–2041.

Rienecker, M. M., R. Atlas, S. Schubert, and C. S. Willett, 1996: A comparison of surface wind products over the North Pacific. *J. Geophys. Res.*, **101** (C1), 1011–1023.

Schneider, N., A. J. Miller, M. A. Alexander, and C. Deser, 1999: Subduction of decadal North Pacific temperature anomalies: Observations and dynamics. *J. Phys. Oceanogr.*, **29**, 1056–1070.

Suga, T., and K. Hanawa, 1990: The mixed-layer climatology in the northwestern part of the North Pacific subtropical gyre and the formation of Subtropical Mode Water. *J. Mar. Res.*, **48**, 543–566.

Suga, T., Y. Takei, and K. Hanawa, 1997: Thermostad distribution in the North Pacific subtropical gyre: The Central Mode Water and Subtropical Mode Water. *J. Phys. Oceanogr.*, **27**, 140–152.

Taneda, T., T. Suga, and K. Hanawa, 2000: Subtropical mode water variation in the northwestern part of the North Pacific subtropical gyre. *J. Geophys. Res.*, **105**, 19591–19598.

Tapley, B. D., D. P. Chambers, C. K. Shum, R. J. Eans, and J. C. Ries, 1994: Accuracy assessment of the large-scale dynamic ocean topography from TOPEX/Poseidon altimetry. *J. Geophys. Res.*, **99**, 24605–24617.

Teague, W. J., A. M. Schiller, and Z. R. Hallock, 1994: Hydrographic section across the Kuroshio near 35°N, 143°E. *J. Geophys. Res.*, **99**, 7639–7650.

Watanabe, Y. W., K. Harada, and K. Ishikawa, 1994: Chlorofluorocarbons in the central North Pacific and southward spreading time of North Pacific intermediate water. *J. Geophys. Res.*, **99**, 25195–25213.

Wenzel, M., J. Schröter, and D. Olbers, 2001: The annual cycle of the global ocean circulation as determined by 4D variational data assimilation. *Progress in Oceanography*, Vol. 48, Pergamon, 73–119.

Yan, X-H., P. Niiler, S. K. Nadiga, R. H. Stewart, and D. R. Cayan, 1995: Seasonal heat storage in the North Pacific: 1976–1989. *J. Geophys. Res.*, **100**, 6899–6926.

Yaremchuk, M. I., 2001: A reconstruction of large scale circulation in the Pacific Ocean north of 10°N. *J. Geophys. Res.*, **106**, 2331–2344.

Yaremchuk, M. I., D. A. Nechaev, and K. R. Thompson, 2001: Seasonal variation of the North Atlantic current. *J. Geophys. Res.*, **106**, 6835–6851.

Yu, L., and P. Malanotte-Rizzoli, 1998: Inverse modeling of seasonal variations in the North Atlantic Ocean. *J. Phys. Oceanogr.*, **28**, 902–922.

## APPENDIX

### Inversion of the KPP Algorithm

Let *X*_{α}(*z*) = {*T*(*z*), *S*(*z*), *u*(*z*), *υ*(*z*)} be the 4*n*_{z}-dimensional vector of the grid-point values of temperature, salinity, and velocity fields at a given horizontal location, with *n*_{z} being the number of grid points in the vertical. The KPP equations, describing temporal evolution of *X*_{α}, *α* = 1, · · · , 4, can be written as

[Eq. (A1)]

where *C*_{α}(*X*) = {(*wT*)_{z}, (*wS*)_{z}, −*fυ*, *fu*} represents the Coriolis force and the vertical advection of temperature and salinity, *F*_{α} accounts for all other physical processes resolved by the numerical scheme explicitly, and *K*_{α} and *G*_{α} are the 4*n*_{z}-dimensional vectors of vertical diffusivities and “nonlocal transports” defined by Large et al. (1994). Both *K* and *G* are complicated nonlinear functions of the mixed layer depth *h* and *X*; *h* is defined as the shallowest zero of the function

[Eq. (A2)]

where *E*_{t} is the turbulent energy, and Ri_{c} = 0.75 is the critical Richardson number (Large et al. 1994). From this definition one can see that *K* and *G* are differentiable functions of *X* almost everywhere. The exceptions are the profiles *X*_{c}(*z*) satisfying *F*(*X*_{c}) = ∂_{z}Ri(*X*_{c}) = 0 at some *z* ≤ *h*. Infinitesimal perturbations of such profiles may cause *finite* variations of *h*. […] *K* and *G*. The tangent linear approximation of (A1) in the vicinity of a point *X̃* reads

[Eq. (A3)]

[…] *h* ≠ 0, *K* and *G* cannot be explicitly expressed in terms of *X* because they depend on *h* in the upper mixed layer whereas *h* is the implicit function of *X* (A2). To compute […] *K*′ and *G*′ stand for the explicit parts of the dependences of *K* and *G* upon *X*. These dependences are associated with the terms that parameterize salt fingering, double diffusion, and shear instability below the mixed layer and ensure smooth transition of the diffusivity between the upper boundary layer and the interior of the ocean (see Large et al. 1994).

[…] *c*_{α} ≡ {*Q*_{θ}, *Q*_{S}, *τ*_{x}, *τ*_{y}} have the form

[…]

[…] *X̃* is the adjoint variable.

The first term in square brackets corresponds to conventional parameterization of the vertical diffusivities. The second term reflects nonlocal sensitivity of the model to surface forcing in the presence of the mixed layer (*h* ≠ 0). This term emerges because the universal profiles which parameterize *K*(*z*) and *G*(*z*) within the mixed layer depend on such quantities as the Monin–Obukhov scale, friction velocity, and the buoyancy flux, which in turn depend on the surface fluxes of heat, salt, and momentum.

To ensure numerical stability we used the fully implicit scheme for time integration. The nonlinear system of equations (A1) was solved iteratively (Large et al. 1994). In the “forward code,” that approach required an inversion of the tridiagonal matrix, emerging from the discretization of (A2) at every iteration. When solving the adjoint system, “nonlinear” iterations are not required. Instead, one has to solve the adjoint of (A3), which cannot be reduced to a tridiagonal form because ^{β}_{α} […] *n*_{z} × 4*n*_{z} system of equations at every grid point in the horizontal significantly reduces computational efficiency of the code. Storage requirements of the code also increase dramatically because the corresponding matrices have to be stored during the forward run.
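The tridiagonal inversion mentioned above for the forward code can be illustrated with a minimal Python sketch of the Thomas algorithm applied to one fully implicit vertical diffusion step. This is not the authors' code. The function names, the uniform grid spacing, the prescribed interface diffusivities, and the no-flux boundaries are all assumptions made for illustration, and the nonlocal transport term and the nonlinear KPP iteration are omitted.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm for a tridiagonal system.

    lower[i], diag[i], upper[i] are the sub-, main-, and super-diagonal
    entries of row i; lower[0] and upper[-1] are ignored.
    """
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):  # forward elimination
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):  # back substitution
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def implicit_diffusion_step(x, k_iface, dt, dz):
    """One backward Euler step of dX/dt = d/dz (K dX/dz).

    x:       cell-centred values, top to bottom.
    k_iface: diffusivities (m^2/s) at the len(x) - 1 interior interfaces.
    ASSUMPTIONS: uniform spacing dz (m), zero flux through top and bottom.
    """
    r = dt / dz ** 2
    k_above = [0.0] + list(k_iface)  # K at the upper face of each cell
    k_below = list(k_iface) + [0.0]  # K at the lower face of each cell
    lower = [-r * k for k in k_above]
    upper = [-r * k for k in k_below]
    diag = [1.0 + r * (ka + kb) for ka, kb in zip(k_above, k_below)]
    return solve_tridiagonal(lower, diag, upper, x)


# Hypothetical temperature column (degC) with strong mixing in the upper cells.
profile = [18.0, 18.0, 17.0, 15.0, 12.0]
k_iface = [1.0e-2, 1.0e-2, 1.0e-4, 1.0e-4]
new_profile = implicit_diffusion_step(profile, k_iface, dt=3600.0, dz=10.0)
print([round(v, 3) for v in new_profile])
```

With no-flux boundaries the column sum is conserved by construction. In this linear sketch the matrix stays tridiagonal only because *K* is prescribed rather than dependent on the solution.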

Bearing in mind, however, that the adjoint code has certain limitations because of limited differentiability of the KPP scheme, we conducted a number of numerical experiments with approximate solution of the adjoint equations. In the first experiment (I) we solved the adjoint of (A3) with the standard biconjugate gradient algorithm. In the second experiment we approximated ^{β}_{α} […] *δK*′_{α}/*δX*_{β} and *δG*′_{α}/*δX*_{β} in the right-hand side of (A4)–(A5). In that case

[…]

Results of these experiments are shown in Table A1. The second approach proved to be the most effective. It does not require much additional memory and CPU while keeping representation of the upper layer in the adjoint code within a reasonable degree of accuracy.

Errors in approximation of the first-guess control fields by Fourier harmonics and the relative reduction of the cost function

Prior errors in surface data (upper row) and rms deviations of the corresponding controls from these data after optimization

Horizontally averaged seasonal balances of the terms contributing to the ML temperature and salinity budgets (numbers are in °C yr^{−1} and psu yr^{−1}, respectively). The MLD-averaged values of temperature (°C), salinity (psu), and the mean MLD *h* (m) are shown in the last two columns. Salinity budget components are shown in the lower rows

Table A1. Numerical experiments with approximate solution of the adjoint equations: *N* is the total number of iterations required to reduce the gradient norm by a factor of 1000. CPU requirements are normalized by the values corresponding to expt III

^{*} International Pacific Research Center Contribution Number 143, and School of Ocean and Earth Science and Technology Contribution Number 5878.