## Abstract

Ocean tracer distributions have long been used to decompose the deep ocean into constituent water masses, but previous inverse methods have generally been limited to just a few water masses that have been defined by a subjective choice of static property combinations. Through air–sea interaction and upper-ocean processes, all surface locations are potential sources of distinct tracer properties, and thus it is natural to define a distinct water type for each surface site. Here, a new box inversion method is developed to explore the contributions of all surface locations to the ocean interior, as well as the degree to which the observed tracer fields can be explained by a steady-state circulation with unchanging surface-boundary conditions. The total matrix intercomparison (TMI) method is a novel way to invert observations to solve for the pathways connecting every surface point to every interior point. In the limiting case that the circulation is steady and that five conservative tracers are perfectly observed, the TMI method unambiguously recovers the complete pathways information, owing to the fact that each grid box has, at most, six neighbors. Modern-day climatologies of temperature, salinity, phosphate, nitrate, oxygen, and oxygen-18/oxygen-16 isotope ratios are simultaneously inverted at 4° × 4° grid resolution with 33 vertical levels. Using boundary conditions at the surface and seafloor, the entire interior distribution of the observed tracers is reconstructed using the TMI method. Assuming that seafloor fluxes of tracer properties can be neglected, the method suggests that 25% or less of the water residing in the deep North Pacific originated in the North Atlantic. Integrating over the global ocean, the Southern Ocean is dominant, as the inversion indicates that almost 60% of the ocean volume originates from south of the Southern Hemisphere subtropical front.

## 1. Introduction

The basis of traditional water mass decompositions is conservation of tracer quantities and mass. The observed tracer concentration at an interior point, *C*, can be expressed as the linear combination of multiple sources,

where *m _{i}* is the mass fraction from source

*i*,

*C*is the tracer value of source

_{i}*i*, and Δ

*C*takes into account any interior sources. All mass fractions must be bounded by 0 and 1, and conservation of mass dictates that the

*m*sum to one. If the deep Pacific is hypothesized to be filled by only Antarctic Bottom Water (AABW) and North Atlantic Deep Water (NADW), then observations of phosphate (PO

_{i}_{4}) and oxygen (O

_{2}) can be used to estimate the relative proportions of the two. Over the eastern half of the deep North Pacific (30°–50°N, 140°W–180°, 1500–4000-m depth), the average PO

_{4}is 2.8

*μ*mol kg

^{−1}and O

_{2}is 90

*μ*mol kg

^{−1}. Using observed values (Gouretski and Koltermann 2004) from the bottom of the Weddell and Labrador Seas to define AABW and NADW, Eq. (1) can be expanded into three conservation equations,

and

where *m*_{aabw} and *m*_{nadw} are the mass fractions of AABW and NADW and Δ*P* is the amount of phosphate added due to remineralization of biological rain. The proportion of oxygen utilized for a given amount of Δ*P* is given by the stochiometric ratio of 1:170 (Anderson and Sarmiento 1994). Note that steady-state behavior, in the sense of constant water-mass property values, is implicitly assumed. In this specific case, there is a unique solution that the deep Pacific contains 51% AABW and 49% NADW by mass (with Δ*P* = 1.06 *μ*mol kg^{−1}). Similar means of inferring equipartition of the deep Pacific into northern and southern source waters (e.g., Broecker et al. 1998) have shaped inferences concerning the ocean circulation rates, interpretation of radiocarbon ages (e.g., Matsumoto 2007), and the hydrography of the Last Glacial Maximum (e.g., Adkins et al. 2002). As an indication of the sensitivity of this method, consider the original set of equations, (2)–(4), where AABW and NADW are defined by Ross Sea Bottom Water and Greenland Sea Bottom Water, rather than the Weddell and Labrador Sea locations. In this case, AABW has properties PO_{4} = 2.2 *μ*mol kg^{−1} and O_{2} = 250 *μ*mol kg^{−1}, whereas NADW is defined as PO_{4} = 0.9 *μ*mol kg^{−1} and O_{2} = 300 *μ*mol kg^{−1}. The deep Pacific then appears to have 67% AABW and 33% NADW, a conclusion rather different than the reported 50/50 mixture.

Furthermore, another well-observed tracer, salinity, has a North Pacific average value of 34.66 on the practical salinity scale, which is much closer to AABW values (34.65) than NADW values (34.90). If conservation of salinity is added to Eqs. (2)–(4), the equipartition of the deep Pacific into NADW and AABW becomes untenable, because mixing of two water types is insufficient to explain the phosphate, oxygen, and salinity measurements. At least a third water mass is needed. The description of the deep ocean as containing only two sources of deep water (Stommel and Arons 1960; Stommel 1962) is more convenient fiction than fact, as has been known for some time. Worthington (1981) noted that the deep Pacific is too warm and fresh to lie upon the temperature–salinity mixing line between NADW and AABW. Evidently, Antarctic Intermediate Water (AAIW) is a candidate to be a significant contributor to the deep ocean, because it is both warmer and fresher than AABW and NADW.

Johnson (2008) has recently presented a more resolved water mass decomposition through use of six conservative tracers derived from the World Ocean Circulation Experiment (Gouretski and Koltermann 2004) that permitted the resolution of seven water masses under his methodology. Each of the water masses occupies a nonnegligible volume of the ocean. Furthermore, as might also be anticipated, the relative fractions of AABW and NADW masses differ from the decomposition discussed above. In particular, Johnson (2008) places the global ratio of AABW to NADW between 1.3 and 2.4 and the deep Pacific ratio to be somewhat higher, although he notes that the result is sensitive to the subjective choice of water mass source properties.

The preceding examples illustrate a larger issue. Johnson (2008) and earlier pioneering water-mass inverse studies (e.g., Tomczak 1981; Mackas et al. 1987; Tomczak and Large 1989) have proceeded by resolving as many water masses as their collection of data and methods permit, but it is unclear how many distinct water masses are needed for a stable and accurate decomposition. NADW is at least a composite of waters that formed in the Greenland, Irminger, and Labrador Seas (Hinrichsen and Tomczak 1993) and is further modified by waters overflowing the Mediterranean Sill before exiting the North Atlantic. Likewise, different vintages of AABW are formed in multiple locations around Antarctica (Warren 1981). Even if only these few sites of deep convection or dense overflows are present, the ocean is highly turbulent and entrainment of ambient fluid always occurs, leading to the expectation that many, if not all, surface ocean points will make a nonzero contribution to the interior. Numerical model studies (e.g., Haine and Hall 2002; Primeau 2005; Khatiwala 2007) also suggest that many different surface regions contribute to the properties of the interior ocean. Many of the aforementioned observational studies have sidestepped the complicated physics of the upper ocean by defining source regions using properties at depth, rather than the surface, though designation of such an interior value still entails a subjective choice. Thus, while there is the hope that a relatively small number of water masses may be adequate for explaining the overall distribution of properties in the deep ocean, there exists a wide spectrum of potential water masses whose significance has not been observationally explored.

Besides the difficulties owing to spatial variability in water masses, there are issues that arise due to time variability. We do not expect for the surface boundary conditions, and thus the formation rate of water masses, to be constant in time. It is also unlikely that circulation pathways are steady in time. Evidence for changes in both temperature (e.g., Antonov et al. 2005; Gouretski and Koltermann 2007) and salinity (e.g., Wong et al. 1999; Bindoff and Mcdougall 2000; Dickson et al. 2003; Curry et al. 2003; Boyer et al. 2005; Curry and Mauritzen 2005) have been observed around the globe over the last few decades. Changes in temperature and salinity of Labrador Sea Water (LSW) over the last century (Yashayaev 2007; Yashayaev and Clarke 2008) have also been documented. Warming and freshening of AABW over the last few decades has been found in numerous locations (Fahrbach et al. 2004; Rintoul 2007; Johnson et al. 2007, 2008). None of this is surprising, given that atmospheric forcing of the ocean appears variable on all time scales (e.g., Hasselmann 1976).

In this paper, we seek to address issues relating to the multiplicity of water masses, source water composition, and steadiness using a new inverse method, referred to as total matrix intercomparison (TMI). TMI is essentially a well-resolved box model that combines climatological tracer observations with a water-mass pathways model. Of particular note is that TMI admits all present-day surface property combinations as potential sources to the interior. The pathways that we describe are similar in concept to “spreading,” as discussed by Wüst (1935), but now are quantified. We use TMI to explore the spectrum of many thousands of water types that compose the oceans, as well as the degree to which the observed tracer distribution can be explained by a steady-state circulation.

## 2. Method of total matrix inversion

We use climatological observations of potential temperature (*θ*), salinity (*S*), phosphate (PO_{4}), nitrate (NO_{3}), and oxygen (O_{2}) from a combination of the WOCE data and preexisting ocean measurements (Gouretski and Koltermann 2004), as well as the gridded oxygen-18/oxygen-16 isotope ratio (*δ*^{18}O) dataset of LeGrande and Schmidt (2006). We are unaware of *δ*^{18}O having been used previously to constrain a global water mass decomposition. Block averaging of the original gridded and interpolated datasets has been done to decrease the resolution to a uniform 4° horizontally with 33 vertical levels. Note that phosphate, nitrate, and oxygen are not conservative, but are nonetheless useful for mapping out water masses because the nonconservative components in each of these three tracers can be related to a single interior source term, Δ*P*, using approximate stoichiometric ratios. This leads to a set of conservation equations for the tracers in a given gridbox,

and

The left-hand side represents the observed tracer values as a function of the mass fraction *m _{i}* of a water parcel from source

*i*with property values [

*θ*

^{(i)},

*S*

^{(i)},

*δ*

^{18}O

^{(i)}, PO

_{4}

^{(i)}, NO

_{3}

^{(i)}, O

_{2}

^{(i)}] and observational noise

*η*. Within the relative errors of this approach, the mass fraction can be interchanged with a volume fraction because changes in density are very small relative to the mean density. Equations (5)–(7) express conservation of heat, salt, and

*δ*

^{18}O of seawater, while Eqs. (8)–(10) express the concentration of phosphate, nitrate, and dissolved oxygen as a function of mass fractions

*m*and Δ

_{i}*P*. Following previous studies (Anderson and Sarmiento 1994; Karstensen and Tomczak 1998), the stochiometric relationship of Δ

*P*: 15.5Δ

*N*: −170Δ

*O*is used, and the fact that these ratios are approximations (e.g., Hupe and Karstensen 2000) is addressed by the error terms

*η*,

_{p}*η*, and

_{n}*η*. The conservative tracers also have error terms. Finally, Eq. (11) expresses conservation of mass. Other tracers that could be included, with some modification to the model equations, are carbon-14 and chlorofluorocarbons, which were used by Schlitzer (2007).

_{o}If the unknown mass fraction terms and the observations are both written in vector form,

where 𝗘 is given by

and **n** is the observational noise. Similar sets of equations have been discussed extensively (e.g., Mackas et al. 1987; Tomczak and Large 1989; Bennett 1992; Karstensen and Tomczak 1997; de Brauwere et al. 2007).

Typically, the number of water masses and their properties are chosen subjectively and 𝗘 is treated as having no error. The vector **m** is then found using a nonnegative least squares method that minimizes the weighted sum of the squared noise terms subject to the constraint that **m** ≥ 0. The specific form of this weighting will be discussed in the next section. In the case that the number of unknowns (i.e., prespecified water types plus the number of nonconservative correction factors) equals the number of constraints (i.e., observed tracer types plus one for mass conservation), as is normally arranged for, the matrix 𝗘 is square. In this case, the problem appears just-determined, assuming that the nonnegativity constraints are satisfied.

If it is admitted that the entire sea surface forms distinct water properties though air–sea interaction and upper-ocean processes, the number of potential water types will certainly exceed the six or so tracers that are available at sufficient resolution. Hence, the matrix problem of Eq. (13) is underdetermined, and we expect the matrix 𝗘 to have a null space and for many solutions to exist, though this will depend on the inequality constraints. Also note that the matrix problem implicitly assumes that the source properties are constant. In the following, we outline how the steady-state assumption can be more stringently tested and how it can be used to resolve many more water masses by accounting for the geographic distribution of tracers.

In steady state, the turbulent and diffusive ocean will have at least one pathway connecting each interior point to the surface, and usually many more. Any surface that surrounds an interior point, or one that surrounds the interior point in conjunction with the sea floor, will intersect these pathways (see Fig. 1). Tracer properties at interior points can therefore be described as some combination of the properties of a surrounding surface. For a conservative tracer, this means that the surface properties provide bounds on the range of possible interior values and that no extrema can exist in the interior. This same conclusion has been more formally deduced for the case of angular momentum in the atmosphere and is known as Hide’s theorem (Hide 1969). Note that the discretization inherent to observations (and models) can lead to the appearance of local maxima or minima in tracer fields, as can observational noise, and such artifacts must be accounted for in the expected error fields. See appendix A for more discussion of the advective–diffusive basis of this assumption and the relationship between continuous and discretized spatial fields.

We seek to determine the volume contribution of each surface point to each interior point and to resolve the pathways along which this contribution occurs. For purposes of computational efficiency, our method is divided into two parts: 1) a local inversion that estimates the exchanges between neighbors and 2) a global inversion that traces surface properties to each interior point. These two steps are together referred to as the TMI method.

### a. Local inversion step

For the local inversion, a simultaneous set of equations is defined with a similar form as Eq. (13), but instead 𝗘 is defined with the source properties of the neighboring grid boxes, denoted 𝗘* _{l}*. There are six neighboring boxes, except when a box intersects the boundary. Thus, Eq. (13) can be recast as

**y**= (𝗘

*+ Δ𝗘)*

_{l}**m**+

**n**, where Δ𝗘 is the noise in 𝗘

*associated with the nearby tracer observations. We then solve for*

_{l}**m**using a total least squares method subject to

**m**being nonnegative (see appendix B for details regarding this “method of total inversion”).

The local matrix, 𝗘* _{l}*, is square because the number of neighboring boxes is equal to the number of observed tracers. In the case that the tracers are all conservative, only five would be needed to give a formally well-posed local problem, whereas it takes six tracers in this nonconservative case. In general, 𝗘

*is expected to be invertible when constructed from real data. If, however, two neighboring data points have a similar combination of tracer properties, for example, due to homogenization in the deep ocean or to smoothing in the dataset, then two columns of 𝗘*

_{l}*may be nearly identical and the matrix ill conditioned. Similarly, if tracers closely covary, the matrix 𝗘*

_{l}*will have rows that are nearly linearly dependent and 𝗘*

_{l}*will have at least one very small eigenvalue. This situation is naturally handled by using a weighted and tapered least squares method where one wishes to minimize a combination of the squared, weighted misfit and the magnitude of the solution*

_{l}**m**

^{T}

**m**. With tapering, it is possible to achieve a smoother solution with a smaller formal uncertainty at the expense of a small decrease in how well the data is fit.

Two issues relating to the local steady-state assumption at the top and bottom of the ocean need to be addressed. First, the ocean surface changes dramatically with the seasons, and properties at the seasonally maximum mixed layer depth are more indicative of the properties transported into the interior (Stommel 1979; Williams et al. 1995). We therefore define all points above the maximum mixed layer depth, calculated from a seasonally varying climatology (Conkright et al. 1994), to be perfectly well mixed in the vertical. By taking this step, we also account for the known summertime bias in the surface layers of the WOCE climatology. The second issue is that the coarse spatial and temporal resolution of the data does not resolve deep overflow regions, bottom currents, or connections between fracture zones, which becomes central to the analysis in the next section.

### b. Global inversion step

We now turn to the second step: making the connection between the already collected local information and global pathways. Using Eq. (1) with *N* = 6 neighboring sources and rearranging the terms, we find a relationship between each tracer *C* at location (*i*, *j*, *k*) and its neighbors,

where this equation holds for all interior points *I*, provided that the steady-state assumption holds. In our solution method, the *m* and Δ*C* are solved in the foregoing local inversion step so that we now have a set of constraints on the global tracer field.

If we append Dirichlet boundary conditions at all boundary locations,

the combination of boundary conditions and interior equations [Eq. (15)] is then

where **c** is a vector made up of the global tracer distribution and **d** is constructed by the following rule: if an element of **d** corresponds to a row of 𝗔 at an interior location (*i*, *j*, *k*), then *d*_{ijk∈I} = Δ*C _{ijk}*. If the element corresponds to a boundary, then . Later, we will define exactly what is meant by boundary and interior points.

It should be noted that Eq. (17) is similar in form to the matrix method of Khatiwala et al. (2005), both representing the interaction between neighboring boxes. However, the matrix of Khatiwala et al. (2005) is derived from a time-evolving numerical model and represents model rates of transport, whereas Eq. (17) is derived from observations and indicates oceanic transport pathways but not rates of transport.

Insomuch as the matrix 𝗔 holds, it predicts the interior distribution of a tracer field, given knowledge of the boundary conditions. For example, the distribution of potential temperature follows from *θ̃* = 𝗔^{−1}**d**_{θ}, where **d**_{θ} is a vector filled with the potential temperature boundary conditions and zero elsewhere. The tilde indicates an estimated quantity, and the inverse of 𝗔 has been assumed to exist (the existence of the inverse will be further discussed later). For phosphate, the equation is PÕ_{4} = 𝗔^{−1}**d**_{p}, but **d*** _{p}* now has nonzero terms in the interior corresponding to the sources of phosphate in the ocean interior.

The 𝗔 matrix permits calculation of an interior tracer field from the surface and, therefore, allows decomposition of the World Ocean into distinct surface water types. The fraction of water that is sourced from a particular boundary region is calculated as if the relevant boundary patch was dyed. Setting the right-hand side vector **d*** ^{b}* to one in the boundary region of interest, , and zero elsewhere because mass is a conservative quantity, gives the fraction of water originating from the boundary patch through inversion of Eq. (17):

**g**

*= 𝗔*

^{b}^{−1}

**d**

*. The resulting dye field,*

^{b}**g**

*, is a form of impulse response or boundary Green function (Stammer and Wunsch 1996; Hall and Haine 2002), which has values bounded by zero and one.*

^{b}Note that 𝗔 depends on a multitude of local calculations and should be viewed as a linearization of the more complicated nonlinear pathways problem. In a one-step solution technique, all of the variables in Eq. (15) would be solved simultaneously, giving a highly nonlinear equation in which unknowns multiply each other. Methods for solving nonlinear inverse problems have been discussed in the literature of inverse methods (e.g., Tarantola and Valette 1982) and oceanography (e.g., Mercier 1989; Huybers et al. 2007).

### c. Boundary conditions

Previously, we made a distinction between the ocean interior and boundaries, but we did not define what was meant by boundaries. One could consider all oceanic boundaries—top, bottom, and side—as potential sources of unique water masses. Water mass transformation owing to air–sea fluxes and upper-ocean processes makes the sea surface an essential boundary in the problem. At the seafloor there is also some transformation, for example due to geothermal heating and resuspension of ocean sediments, although these sources are expected to be small relative to the surface sources. We will refer to the decomposition of the World Ocean using top, bottom, and side boundaries as the “all boundary” solution.

The all-boundary solution, however, is difficult to relate to standard notions of water masses that are primarily associated with surface formation regions. Thus, we also provide a “surface boundary” solution. If we assume that sea surface properties define all water masses and that water mass transformation at the seafloor is primarily due to internal processes such as entrainment and mixing, then the boundary locations of Eq. (17) can be defined as the surface only. An issue that we must contend with in calculating this surface-boundary solution is that the interior tracer equation (15) does not hold everywhere along the seafloor, as seen by inspection of the WOCE climatology. At the bottom of the Weddell Sea, for example, there is a local minimum of temperature, which is an obvious violation of steady-state balance. The dataset does not resolve continuous property tongues between the Weddell Sea shelf and the bottom of the Weddell Sea, even at ½° horizontal resolution, whereas other observations indicate a direct connection through a small property tongue along the seafloor (Warren 1981). This feature may be missing owing to a lack of spatial resolution or because it is episodic in time.

It is useful to develop a measure of the influence of local imbalances of Eq. (15) on the estimated global tracer distribution. To quantitatively define the goodness of fit of the estimated tracer fields, we introduce the global cost function: , the sum of squared differences between the estimated and observed tracer values, where 𝗪* _{g}^{i}* is a weighting by the inverse of the squared observational error. For each tracer, the contribution to the global cost function from the local inversion error

**n**

*can be calculated by*

_{l}*J*= (∂

_{g}*J*/∂

_{g}**n**

*)*

_{l}^{T}

**n**

*. In other words,*

_{l}*J*is the sum of terms coming from each location, , where the partial derivative can be conveniently calculated by a series of matrix manipulations, (∂

_{g}*J*/∂

_{g}**n**

*) = 𝗔*

_{l}^{−T}𝗪

*𝗔*

_{g}^{−1}

**n**

*. Each term is expected to contribute a value of one to*

_{l}*J*, and the total expected value of

_{g}*J*is the number of global points multiplied by the number of tracers. If

_{g}*J*is larger than the expected value, we determine the smallest number of the local contributions that must be set to zero so as to reduce

_{g}*J*to its expected value, and we label these locations as being inconsistent with the global tracer distribution.

_{g}We hypothesize that the global inconsistencies with the steady-state assumption are primarily due to missing or unresolved connections between bottom boundary points, such as was seen at the bottom of the Weddell Sea. At bottom points, which give the largest contribution to *J _{g}*, we allow a connection to other bottom boundary points that are in the vicinity beyond the immediately adjacent bottom points. We employ this parameterization of bottom flows as sparingly as possible while reducing

*J*to the expected value. This bottom spreading parameterization decreases

_{g}*J*by making the first-step inversion error smaller and by stabilizing the global matrix 𝗔, as will be discussed in more detail later.

_{g}This strategy for obtaining the surface-boundary solution requires a modification to the first step of the inversion discussed earlier. To account for unresolved bottom spreading, we allow these select bottom locations to exchange with other seafloor boxes within a specified horizontal radius, beginning at 2°, working up to 16°, until the properties are explained. This is accomplished by modifying the definition of sources that compose the matrix 𝗘* _{l}* [Eq. (14)]. Although this problem is underdetermined because of the availability of more sources than tracers, the ambiguity is handled by the same tapered, weighted least squares formulation as the earlier local inversions. In particular, we seek a solution that explains all of the observed tracers with a combination of source properties that is as smooth as possible, with preference given to source locations with a similar density as the observed point. This preference is specified by minimizing an extra term, (

**m**−

**m**

*)*

_{o}^{T}𝗦(

**m**−

**m**

*), where*

_{o}**m**

*is our a priori desired smooth solution and 𝗦 is a weighting matrix (see appendix B for more details). The concept is similar to the maximum entropy method of Holzer et al. (2010), and solution methods are given by Tziperman and Hecht (1988).*

_{o}## 3. Application of TMI to the World Ocean

TMI permits the tracing of ocean pathways under the assumption that the ocean is in steady state, but it remains to be seen to what extent it can be applied to tracer climatologies meant to capture a time-average picture of the ocean over the last few decades. In particular, the WOCE climatology used here (Gouretski and Koltermann 2004) is primarily a compilation of hydrographic transects from the 1990s, with some additional data going as far back as the 1950s. Although we refer to the climatologies as the “data,” they are the result of a complex averaging function of temporally and spatially sparse observations, which demanded careful quality control and mapping in its construction.

We first construct the local matrix 𝗘* _{l}* at every location in the ocean interior, which describes the relationship between neighboring points using the tracer observations described in section 2. The matrix 𝗘

*is weighted by a matrix 𝗪, constructed according to the vertically varying error profile of Gouretski and Koltermann (2004, Fig. 27). Typical standard deviation values of the noise terms are*

_{l}*σ*(

*η*) = 0.1°C for potential temperature,

_{θ}*σ*(

*η*) = 0.01 for salinity,

_{s}*σ*(

*η*

_{O}) = 0.2% for

*δ*

^{18}O,

*σ*(

*η*) = 0.05

_{p}*μ*mol kg

^{−1}for phosphate,

*σ*(

*η*) = 1.0

_{n}*μ*mol kg

^{−1}for nitrate, and

*σ*(

*η*) = 5.0

_{o}*μ*mol kg

^{−1}for oxygen. As the published errors were given with ½° resolution, we decrease the uncertainty in the averaged 2° resolution value according to the published spatial covariance of Gouretski and Koltermann (2004). Because we are comparing neighboring boxes that necessarily covary owing to their proximity, 𝗪 is properly constructed as a square matrix with nondiagonal elements following the method of Bennett (1992) and Gebbie (2004, p. 56).

The conditioning of the local matrix inversions can be gauged by the magnitude of singular values of the product 𝗪𝗘* _{l}* (e.g., Wunsch 1996). As this is a weighted matrix, the singular values reflect the ratio of the singular values of 𝗘

*to the noise magnitude. The smallest singular value of 𝗪𝗘*

_{l}*is sometimes as small as 0.05–0.1 or, equivalently, that the noise magnitude is typically 10–20 times larger than the smallest singular value. The large errors in the*

_{l}*δ*

^{18}O field relative to its water-mass signal are responsible for most of the small singular values. In this case, it is possible to overfit the observations because the rows of 𝗘

*do not, in practice, give independent constraints. As previously mentioned, this situation is naturally handled by using a weighted and tapered least squares method. A global value of the tapering parameter,*

_{l}*α*= 1, is found using the L-curve method of Hansen (1992), although reasonable values of

*α*vary over one or two orders of magnitude depending on subjective judgment. At a few test locations in the interior, we use a standard least squares formula (see Wunsch 1996, p. 124) and find that the addition of tapering decreases the estimated solution uncertainty by an order of magnitude. The degree of smoothness of the solution is affected by

*α*, but the overall structure of the solution—including the aggregated amounts of water derived from different regions—is insensitive.

### a. All-boundary solution

The information from the first step of the inversion is put together to form the global matrix 𝗔, with Dirichlet boundary conditions applied at the ocean floor and sea surface. For the 4° resolution datasets with 33 vertical levels, 𝗔 has a size of 74 064 × 74 064. At this dimension, it is difficult to determine the full spectrum of eigenvalues of the matrix, but using an *LU* factorization we confirm that 𝗔 has full rank and may be inverted. In experiments for which the local tapering parameter is set to zero (i.e., *α* = 0), we find that the global matrix drops rank and is not invertible. The lack of invertibility is due to isolated patches of homogeneous water in which the TMI method finds no traceable path back to a boundary. Tapering remedies this problem by essentially stipulating that diffusive processes will eventually connect the boundaries to all interior points, a reasonable assumption since this is an equilibrium solution.

Most of the required computation is to obtain 𝗔. Once 𝗔 is found, it is computationally inexpensive to get different water mass decompositions or to calculate equilibrium tracer fields. Using the same *LU* factorization as before, the product of 𝗔^{−1} and any vector is efficiently calculated. The effective inversion of 𝗔 takes less than one second on a typical workstation after the *LU* factorization has been stored.

The assumption of steady state has been made on a provisional basis. Recent evidence of warming (e.g., Levitus et al. 2000; Gouretski and Koltermann 2007) and of salinity changes of both signs (e.g., Wong et al. 1999; Bindoff and Mcdougall 2000; Curry et al. 2003; Boyer et al. 2005) would make it surprising for a steady-state solution to hold perfectly everywhere. Thus, it is useful to analyze the misfit between the best-fit steady-state tracer distributions and observations for the all-boundary solution. A method for finding the minimum value of the global cost function *J _{g}* is detailed in appendix C. Normalizing by the expected value for each contribution, we find global

*J*values of 0.34 for temperature, 1.46 for salinity, 0.02 for

*δ*

^{18}O, 0.32 for phosphate, 0.67 for nitrate, and 1.22 for oxygen. Because these values are near or below 1, the estimated steady-state tracer fields are consistent with the observations and their uncertainty. Closer inspection of the misfits (Fig. 2) shows that there are regional patterns of differences, such as the TMI solution giving temperatures that are 0.2°–0.4° too warm in the intermediate-depth ACC region. While this is consistent with a recent warming in the sea surface in the ACC region, other scenarios may also explain the discrepancy and, at this point, we simply note that the misfit is small and is within the expected observational uncertainty.

### b. Surface-boundary solution

To obtain the surface-boundary solution, the matrix 𝗔 is reconfigured with Dirichlet boundary conditions at the sea surface only. At the seafloor, we start by assuming that local steady-state balance holds. In this case, we find that *J _{g}* is many orders of magnitude larger than the acceptable value, as was expected given the obvious local extrema at the seafloor. For example, the temperature field, calculated from

*θ̃*= 𝗔

^{−1}

**d**

_{θ}with the surface temperature set to the observed value, is 4°C warmer than observations at depths greater than 2000 m, which is obviously much larger than the observational error. There are large misfits in the other tracers as well.

To account for the seafloor errors, the contribution to *J _{g}* from each interior point is calculated via the sum , where the partial derivative is linearized about the first-guess solution described in the previous paragraph and

*N*equals the number of geographic points. We implement a bottom spreading parameterization for the 500 locations that give rise to the largest global misfits—then update the global solution. This procedure is repeated iteratively with improved solutions until the global cost function is reduced to a level that indicates consistency between the modeled and observed tracers. For the WOCE dataset, we find that approximately 2800 bottom locations are sufficient, out of the total of 17 740 bottom boundary points and 74 064 total points. These locations do not necessarily correspond to locations where the local steady-state misfit

**n**

*is large. In regions with nearly homogeneous tracer concentrations, the partial derivative term in the global cost function becomes large. These bottom points are located along pathways that wrap back upon themselves, thus amplifying any imbalance. After admitting sources along the seafloor in a horizontal radius up to 16°, the local misfit is reduced but, more importantly, the amplified pathways are short circuited, leading to stabilization of the 𝗔 matrix.*

_{l}The 100 bottom points that have the largest contributions to *J _{g}* are primarily at a depth greater than 4000 m (Fig. 3). Many of the locations are near sites of bottom-water formation, such as the Ross Sea, and many others are near the western boundaries of ocean basins. The major North Atlantic bottom pathways are an exception to this general pattern. These points appear to be physically reasonable, as they may be needed to connect pathways from shelf waters to the deep via unresolved overflows and to connect deep ocean basins that are joined by unresolved fracture zones.

After reconstructing 𝗔 with the information from the 2800 special bottom locations, the global cost function is reduced close to the values found for the all-boundary solution. The normalized *J _{g}* values are 1.01 for temperature, 2.53 for salinity, 0.05 for

*δ*

^{18}O, 2.74 for phosphate, 3.45 for nitrate, and 1.23 for oxygen. The distribution of misfits compiled from the entire globe is consistent with the expected error distribution (Fig. 4). The resulting solution is in steady state in the sense that interior pathways are continuous, with the caveat of special connections between bottom points. Note that the temperature misfits are not perfectly Gaussian in structure, having a positive shift consistent with the need to adjust the interior ocean to consistently cooler temperatures so as to be reconciled with the data, offering an interesting avenue for future exploration. Overall, the consistency of the data and model indicates that the entire ocean interior can be described as a combination of surface water types.

The assumed stochiometric ratios are checked a posteriori by plotting the estimated ΔPO_{4} values versus the ΔNO_{3} and ΔO_{2} values, which show a strongly linear relationship with a spread consistent with the estimated errors (see Fig. 5). The estimated Δ values are defined as the difference between the observed nutrient value and a model estimate that assumes the nutrient field is conservative. By this definition, we find that the ΔPO_{4}:ΔO_{2} ratio is −161 and ΔPO_{4}:ΔNO_{3} ratio is 15.4; thus, both ratios are closer to zero than assumed in the model formulation (see slopes in Fig. 5). These ratios appear reasonable when compared to a previous study (Hupe and Karstensen 2000) that estimated a ΔPO_{4}:ΔO_{2} ratio that varied regionally from −130 to −175. The positive residual for oxygen can then be understood as arising from observational constraints that change the ΔPO_{4}:ΔO_{2} ratio from that assumed in the model. The reduced value of the ΔPO_{4}:ΔNO_{3} ratio appears to be primarily due to denitrification in regions with ΔPO_{4} greater than one and low oxygen levels, such as the middepth Arabian Sea, off the west coast of North Africa, and the middepth eastern tropical Pacific, an effect not accounted for in our equations.

The posterior checks indicate that the model decomposition is consistent with the observations and, thus, a plausible representation of the ocean. By taking into account geographic constraints and many tracers, we expect that this decomposition is more complete and accurate than earlier estimates, but also note that it is unlikely to be unique, particularly given the need for a bottom-closure scheme. The implications of these results and their robustness will be further explored in the next section, where the range of possible water mass decompositions is calculated.

## 4. The spectrum of water types

A large number of diagnostics can be calculated from the foregoing surface-boundary solution, and we will focus on properties that permit comparison with other water-mass decomposition studies. Following previous works (e.g., Karstensen and Tomczak 1998), we distinguish between water types, having a specific set of tracer properties, and water masses composed of multiple water types. The complete global relationship matrix 𝗔 can be downloaded along with a MATLAB software package to compute the influence of any water type or water mass throughout the World Ocean.

The utility and novelty of this work is that any surface patch can define a water mass. Regional patches, such as in the southeast Pacific core of the ACC, have been hypothesized to contribute large quantities of intermediate-depth water (England 1995). Figure 6 illustrates how this contribution can be estimated using TMI and it confirms the northward spread of waters from this surface patch. There is a nearly unlimited number of regional studies that could be carried out, but our present intent is to introduce the method and to discuss relevant quantities at the largest scales.

### a. Water-mass decomposition

We begin by considering the volume of water that has originated in seven major surface regions, given the following names for reference: 1) Antarctic, 2) Subantarctic, 3) Arctic, 4) North Atlantic, 5) North Pacific, 6) Mediterranean, and 7) the remaining subtropical and tropical regions (Fig. 7). The Antarctic source region is defined as the surface region south of the southern extent of the Antarctic Circumpolar Current (Orsi et al. 1995), whose boundary is given by the *σ _{o}* = 27.55 line (i.e., surface density of 1027.55 kg m

^{−3}, Sievers and Nowlin 1984). A natural boundary between the Subantarctic region and the subtropics is the Subtropical Front, here defined as the 34.8 isohaline (Deacon 1937). In the Northern Hemisphere, the North Pacific and North Atlantic waters designate the subpolar gyre waters of their respective basins. In the Pacific, the Polar Front is well described by the 34.0 isohaline (Roden 1975), which we use as the boundary between the subtropics and North Pacific. In the North Atlantic, the 35.4 isohaline is used to mark the Polar Front (Tomczak and Godfrey 1994). The Arctic and Mediterranean regions are defined by their geography. We consider the Canadian Archipelago and regions up to 80°N in the vicinity of Greenland to be part of the North Atlantic region, not the Arctic, as shown in Fig. 7. Obviously, not all locations within each surface patch will contribute equally to the volume of the interior; for example, the North Atlantic patch includes regions of deep convection that will have a disproportionately large influence.

As discussed earlier, the fraction of North Atlantic Water is calculated as if running a dyed surface patch experiment. A new right-hand side vector **d**_{natl} is defined by setting the surface region of interest to one, , where *B*_{natl} is defined as the domain of all surface locations in the North Atlantic region and zero elsewhere. Equation (17) then gives **g**_{natl} = 𝗔^{−1}**d**_{natl}, where the resulting dye field is **g**_{natl}, the fraction of mass that originated from the surface location *B*_{natl}. This method provides an easily communicated definition of what is meant by North Atlantic Water: all water that was last in contact with the surface in the North Atlantic.

A latitude–depth transect of the Atlantic Ocean is computed by calculating the contribution from each of the seven surface patches and taking a zonal average of all ocean points between 60°W and 10°E. As anticipated, the Atlantic is dominated by waters of North Atlantic origin that are most abundant at a depth of 2500 m (Fig. 8). The water mass decomposition provides regional detail, such as the upwelling of North Atlantic Water into the Antarctic Circumpolar Current (ACC), as seen in the elevated fraction of North Atlantic Water rising toward the surface of the Southern Ocean. Outflow of Mediterranean Water (MED) between 1000 and 2000 m is expected, but the influence of this water is localized to north of the equator. The designated boundary between Antarctic and Subantarctic Waters (SUBANT) does not divide intermediate and bottom water perfectly, as some Antarctic Water (ANT) is seen at intermediate depth and some Subantarctic Water near the bottom, but this is not surprising given ocean mixing and that the sites of Subantarctic Water formation coincide with the ACC, which meanders in latitude around the globe. When the southern extent of the ACC is defined by the *σ _{o}* = 27.75 line instead of

*σ*= 27.55, the Antarctic contribution to intermediate water is diminished, but other artifacts appear in the Pacific Ocean so that

_{o}*σ*= 27.55 is the best global compromise.

_{o}Some of the strong gradients in the source fractions in the upper ocean are due to the definition of the water masses themselves and the specification of a perfectly well-mixed wintertime mixed layer. At the surface, we define water masses by their regional outcrop, and their water mass fraction will be 1. Outside of this region, the fraction is set identically to zero. At the water mass boundary there will be a drop from one to zero in all cases. Since the mixed layer is assumed to be perfectly mixed, the zero to one change extends to the depth of the maximum mixed layer.

In contrast to the Atlantic, the Pacific Ocean primarily consists of waters of Antarctic origin (Fig. 9). North Atlantic Water is most abundant near the seafloor in the North Pacific, with a fractional contribution of 25% or less, roughly consistent with Johnson’s (2008) estimates. Subantarctic Water fills a similar depth–latitude space in both the Pacific and Atlantic. Although North Pacific and Subantarctic Waters have similar hydrographic properties, salinity less than 34 and temperature between 2° and 6°C, the additional geographic information incorporated into this study indicates that these waters meet in the central Pacific but that North Pacific Water occupies a slightly shallower depth. Furthermore, the bulk of North Pacific Water resides in the North Pacific, whereas Subantarctic Waters cross the equator in greater abundance.

The local maximum of North Atlantic Subpolar Water (NATL) at the seafloor in the extreme North Pacific requires additional comment. Although NATL may be increasing toward the north in the deep Pacific, a possibility is that bottom waters have been in contact with the seafloor long enough for geothermal heating to make an imprint. In fact, bottom water may seem more NADW-like in property characteristics due to this heat source, previously noted for the North Pacific (Joyce 1986) and the tropical Pacific (Johnson and Talley 1997). Although this ambiguity cannot be resolved at this point, the TMI method makes specific predictions for where steady-state dynamics are not sufficient to explain the tracer properties and thus points the way for further investigation.

Waters of tropical origin account for a major fraction of the volume of the Pacific above 2000 m. Here, we do not distinguish between tropical water of Pacific and Atlantic origins, but the vast majority of water is from local tropical origin spreading downward. There is some signal of Atlantic Tropical Water in the Pacific, as this water first travels northward in the Atlantic and mixes with North Atlantic Water between 40° and 50°N. From this point on, the water is irreversibly mixed; thus, the signal of tropical water is mixed with North Atlantic Water throughout much of the global ocean.

The path of North Atlantic Water can be tracked globally by the results of TMI. In the Indian sector (Fig. 10), a ribbon of NATL is traced eastward from the southern tip of Africa, although the concentration of NATL is reduced from 50% to 5% across the Indian Ocean, consistent with the observed high-salinity plume at 2500-m depth. A significant contribution of Mediterranean Water is diagnosed in the north Indian Ocean, although further inspection shows that this water actually originates from the Red Sea and Persian Gulf. In the major water mass definitions used here, no distinction is made between the Mediterranean, Red Sea, and Persian Gulf water masses. As noted by a reviewer, the proportion of NATL along the seafloor increases toward the north. This increase mirrors the pattern of NATL in the North Pacific. In both basins, these bottom waters are among the oldest in the world, and their properties may be subject to influence by geothermal heating and sediment interactions.

### b. Globally integrated contribution of water masses

The globally integrated volume fraction of each regional water type further emphasizes the influence of Antarctic Water in filling the ocean (Fig. 7). Globally, Antarctic source waters account for 36% of the total volume. Together with Subantarctic Water, the Southern Ocean is estimated to contribute 56% of the total volume. The indication that the majority of the world’s oceans is filled from the south further supports the notion that the Southern Ocean is critical for determining the partition of carbon between the ocean and atmosphere between glacial cycles (e.g., Toggweiler 1999).

The relative importance of different subregions for filling the world’s oceans is also depicted in Fig. 7. More than half of all Antarctic Water originates in the Atlantic sector, emphasizing the relative importance of the Weddell Sea over other sites of formation around Antarctica, such as the Ross Sea. Besides the Ross and Weddell centers of action, there are smaller sites of water mass formation along other portions of the Antarctic coast. This picture generally agrees with the ventilation rates of Orsi et al. (1999), but additional work is necessary to convert the results of this study into water-mass formation rates. The Pacific sector is the greatest contributor to both Subantarctic and tropical waters, consistent with the Pacific sector encompassing a larger area than the Atlantic and Indian sectors. Another explanation for the increased formation of Subantarctic Waters in the Pacific sector is that the region off the coast of South America is an especially vigorous site of intermediate-water formation (England 1995). In the North Atlantic, the TMI solution estimates that 16% of the World Ocean originates from the Greenland–Iceland–Norwegian (GIN) seas and 9% from the Labrador and Irminger Seas and the Canadian Archipelago. Although it is has been suggested that water-mass formation rates are roughly equal in these two regions (Lab Sea Group 1998), relatively little is known about how much of these waters are exported to the global ocean. The estimates of this study are based on the integrated behavior of tracers, rather than noisy and difficult to obtain air–sea flux measurements, and thus may be more accurate over long time scales than previous estimates.

### c. Robustness of the decomposition

The sensitivity of the water mass decomposition is explored by applying TMI with additional errors added to the observational climatologies. Specifically, we add noise with the published variance and a Gaussian covariance length scale of 450 km in the horizontal, as was deemed appropriate by Gouretski and Koltermann (2004). The noise is in addition to that intrinsically in the observations and thus constitutes a more severe test than necessary for the model to perform adequately. In five water mass decompositions that are found with five unique noise realizations, the local steady-state assumption breaks down at an average of 60% of locations, much more than the roughly 4% failure rate in the original decomposition. Apparently, TMI is finely tuned to diagnose nonsteady-state behavior. The resulting water-mass decompositions also have more small-scale structure; for example, mass fractions of North Atlantic Water in the deep North Pacific vary by 10% in neighboring grid points. When making zonal averages, however, the water mass results are consistent across all decompositions within 5%. The standard deviation in zonal-mean North Atlantic Water along 30°N in the Pacific is 2% when computed across the five decompositions. The gross properties of the water mass decomposition found by TMI appear to be essentially insensitive to the magnitude of observational uncertainty. We cannot, however, rule out systematic or structural deficiencies in our solution, though we have no specific reason to believe that they exist.

## 5. Discussion and concluding remarks

Analyses of modern-day oceanic tracers have typically drawn conclusions based on just a handful of distinct water masses (e.g., Broecker et al. 1998; Johnson 2008). We have presented an inversion for the composition of the ocean based upon global temperature, salinity, *δ*^{18}O, phosphate, nitrogen, and oxygen observations (Gouretski and Koltermann 2004; LeGrande and Schmidt 2006) that, in combination with the geographic information associated with these observations, allows deep water properties to be tracked back to all oceanic surface regions. The inversion indicates that the deep ocean is composed of a continuous spectrum of source waters and that the surface regions of much of the Southern Ocean, including regions where the so-called intermediate water forms, are important for setting the properties of the global abyss. Waters from the North Atlantic make a secondary contribution to the deep Pacific, occupying less than 25% by volume.

It should be emphasized that the oceanic flow structure is neither large scale nor steady and that the pathway quantities determined here are the aggregate effects of the underlying advection and mixing occurring at length scales not resolved in our approach or, for that matter, in many numerical modeling studies. However, there is an important distinction between this observational study and numerical simulation of the circulation: here we observationally constrain the aggregate effects of advection and mixing upon the distribution of tracer properties, as opposed to attempting to simulate their transport through parameterization of unresolved processes. The net result of advection and diffusion can be constrained at macroscopic scales without the need to determine their individual or particular contributions, as is commonly done (e.g., Ganachaud and Wunsch 2000). In some sense, the results presented here can be viewed as a climatology of the ocean circulation pathways, where we have attempted to suppress the underlying time variability of mesoscale features.

Since there is time variability in the ocean surface properties and interior ocean pathways, one must call into question whether a single climatology of ocean water types and pathways is an adequate description of the ocean. While we view the results presented here as a step toward providing a more detailed and complete description, ultimately one also seeks a full estimate of the time-evolving water properties and flow structure. As the ocean memory extends over millennia (e.g., Wunsch and Heimbach 2008), a full description of the modern tracer distribution then requires consideration of processes acting over paleoceanographic time scales. Methods for handling water mass decompositions with time-evolving water mass properties have recently been investigated by Leffanue and Tomczak (2004) and Henry-Edwards and Tomczak (2006).

The present water mass decomposition does not constrain time rates of change. The results would be unaffected if, for example, the rates of transport and diffusion in the ocean were multiplied by some arbitrary factor, assuming the steady-state assumption holds. Nonetheless, the relative volumes of water masses, as well as the pathways along which they move, provide some insight into past and future climate. It may be possible to use the identified pathways between the surface and interior to better diagnose the penetration of heat and carbon into the oceans and to compare these against other estimates of the transient propagation of surface anomalies into the ocean interior. Furthermore, the interpretation of radiocarbon ages depends on an understanding of the mixing history of each water particle, which this technique provides in greater detail than previously available from observations. Coupling a detailed knowledge of radiocarbon age with pathway information may also permit the extraction of information regarding rates of flow.

## Acknowledgments

Thank you to Charmaine King for providing some of the tracer data, to Ed Hill III for converting data into netCDF form, and to CSIRO for hosting the *World Ocean Atlas 2005* data in NetCDF format. Helpful comments were provided by Wally Broecker, Mark Cane, Arnold Gordon, Samar Khatiwala, Jeff Severinghaus, Matthias Tomczak, and Carl Wunsch.

## REFERENCES

^{18}O of the glacial deep ocean.

**,**

**,**

**,**

**,**

**,**

**,**(

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**(

**,**

**,**

**,**

**,**

**,**(

**,**

**,**

**,**

**,**

**,**(

**,**

**,**

**,**

**,**(

**,**

**,**

**,**(

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**(

**,**

**,**

**,**

**,**

_{2}by ventilation of the ocean’s deepest water.

**,**

**,**

**,**(

**,**

**,**

**,**

**,**

**,**

**,**

### APPENDIX A

#### Tracer Extrema and a Steady-State Circulation

The spatially continuous equation for steady advection and diffusion of a tracer *c* in one dimension is

At a point *x _{o}* that is a local maximum or minimum of

*c*, the derivative ∂

*c*/∂

*x*vanishes. Equation (A1) cannot be satisfied with finite diffusion. Thus, no tracer extrema can exist in steady state.

On a discretized grid, the advection–diffusion equation can be approximated a number of ways. Using a centered-space discretization with velocity and diffusion defined on the box faces, Eq. (A1) is

where the cross-sectional area of the face has been eliminated throughout the equation. Now, *c*(0) can be expressed as a function of its neighbors, *c*(0) = *ac*(−1) + *bc*(1). Specifically, we find that

Given that mass is conserved, we get the following rules: *u*(−½) = *u*(½) and *a* + *b* = 1. Because of mass conservation, we can define one velocity, *U* = *u*(−½) = *u*(½), that is relevant to the one-dimensional problem. If *U* > 0, then *a* > *b* by (A3). Making the physical assumption that all diffusivities are positive, the coefficient *b* is greater than or equal to zero when

Thus, there must be enough diffusion relative to advection to maintain a nonnegative *b* coefficient. Because *a* = 1 − *b*, the condition for *a* ≤ 1 is identical to (A4). In summary, a discretized model of a steady-state circulation will not have any internal tracer extrema if the diffusion criterion is met. The diffusion criterion is somewhat more stringent than the corresponding criterion for the continuous case. Similar limits can be found for flows in two or three dimensions, which will also depend on the discretization and numerical schemes for handling advection and diffusion.

### APPENDIX B

#### Method of Total Inversion

In section 2, we solve the following matrix equation for **m**:

where Δ𝗘 and **n** both represent noise in tracer observations. The noise vector **n** accepts noise in all of the tracer equations, but mass conservation is perfect; therefore, Γ is necessary to keep the dimensions correct. Not all of 𝗘* _{l}* is uncertain, only the entries formed from tracer observations. Therefore, the matrix Δ𝗘 has a fixed structure,

The Δ*e* entries can be made into a vector Δ**e**, which is also unknown. Now, define a new vector **n*** to be the concatenation of **n** and Δ**e**, which gives the total number of so-called control variables to be solved. The original matrix equation is now nonlinear because unknowns multiply each other, and is now written as *L*(**m**, **n***) = **y**, where *L* is a nonlinear operator with the same number of rows as 𝗘* _{l}*. The method of total inversion minimizes the cost function,

*J*=

_{l}**n***

^{T}𝗪

**n***, subject to Eq. (13), where 𝗪 is a weighting matrix with the squared inverse of the expected observational error on the diagonal.

Here, we wish to solve a weighted and tapered least squares problem so as to deal with potential ill conditioning in 𝗘. Tziperman and Hecht (1988) originally solved a tapered nonnegative least squares problem with Δ𝗘 = 0 and Γ = I. Here, we append an extra term to the cost function, *J* = **n***^{T}𝗪**n*** + *α*^{2}**m**^{T}**m**, where *α* is a trade-off parameter. We use the large-scale nonlinear optimization routine of MATLAB to solve this total inversion problem with tapering.

In the case of relating the properties of an interior point to other bottom points within a given radius, the cost function has a more complete form: *J* = **n***^{T}𝗪**n*** + (**m** − **m*** _{o}*)

^{T}𝗦(

**m**−

**m**

*) in which*

_{o}**m**

*is our a priori desired smooth solution, 𝗦 is a weighting matrix, and the solution is found by the same MATLAB routine.*

_{o}### APPENDIX C

#### Calculating the Global Cost Function

The global cost function is

subject to global steady-state balance. The computation of *J _{g}*, however, involves the uncertain surface boundary conditions. If the uncertain part of boundary conditions is expressed as a control vector

**u**, Eq. (17) for tracer

*i*becomes

To find the surface boundary conditions that minimize *J _{g}*, the problem can be restated in the canonical form of quadratic programming for each tracer individually,

where

and inequality constraints, such as the freezing point of water, can be imposed.

The canned MATLAB program “quadprog” can generally solve this problem, but the dimension of the problem is too large to be computationally feasible in this case. Instead of specifying **H**, the Hessian, explicitly, a function that calculates the Hessian vector products is given to the routine, which makes the problem solvable in practice.

## Footnotes

*Corresponding author address:* Geoffrey Gebbie, Harvard University, 24 Oxford St., Cambridge, MA 02138. Email: gebbie@eps.harvard.edu