## Abstract

Three methods are analyzed for the design of ocean observing systems to monitor the meridional overturning circulation (MOC) in the North Atlantic. Specifically, a continuous monitoring array to monitor the MOC at 1000 m at different latitudes is “deployed” into a numerical model. The authors compare array design methods guided by (i) physical intuition (heuristic array design), (ii) sequential optimization, and (iii) global optimization. The global optimization technique can recover the true global solution for the analyzed array design, while gradient-based optimization would be prone to misconverge. Both global optimization and heuristic array design yield considerably improved results over sequential array design. Global optimization always outperforms the heuristic array design in terms of minimizing the root-mean-square error. However, whether the results are physically meaningful is not guaranteed; the apparent success might merely represent a solution in which misfits compensate for each other accidentally. Testing the solution gained from global optimization in an independent dataset can provide crucial information about the solution’s robustness.

## 1. Introduction

The design of efficient observing systems plays a key role in oceanography because measurements are difficult and costly to obtain. A variety of studies has simulated observing arrays in numerical models to assess the observing system’s performance or to address the fundamental principles of observing system design (see, e.g., Baehr et al. 2004, hereafter B04, for an overview). Most of these studies used trial-and-error adjustment of their array configurations (e.g., Bretherton et al. 1976; McIntosh 1987). Notable exceptions are Barth and Wunsch (1990) and Barth (1992), who analyzed the optimization of idealized cases. Few studies have been specifically directed at predeployment array design (Hackert et al. 1998; Hirschi et al. 2003; B04); their array design methods relied on physical intuition. In contrast to earlier studies, the present study combines array design directed at providing immediate support for a realizable campaign with a formal optimization of the simulated array. We compare three array design methods—those guided by physical intuition (heuristic array design), sequential optimization, and global optimization.

## 2. Data and methods

### a. Dataset

We analyze model output of the 1/3° Atlantic model of the Family of Linked Atlantic Model Experiments (FLAME) group, a hierarchy of Atlantic Ocean models (Dengg et al. 1999; Beismann and Redler 2003). The horizontal resolution is 1/3° in longitude and 1/3° × cos(*ϕ*) in latitude (*ϕ*). The model setup and the analyzed output are identical to the configuration and data used in B04. The analyzed time series span 20 yr, starting at 1 January 1980. The temporal resolution of the employed model output is 5 days.

### b. Simulated observing system

The simulated observing system is designed to allow continuous monitoring of the oceanic meridional overturning circulation (MOC) at a specific latitude. It is based on the monitoring strategy proposed by Marotzke et al. (1999): thermal wind and Ekman contributions to the MOC are measured separately, and the resulting meridional transports are corrected to ensure a closed mass balance over the longitudinal transect (Hirschi et al. 2003). Köhl (2005) and Hirschi and Marotzke (2007) showed that the thermal wind part and the Ekman transport are indeed the dominant contributions to the MOC in the North Atlantic. The recently deployed RAPID-MOC 26°N array is based on this monitoring strategy (Marotzke et al. 2002; Schiermeier 2004), with the additional use of continuous measurements of the western boundary current in the Florida Strait (Baringer and Larsen 2001).

The observing strategy was tested at different latitudes (B04), suggesting that this monitoring strategy would be able to capture the main features of the time mean and the variability of the MOC at 26°N, but not at 53°N. Figure 1 shows the resulting MOC reconstructions compared to the original (model) MOC at 1000 m for both latitudes, based on a simulated measurement at every grid cell, that is, the maximum number of profiles (*n* = *n*_{max}; *n*_{max} ≈ 200 for 26°N, and *n*_{max} ≈ 140 for 53°N). Each profile simulates a full-depth mooring, measuring temperature, salinity, and pressure at discrete depths.

### c. Differential evolution (DE)

Identifying a spatial array design with a minimal root-mean-square error (RMSE) poses a global optimization problem with integer constraints. The key challenge in global optimization is to reliably identify the best (global) optimum within feasible computation times. Currently available global optimization algorithms differ considerably in their convergence speed and the quality of the identified solution (Athias et al. 2000; Moles et al. 2004; Ali et al. 2005). Previous studies analyzing the spatial design of ocean observing systems used simulated annealing or evolutionary strategies (Barth and Wunsch 1990; Barth 1992; Hernandez et al. 1995). Here we adapt the differential evolution algorithm (Storn and Price 1997). The original algorithm is relatively robust in finding the true global solution with feasible computational requirements (Moles et al. 2003, 2004; Storn and Price 1997). We demonstrate the skill of the new algorithm to reliably identify the global optimum for a range of test problems.

Evolutionary optimization methods adopt the sequence of mutation and selection steps observed in nature (Goldberg 1989). The algorithms start by producing a random initial set of possible solutions (typically referred to as a population). In our example, population members are feasible array designs. The population members are evaluated using an objective function to determine their fitness. We define fitness as the negative root-mean-square error as we are interested in a minimal RMSE. A subset of well-performing population members is then used to produce a new population with a superimposed random variation. The random variability is akin to the mutation process in natural evolution. This sequence is iterated until the algorithm has converged. The original differential evolution algorithm is designed for an unconstrained problem with continuous variables. We impose two constraints such that moorings are unique and located in the model domain by adding a penalty function. We round the continuous variables in the algorithm to the nearest integer to represent the integer grid locations in this model analysis. We assess convergence by repeating the optimization step with different random initial conditions similar to McInerney and Keller (2008).
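The steps described above can be sketched in a few lines of Python. This is a minimal illustration of a standard differential evolution scheme (random base vector, one difference vector, binomial crossover) with rounding to integer grid indices and a penalty for duplicate or out-of-domain moorings; it is not the authors' implementation. All names (`differential_evolution`, `penalized_cost`, `toy_cost`), the control parameters (`weight`, `crossover`, `pop_size`), and the penalty magnitude are assumptions chosen for illustration, and `toy_cost` is a hypothetical stand-in for the RMSE of a reconstructed MOC time series.

```python
import random


def round_design(vector):
    """Round continuous decision variables to the nearest integer grid index."""
    return [int(round(v)) for v in vector]


def penalized_cost(vector, cost, n_grid, penalty=1e3):
    """Cost of the rounded design, or a penalty if moorings repeat or leave the domain."""
    design = round_design(vector)
    n_outside = sum(1 for i in design if i < 0 or i >= n_grid)
    n_duplicates = len(design) - len(set(design))
    if n_outside or n_duplicates:
        # Infeasible designs are penalized and never passed to the cost function.
        return penalty * (1 + n_outside + n_duplicates)
    # The order of the moorings does not matter, so the design is sorted first.
    return cost(sorted(design))


def differential_evolution(cost, n_profiles, n_grid, pop_size=30,
                           generations=200, weight=0.7, crossover=0.9, seed=0):
    """Minimize `cost` over sets of `n_profiles` unique grid indices in [0, n_grid)."""
    rng = random.Random(seed)
    # Random initial population of candidate array designs (continuous encoding).
    pop = [[rng.uniform(0, n_grid - 1) for _ in range(n_profiles)]
           for _ in range(pop_size)]
    scores = [penalized_cost(p, cost, n_grid) for p in pop]
    for _ in range(generations):
        for i in range(pop_size):
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(n_profiles)  # at least one gene is mutated
            trial = []
            for k in range(n_profiles):
                if k == forced or rng.random() < crossover:
                    # Mutation: base member plus a scaled difference of two others.
                    trial.append(pop[a][k] + weight * (pop[b][k] - pop[c][k]))
                else:
                    trial.append(pop[i][k])
            trial_score = penalized_cost(trial, cost, n_grid)
            # Selection: the trial replaces the parent only if it is no worse.
            if trial_score <= scores[i]:
                pop[i], scores[i] = trial, trial_score
    best = min(range(pop_size), key=scores.__getitem__)
    return sorted(round_design(pop[best])), scores[best]


def toy_cost(design):
    """Hypothetical cost: distance of a sorted design to an arbitrary target design."""
    target = [2, 17, 40]
    return sum(abs(d - t) for d, t in zip(design, target))


if __name__ == "__main__":
    # Repeating the run with different seeds is a simple convergence check.
    for seed in (1, 2, 3):
        print(seed, differential_evolution(toy_cost, n_profiles=3, n_grid=50, seed=seed))
```

If runs started from different seeds return the same design, that agreement is some evidence (not proof) that the optimum has been found. In this sketch fitness is simply the negative of the returned cost, so minimizing the cost is equivalent to maximizing fitness.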

## 3. Array design

Here, we optimize the suggested array to monitor the MOC through minimizing the RMSE of the MOC time series at 1000 m. Focusing on a single MOC time series at a fixed depth reduces the dimensions of the optimization problem significantly, while largely ignoring the vertical structure of the MOC. We will come back to the latter in section 3d. We test the observing strategy both at 26° and 53°N. Of the available model output of 20 yr (cf. Fig. 1), we initially use 10 yr (sections 3a–e) and subsequently test if the obtained results are robust for the second decade (section 3f).
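For reference, the objective can be written compactly. The notation is introduced here for illustration and does not appear in the original text: $\Psi_{\mathrm{mod}}(t_k)$ denotes the model MOC at 1000 m at time step $t_k$, $\Psi_{\mathrm{rec}}(t_k;\mathbf{x})$ the MOC reconstructed from $n$ profiles at grid locations $\mathbf{x} = (x_1, \ldots, x_n)$, and $K$ the number of time steps in the decade used for the optimization.

$$
\mathrm{RMSE}(\mathbf{x}) = \sqrt{\frac{1}{K}\sum_{k=1}^{K}\left[\Psi_{\mathrm{rec}}(t_k;\mathbf{x}) - \Psi_{\mathrm{mod}}(t_k)\right]^{2}},
\qquad
\mathbf{x}^{\ast} = \underset{\mathbf{x}}{\arg\min}\;\mathrm{RMSE}(\mathbf{x}),
$$

where the minimization is taken over sets of $n$ distinct grid locations along the transect.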

We use three different array design methods: initially, we briefly revisit the intuition-based array design; subsequently, both the sequentially optimized array design and the globally optimized array design methods are tested. Although we test different numbers of profiles (starting at *n* = *n*_{max}), the overall aim is to evaluate the locations of profiles for a smaller and logistically feasible number of profiles; we therefore restrict the analysis to about 10 profiles.

### a. 26°N: Heuristic design

The heuristic array design (i.e., intuition-based placement of the profiles) was used in Hirschi et al. (2003) and B04. For the present analysis, we use the “standard” setup as derived in B04. The design of this setup was guided by two criteria: (i) the profiles should be placed preferentially in areas where the meridional velocities are assumed to be high, and (ii) the resulting array should cover as much of the section area as possible, that is, minimizing the remaining bottom triangles (B04). The resulting array setup consisted of nine profiles: four at the western margin to ensure a dense coverage of the steep slope, one at each side of the Mid-Atlantic Ridge (MAR) to ensure the coverage of the deep subbasins west and east of the MAR, and three at the eastern margin to ensure coverage of the gentle slope at this side of the basin (Fig. 2a). The RMSE between the model MOC and the reconstructed MOC based on the above-described setup is about 0.95 Sv (1 Sv ≡ 10⁶ m³ s⁻¹; Fig. 3).

### b. 26°N: Sequential optimization

Several studies have used an incremental approach for the design of observing systems (e.g., Rayner et al. 1996; Gloor et al. 2000; Patra and Maksyutov 2002). This approach has the advantage of being computationally efficient and is arguably a useful framework if the locations of an existing observing system are constrained. Aiming to achieve an optimal design, we start with a sequential optimization (i.e., finding the optimal placement for one profile at a time) in addition to an existing setup. The starting point is an extensive search for two profiles. The smallest RMSE between the model MOC and the reconstructed MOC of about 1.2 Sv is found when one profile is placed close to the western boundary and the second profile is in the middle of the basin east of the MAR (Fig. 4). Profiles are added sequentially to this setup, finding the location at each iteration with the minimum RMSE (Fig. 2b). The setup for *n* = 9 uses profiles evenly distributed over the transect, with the exception of the deep eastern boundary (Fig. 2b). The resulting RMSE decreases for a higher number of profiles, but even for nine profiles it is above the RMSE reached for the heuristic design (Fig. 3).
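The incremental procedure can be sketched as a greedy forward selection. The following Python example is an illustration under stated assumptions, not the code used in the study: `sequential_design`, `exhaustive_design`, and `toy_error` are hypothetical names, and `toy_error` replaces the RMSE of the reconstructed MOC with an arbitrary toy error so that the result can be checked by hand.

```python
from itertools import combinations


def sequential_design(candidates, start, n_profiles, error_of):
    """Greedy forward selection: repeatedly add the location that minimizes the error."""
    design = list(start)
    while len(design) < n_profiles:
        remaining = [c for c in candidates if c not in design]
        design.append(min(remaining, key=lambda c: error_of(design + [c])))
    return design


def exhaustive_design(candidates, n_profiles, error_of):
    """Test every combination of n_profiles locations and return the best one."""
    best = min(combinations(candidates, n_profiles), key=lambda d: error_of(list(d)))
    return list(best)


# Hypothetical toy problem: each location carries a fixed "transport", and the
# error is the misfit between the summed transport of a design and a target.
TOY_WEIGHTS = {0: 3.0, 1: 2.0, 2: 2.5, 3: 1.5, 4: 1.0}
TOY_TARGET = 5.0


def toy_error(design):
    return abs(sum(TOY_WEIGHTS[i] for i in design) - TOY_TARGET)


locations = list(TOY_WEIGHTS)
best_pair = exhaustive_design(locations, 2, toy_error)             # [0, 1]
greedy = sequential_design(locations, best_pair, 3, toy_error)     # [0, 1, 4]
best_triple = exhaustive_design(locations, 3, toy_error)           # [2, 3, 4]
print(greedy, toy_error(greedy))            # [0, 1, 4] 1.0
print(best_triple, toy_error(best_triple))  # [2, 3, 4] 0.0
```

In this toy case the best pair is not a subset of the best triple, so the sequential design is locked into a suboptimal configuration. This is one way in which an incremental approach can fall short of a global search.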

### c. 26°N: Global optimization

The underlying optimization problem is nonconvex (cf. Fig. 4), which requires the use of a global optimization technique. First, we test the differential evolution algorithm against the true global solution. The DE algorithm does recover the global optimum for *n* = 2, 3, 4 (Fig. 3), that is, the cases in which it is computationally feasible to test this.
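A short count illustrates why an exhaustive test is only feasible for small *n*. The sketch below assumes the approximate value *n*_{max} ≈ 200 candidate grid cells at 26°N quoted in section 2b; the function name `n_designs` is introduced here for illustration.

```python
import math


def n_designs(n_grid, n_profiles):
    """Number of distinct ways to place n_profiles unique profiles on n_grid cells."""
    return math.comb(n_grid, n_profiles)


for n in (2, 3, 4, 9):
    print(n, n_designs(200, n))
```

With 200 candidate cells there are 19 900 possible pairs, 1 313 400 triples, and 64 684 950 designs with four profiles, whereas for nine profiles the count exceeds 10¹⁵. Each design requires one evaluation of the reconstruction error.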

For the global optimization, the RMSE decreases with higher numbers of profiles and converges to the solution with the maximum number of profiles (*n* = *n*_{max}), with an RMSE of about 0.4 Sv at *n* = 8 (Fig. 3). In contrast to the sequential optimization, the DE method favors profiles at the boundaries, particularly the western boundary (Fig. 2c), at the expense of profiles close to the MAR. All solutions *n* = 3, . . . , 9 include the shallow part of the western boundary (Fig. 5), which is entirely missed by the sequential array design.

### d. 26°N: Vertical profiles

So far, only the RMSEs between the model MOC and the reconstructed MOC at the fixed depth of 1000 m have been considered, showing that the mean value and variability can be captured, depending on the specific array design (Figs. 6a–c). However, the deep return flow is missed for most setups (Figs. 6d–f). To account for the missing return flow quantitatively, we compute the RMSE for all depths (Fig. 7). The most striking result is that, in terms of the RMSE computed for all depths, all array design methods outperform the *n* = *n*_{max} setup while using only a small number of profiles (Fig. 7). The *n* = *n*_{max} setup underestimates the mean strength of both the northward and southward flows. The global optimization, in contrast, finds setups in which the bias in the northward flow is reduced, while the heuristic design finds setups in which the bias in the southward flow is reduced by observing the western boundary more intensively. The reconstruction for both the global optimization and the heuristic design relies in part (i.e., for about 2 Sv in the northward or southward flow) on a fortuitous overestimate of the flow, gained from an incidental imbalance delivered by the chosen subset of profiles.

### e. Array design at 53°N

We repeat the application of the three array design methods at 53°N, a latitude where the method in its basic setup generally did not succeed in capturing the mean value and variability of the MOC (B04). Again, the heuristic array design aimed to cover most of the meridional velocities as well as most of the transect area (Fig. 8a). The RMSE of the resulting setup is about 4 Sv (Fig. 9). Both the sequential and the global optimization array design approaches result in a considerably smaller RMSE (Fig. 9). The sequential array design does not take into account the western boundary (Fig. 8b), where the highest southward velocities occur (cf. Fig. 2c in B04). The DE-based array design includes the entire transect, but large bottom triangles are left out (Fig. 8c). Note that the sequential and global optimizations result in a considerably smaller RMSE than the setup for *n* = *n*_{max} (Fig. 9), because of a coincidental balance of overestimates and underestimates in the meridional transports, that is, not representing the full dynamics of the meridional velocity field.

### f. Analysis of a second decade

Having analyzed the first decade of the employed dataset, we test whether the obtained results are robust for the second decade. This approach is akin to an out-of-sample validation. For both latitudes, we take the profile locations of the three different array design methods and compute the resulting RMSE. At 26°N, the RMSEs between the model MOC and the reconstructed MOC at 1000-m depth are similar for the first and second decades. The RMSE for the DE-based array design and the heuristic array design is nearly identical, while the RMSE for the sequential array design increases by about 0.2 Sv. At 53°N, in contrast, the RMSE for the globally optimized and sequential array design methods nearly doubles, but it is still lower than the RMSE for *n* = *n*_{max}. For the sequential array design, the RMSE in the second decade is lower for *n* = 2 than for *n* ≥ 3. For the heuristic array design, the RMSE increases from about 5 Sv in the first decade to about 6.5 Sv in the second decade; both values are above the respective RMSE for *n* = *n*_{max}.
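The out-of-sample test amounts to fixing the profile locations found in the first half of the record and recomputing the error in the second half. A minimal Python sketch follows; the function names and the toy numbers are hypothetical, and the series are assumed to be equally sampled so that splitting at the midpoint separates the two decades.

```python
import math


def rmse(reconstructed, reference):
    """Root-mean-square error between two equally long time series."""
    if len(reconstructed) != len(reference):
        raise ValueError("time series must have the same length")
    squared = sum((r - m) ** 2 for r, m in zip(reconstructed, reference))
    return math.sqrt(squared / len(reference))


def split_decades(series):
    """Split a series into a first half (design period) and second half (test period)."""
    half = len(series) // 2
    return series[:half], series[half:]


def in_and_out_of_sample_rmse(reconstructed, reference):
    """RMSE in the period used for the design and in the independent period."""
    rec_fit, rec_test = split_decades(reconstructed)
    ref_fit, ref_test = split_decades(reference)
    return rmse(rec_fit, ref_fit), rmse(rec_test, ref_test)


# Toy series (arbitrary numbers, not model output): the misfit grows in the second half.
reference = [0.0, 0.0, 0.0, 0.0]
reconstructed = [1.0, 1.0, 3.0, 3.0]
print(in_and_out_of_sample_rmse(reconstructed, reference))  # (1.0, 3.0)
```

A design whose error is much larger in the test period than in the design period, as in this toy case, may have been fitted to noise in the first period.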

## 4. Discussion

We test three different array design methods for a suggested monitoring strategy of the MOC. Testing the profile locations derived from analyzing the first decade over the second decade allows one to test whether the suggested array setup is robust with respect to an independent time series and to see if the results from the first decade are influenced by the noise in this time series. Our analysis suggests that both the results from the global optimization and the sequential array design are robust at 26°N, but not at 53°N. The results at 53°N are therefore of limited use, as are the physical insights gained from the analysis of the results. The results gained from analyzing 53°N do, however, allow for the immediate conclusion that the monitoring strategy itself has to be applied with great care. In contrast, the results of the heuristic array design are robust for both latitudes, and for 53°N, physical insight is needed to interpret the result.

The results of the global optimization and the heuristic array design at 26°N are similar for a feasible number of profiles (e.g., *n* = 9; Fig. 2) but are not identical. They mainly differ with respect to placement of profiles close to the MAR. While the global optimization favors profiles at the western boundary, the heuristic array design includes two profiles, one at each side of the MAR, to monitor the subbasins to the east and to the west separately, ensuring that a potential pressure drop across the ridge is covered. Although it is known that these two profiles close to the MAR have little influence on the model reconstruction (Marotzke et al. 2002), they were included for dynamical reasons. The model’s ability to accurately reproduce real ocean dynamics is limited at this point: in FLAME, the core of the deep western boundary current lies above the depth of the crest of the MAR, while observations show that the core of the deep western boundary current reaches greater depths (e.g., Lee et al. 1996). Therefore, the results of the heuristic design (*n* = 9) should be compared to the results of the global optimization for *n* = 7 (Fig. 5f). The RMSE for the time series at 1000 m is still considerably smaller when the global optimization technique is used (Fig. 3); however, the RMSE of the full vertical structure is of comparable magnitude (Fig. 7).

We find that the RMSE that can be achieved by observing at the upper limit (every grid point, *n* = *n*_{max}) at 26°N is about 0.4 Sv (Fig. 3). The quality of the reconstruction for *n* = *n*_{max} is closely approximated by the globally optimized array design with fewer than 10 profiles. However, the RMSE for the full vertical structure for *n* ≥ 4 is smaller than the RMSE for *n* = *n*_{max} (Fig. 7). The results of the global optimization should be treated with caution, because it can identify array designs with reconstruction errors below the values achieved by *n* = *n*_{max}. This property points to problems introduced by purely optimizing a signal-to-noise ratio, an approach used in many detection studies. Here, the velocity field gained from the global optimization does not represent a dynamically meaningful subset of the full velocity field, and in turn, results derived from this subset are not representative of the full dynamics.

Note that the heuristic array design (for *n* = 9) achieves an RMSE in the vertical that is smaller than the RMSE for *n* = *n*_{max}. The same is true for the globally optimized array design. While the global optimization array design misses most of the southward flow and captures the variability at 1000 m almost precisely, the heuristic array design captures about half of the southward flow and does not capture the variability as well as the global optimization technique (for *n* = 9). Although it would be desirable to constrain the global optimization to include the mean value and variability of the southward flow (i.e., the RMSE between the original and reconstructed southward flow), such an optimization would be of limited physical meaning because all methods already do better than what is achieved with *n* = *n*_{max}.

Whether the results of a global optimization approach are applicable to a real observing array depends as much on the setup of the optimization as on its subsequent physical interpretation. We show here that global optimization is feasible and can, for the specific question at hand, immediately yield valuable information on profile placement. The global optimization provides no substitute for an in-depth understanding of the physical mechanisms behind a proposed monitoring array but can considerably facilitate the process of predeployment array design and point to potential methodological problems. This opens the prospect of applying global optimization to test potential observing strategies in numerical models when the intuition-based array design is not readily derived but the underlying physics are understood well enough to test whether the result of the optimization is correct for the right reasons.

## 5. Conclusions

Based on our analysis of a simulated MOC observing system at 26° and 53°N in the FLAME model, we conclude that

- sequential optimization does not improve heuristic array design;
- global optimization can recover the true global solution for the analyzed array design;
- at locations where the proposed monitoring strategy does not have the ability to reproduce the MOC at 1000 m (i.e., 53°N), global optimization finds profiles with lower root-mean-square errors than the heuristic design, but the suggested setup is not physically meaningful;
- at locations where the proposed monitoring strategy has the ability to reproduce the MOC at 1000 m (i.e., 26°N), global optimization has the potential to yield results of comparable quality to the heuristic array design; however, whether the results make physical sense is not guaranteed, because apparent success might merely represent an optimal solution in which misfits compensate for each other accidentally; and
- the solution gained from global optimization should be verified in an independent dataset (e.g., by dividing the dataset) to ensure the solution’s robustness.

## Acknowledgments

We wish to thank Joël Hirschi for stimulating discussions. Felix Landerer and an anonymous reviewer provided helpful comments on the manuscript. We thank the FLAME group for providing output from their model, and Jens-Olaf Beismann and Lars Czeschel for their help with the model output. This work was supported by the Max Planck Society (JB, JM) and the National Science Foundation (KK, DM; SES 0345925). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the funding agencies.

## Footnotes

* Current affiliation: Department of Geophysical Sciences, University of Chicago, Chicago, Illinois

*Corresponding author address:* Johanna Baehr, Massachusetts Institute of Technology, Dept. of Earth, Atmospheric and Planetary Sciences, Room 54-1517, 77 Massachusetts Ave., Cambridge, MA 02139-4307. Email: baehr@mit.edu