## 1. Introduction

The past decade has witnessed dramatic advances in our ability to observe the oceans. Prior to the 1990s, ocean observations relied almost exclusively on in situ surface and subsurface measurements taken by ship-borne devices such as bottle casts or CTDs. Today, most ocean observations are gathered by satellite-borne remote sensors, supplemented by autonomous in situ measurement systems such as Argo floats.

Observing system simulation experiments (OSSEs) are intended to guide the design of specific observing systems and the configuration of multisensor observing networks, to assess the potential of future observing systems and innovative uses of existing systems to achieve major improvements in forecast skill, to test advanced data assimilation methods, and to assess the relative role of observations and forecasting methods in improving the utility of forecasts.

For some time now, atmospheric models have been used to aid the design of observing systems (e.g., Kuo et al. 1998; Bishop et al. 2001). The increasing importance of OSSEs in meteorology has recently been highlighted by a new major international atmospheric research program under the auspices of the World Meteorological Organization [e.g., The Observing-System Research and Predictability Experiment (THORPEX) Implementation Plan; International Core Steering Committee for THORPEX 2005]. Few applications of OSSEs exist in physical oceanography: McIntosh (1987) worked on the optimal design of tide gauge arrays; Hackert et al. (1998) discussed the optimal design of a tropical Atlantic mooring array; Hirschi et al. (2003) tested options for the design of a monitoring array for the North Atlantic meridional overturning; and Barth and Wunsch (1990) applied a simulated annealing method to the configuration of an acoustic tomography array.

The purpose of this paper is to contribute to the discussion, planning, and implementation of a surface mooring array in the tropical Indian Ocean (TIO; CLIVAR–GOOS Indian Ocean Panel and Coauthors 2006). The mooring array is intended to monitor basin-scale, oceanic structure relevant to a range of time scales involved in climate variability and change. The design of the proposed array (Fig. 1) is based on “experience and expertise, taking into account present-day knowledge of the role of the ocean in the climate system and in particular published information on the location of the most energetic oceanic signals within the basin; the spatial/temporal propagation of signals; the phenomenological features known to play a role in the climate system; and coupled model studies pointing to a need for new ocean information.” Several other studies have been undertaken to contribute to the design of the TIO mooring array; Ballabrera-Poy et al. (2007) use a reduced-order Kalman filter to determine an optimal array for mapping sea surface height and sea surface temperature in the TIO, and Vecchi and Harrison (2007) present results from a system of OSSEs using a high-resolution ocean model and consider the ability of an integrated observing system, including Argo observations, XBT lines, and the proposed array to resolve subseasonal and interannual variabilities.

The fast, upper-ocean variability associated with intraseasonal disturbances and the variability of the monsoons is most important for societal impacts and has to be adequately observed. A basin-scale array of surface moorings, flux reference sites, and direct current measurements is intended to supplement both the satellite observations and the available float, drifter, and XBT programs by providing continuous time series of mixed layer and subsurface properties. Using the current version of the proposed observing system as a starting point, we focus on the assessment of the mooring array in the TIO and on variables that describe the intraseasonal variability and seasonal to interannual variations in this region. A moored array in the TIO is needed because recent studies have shown that the Indian Ocean plays an active role in the global climate system on various time scales; examples include the discovery of the interannual Indian Ocean dipole (IOD; Saji et al. 1999; Webster et al. 1999) and the importance of intraseasonal oscillations for determining the onset and intensity of the Asian and Australian monsoons (Webster et al. 1998). A mooring array is also necessary because, unlike most other observing platforms, a moored system is potentially capable of capturing all time scales of interest in the TIO, ranging from intraseasonal to climate change time scales.

While the current design displayed in Fig. 1 is based on our present-day understanding and experience, some decisions were made on the basis of practical logistics, economy, and even intuition (CLIVAR–GOOS Indian Ocean Panel and Coauthors 2006). The OSSEs presented in this paper are intended to supplement the design process by testing whether the key oceanic processes in the model are adequately resolved by the proposed array and by identifying any significant gaps in it.

The paper is organized as follows. Details of the ocean circulation model, the analysis system, and the method for objectively determining the optimal locations of a mooring array are described in section 2, results from a series of OSSEs are presented in section 3, followed by a discussion in section 4 and our conclusions in section 5.

## 2. Methodology

### a. Model configuration

The analyses described in this study are based on results from a global configuration of the Modular Ocean Model (Pacanowski 1995). The model configuration is described in detail by Schiller and Godfrey (2003). Briefly, the model has constant zonal resolution of 2° and enhanced meridional resolution of 0.5° within 8° latitude of the equator that gradually increases to 1.5° toward the poles. There are 25 vertical levels with 7 levels in the top 100 m. This version of the model includes the hybrid mixed layer model described by Chen et al. (1994). The model is initially spun up for 20 yr using climatological winds and strong relaxation to the monthly Reynolds sea surface temperature (SST; Reynolds and Smith 1994) and the monthly mean sea surface salinity (Levitus et al. 1994). Subsequently, the model is run for 12 yr, spanning 1982–94, and is forced by 3-day-averaged wind stress from a blend of National Centers for Environmental Prediction–National Center for Atmospheric Research (NCEP–NCAR) fields (Kalnay et al. 1996) and the Florida State University climatology (Legler et al. 1989; Stricherz et al. 1992). In this application, the model is coupled to a modified version of the atmospheric boundary layer model described by Kleeman and Power (1995), providing the model with realistic surface fluxes of heat and freshwater. A flux correction is also applied for heat and freshwater using fluxes derived from the last 5 yr of the spinup period. Results from this model have been used to explore intraseasonal variability (Schiller and Godfrey 2003), upper-ocean dynamics (Schiller et al. 1998), and interannual dynamics (Schiller et al. 2000) in the Indian and Pacific Oceans. 
These studies have undertaken extensive comparisons with observations, including comparisons of time series of subsurface currents in the central Indian Ocean (Schiller and Godfrey 2003), comparisons between modeled and observed sea surface temperatures in the low-latitude Indian and Pacific Oceans (Schiller et al. 2000), and validation of the model’s surface heat fluxes (Schiller et al. 1998), which represents an assessment of both the model’s upper-ocean dynamics and the atmospheric boundary layer model. Additionally, Schiller (1999) presented a series of comparisons between modeled and observed subsurface temperatures along frequently repeated XBT lines in the Indian Ocean. The model validation referred to above was typically very favorable, which indicates that the model provides a realistic representation of intraseasonal to interannual variability.

### b. Analysis system

Each analysis is a temporal mean plus a linear combination of empirical orthogonal functions (EOFs),

$$\mathbf{w}^{a} = \overline{\mathbf{w}} + \mathbf{M}\mathbf{c}, \tag{2.1}$$

where $\mathbf{w}^{a}$ is an $n$-dimensional column vector of mapped fields and $n$ is the number of grid points; $\overline{\mathbf{w}}$ is an $n$-dimensional column vector that is the temporal mean; $\mathbf{M}$ is an $n \times m$ matrix where the $i$th column of $\mathbf{M}$ is the $i$th dominant EOF, $\mathbf{w}^{\mathrm{EOF}}_{i}$, and $m$ is the number of EOFs used in each application; and $\mathbf{c}$ is an $m$-dimensional column vector that contains the amplitudes, or weighting coefficients, of the EOFs.

To compute $\mathbf{w}^{a}$ for a single point in time, the coefficients $\mathbf{c}$ must be determined. This is done by performing a least squares fit of an array of observations to the EOFs. Formally, this is achieved by calculating the least squares solution to the linear system,

$$\mathbf{H}\mathbf{M}\mathbf{c} = \mathbf{w}^{o}, \tag{2.3}$$

for $\mathbf{c}$, where $\mathbf{H}$ is an operator that interpolates from grid space to observation space, so that the $i$th column of $\mathbf{H}\mathbf{M}$ is the $i$th EOF interpolated to the observation locations; $\mathbf{w}^{o}$ is a $p$-dimensional column vector of observations with the temporal mean removed; and $p$ is the number of observations used.
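
The least squares fit described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the code used in this study: the function name `eof_analysis`, the synthetic orthonormal basis, and the treatment of the interpolation operator as a simple selection of grid points (observations located exactly on the grid) are all hypothetical choices made for the example.

```python
import numpy as np

def eof_analysis(w_mean, M, obs_idx, w_obs):
    """Return the analysis w_mean + M c and the EOF amplitudes c.

    Assumption: observations sit on grid points, so applying the
    observation operator to M is just a row selection, M[obs_idx].
    """
    HM = M[obs_idx, :]                      # EOFs sampled at the observation sites
    c, *_ = np.linalg.lstsq(HM, w_obs, rcond=None)  # least squares amplitudes
    return w_mean + M @ c, c

# Synthetic test case: 3 orthonormal basis vectors on a 50-point grid.
rng = np.random.default_rng(0)
n, m, p = 50, 3, 8
M, _ = np.linalg.qr(rng.standard_normal((n, m)))    # columns play the role of EOFs
w_mean = rng.standard_normal(n)
c_true = np.array([2.0, -1.0, 0.5])
w_true = w_mean + M @ c_true                        # "true" field

obs_idx = rng.choice(n, size=p, replace=False)      # observation locations
w_obs = w_true[obs_idx] - w_mean[obs_idx]           # observations with mean removed
w_a, c = eof_analysis(w_mean, M, obs_idx, w_obs)
```

Because the synthetic field lies entirely in the span of the basis and the observations are noise free, the fit recovers the amplitudes exactly; with real fields the unresolved variability would appear as a residual.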

This analysis system is a simple form of an ensemble-based data assimilation system, such as ensemble optimal interpolation (e.g., Oke et al. 2002, 2005) or the ensemble Kalman filter (e.g., Evensen 2003). A critical aspect of ensemble-based data assimilation is the generation of the ensemble members that are here represented by the EOFs. Indeed, the analyses given by (2.1) are simply a time mean plus a linear combination of EOFs, where the amplitudes of the EOFs are determined by (2.3), through the column vector **c**. Similarly, analyses from more advanced assimilation systems are simply a background field plus a linear combination of ensemble members (Evensen 2003). We note that the amplitudes of the high-eigenvalue EOFs should be greater than those of the low-eigenvalue EOFs, according to their relative eigenvalues. However, in practice, we apply no constraint to these amplitudes when we solve (2.3). Ideally, we would prefer to replace the EOFs with a set of dynamically based, orthogonal basis functions that each represent equal variance, without the hierarchy of functions we obtain from an EOF decomposition. These could take the form of optimal perturbations derived from generalized stability theory (e.g., Farrell and Ioannou 1996). In the absence of such a basis, we proceed with an EOF approach.

### c. Optimal array design

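The solution for $\mathbf{c}$ depends on how well the observations project onto the EOFs and, more specifically, on how well they distinguish between the different EOFs. By design, the columns of $\mathbf{M}$ are orthogonal. If we normalize the columns of $\mathbf{M}$, so that $\mathbf{M}^{T}\mathbf{M} = \mathbf{I}$, an observation array distinguishes perfectly between the EOFs when this orthogonality is preserved in observation space:

$$(\mathbf{H}\mathbf{M})^{T}\mathbf{H}\mathbf{M} = \mathbf{I}. \tag{2.5}$$

For any application, there is no guarantee that an observation array exists that will satisfy (2.5). So in practice, we wish to configure the observation array (i.e., define $\mathbf{H}$) so that $(\mathbf{H}\mathbf{M})^{T}\mathbf{H}\mathbf{M}$ is as close to the identity matrix as possible. We measure this by the condition number of $(\mathbf{H}\mathbf{M})^{T}\mathbf{H}\mathbf{M}$, which equals 1 when (2.5) holds and grows as the observations become less able to distinguish between the EOFs.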

A demonstration of the idea behind the approach described above is given in Fig. 2. This example considers the case where the basis functions are a simple sine and cosine. Suppose a field exists that is simply a linear combination of these two orthogonal functions and we seek to reconstruct the field using two observations and the analysis system (2.3). If the observations are made at locations that perfectly discriminate between the two functions (Fig. 2a), then the condition number of $(\mathbf{H}\mathbf{M})^{T}\mathbf{H}\mathbf{M}$ is 1. For observation locations that discriminate less well, the condition number of $(\mathbf{H}\mathbf{M})^{T}\mathbf{H}\mathbf{M}$ is greater than 1 and depends on the positions of the two observations ($X_1$ and $X_2$ are the first and second observations, respectively). This demonstrates that for this very simple example, there are many choices for a two-point observing system and that provided the observations are separated by $\pi/2$, the relation (2.5) is satisfied.
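
The sine and cosine example can be checked numerically. The sketch below is illustrative only and is not the computation behind Fig. 2; the function name `cond_two_point` and the sample locations are hypothetical.

```python
import numpy as np

def cond_two_point(x1, x2):
    """Condition number of (HM)^T HM for two observations of sin and cos."""
    # Rows are observation locations; columns are the two basis functions.
    HM = np.array([[np.sin(x1), np.cos(x1)],
                   [np.sin(x2), np.cos(x2)]])
    return np.linalg.cond(HM.T @ HM)

# Observations separated by pi/2 discriminate perfectly between the functions.
good = cond_two_point(0.3, 0.3 + np.pi / 2)

# Observations that are nearly co-located carry almost the same information.
poor = cond_two_point(0.3, 0.4)
```

For this basis the eigenvalues of the matrix are $1 \pm |\cos(X_1 - X_2)|$, so the condition number depends only on the separation of the two observations and reaches its minimum of 1 at a separation of $\pi/2$.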

To determine the locations that give the smallest condition number for 𝗛𝗠 in practice, we start with *n* observations, with an observation at every grid point. Each observation is excluded independently, one at a time, and the condition numbers of the *n* versions of $(\mathbf{H}\mathbf{M})^{T}\mathbf{H}\mathbf{M}$ are computed. The observation whose exclusion yields the smallest condition number is then discarded, leaving (*n* − 1) observation locations. The process is then repeated, with each of the remaining observations excluded independently and the condition numbers of the (*n* − 1) versions of $(\mathbf{H}\mathbf{M})^{T}\mathbf{H}\mathbf{M}$ computed. The observation whose exclusion yields the smallest condition number is again discarded, leaving (*n* − 2) observation locations. This process is repeated until the desired number of observations is left. The final configuration of the observations is regarded as the optimal array for this particular configuration of the analysis system. A simple example of this procedure is described in the appendix.
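
The elimination procedure can be sketched as a greedy backward selection. This is a minimal illustration under assumptions, not the implementation used in this study: the function name `greedy_array` is hypothetical, observations are taken to sit on grid points (so sampling the basis is a row selection), the observation dropped at each step is the one whose exclusion gives the smallest condition number, and ties are broken by taking the first candidate.

```python
import numpy as np

def greedy_array(M, n_keep):
    """Backward elimination of observation sites by condition number.

    M is an (n, m) matrix of basis functions on the grid. Returns the
    indices of the retained sites and the final condition number.
    """
    keep = list(range(M.shape[0]))          # start with every grid point observed
    while len(keep) > n_keep:
        best_cond, best_j = np.inf, 0
        for j in range(len(keep)):
            trial = keep[:j] + keep[j + 1:]  # exclude one site at a time
            HM = M[trial, :]
            kappa = np.linalg.cond(HM.T @ HM)
            if kappa < best_cond:
                best_cond, best_j = kappa, j
        keep.pop(best_j)                    # discard the least useful site
    HM = M[keep, :]
    return keep, np.linalg.cond(HM.T @ HM)

# Two basis functions (sine and cosine) on a 16-point periodic grid.
x = 2.0 * np.pi * np.arange(16) / 16
M = np.column_stack([np.sin(x), np.cos(x)])
keep, kappa = greedy_array(M, n_keep=2)
separation = x[keep[1]] - x[keep[0]]
```

In this small case the two retained sites are separated by an odd multiple of $\pi/2$ and the final condition number is 1. Each pass evaluates one condition number per remaining site, so the cost grows roughly quadratically with the number of grid points; a greedy search of this kind is not guaranteed to find the globally best array.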

### d. OSSE design

We are motivated to assess the proposed mooring array (Fig. 1) and to subsequently employ the methodology described above to design an improved mooring array for resolving different time scales of variability. We therefore apply this system for two oceanic variables: the depth of the 20°C isotherm (D20), representing the interannual variability, and the high-pass-filtered mixed layer depth (MLD), representing the intraseasonal variability (e.g., Shinoda and Hendon 1998). The MLD, here defined as the depth over which the potential density decreases by 0.1 kg m^{−3} compared to the surface density, is filtered by removing a 93-day running mean. Experiments using an unfiltered MLD give very similar results to those using D20, since the unfiltered MLD is dominated by seasonal time scales. Because the model grid is not uniform, with higher resolution near the equator, the modeled fields are weighted according to the square root of the area of each grid cell before the EOF decomposition is performed. It is important to note that the degree to which the arrays derived from this study are relevant critically depends on the degree to which the model fields give a realistic representation of the variability in the real world.
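
The preprocessing described above (removing a running mean, then weighting by the square root of the cell area before the EOF decomposition) can be sketched as follows. This is an illustration under assumptions rather than the code used here: the function names are hypothetical, the input is random data, the EOFs are computed with a singular value decomposition, and the 31-sample window corresponds to 93 days only if the fields are sampled every 3 days.

```python
import numpy as np

def highpass(x, window):
    """Subtract a centered running mean of odd length `window` along time (axis 0).

    The first and last window // 2 samples are dropped rather than padded.
    """
    kernel = np.ones(window) / window
    half = window // 2
    smooth = np.apply_along_axis(
        lambda s: np.convolve(s, kernel, mode="valid"), 0, x)
    return x[half: x.shape[0] - half] - smooth

def weighted_eofs(fields, cell_area, m):
    """Return the temporal mean, the m leading EOFs (as columns), and all singular values.

    `fields` has shape (time, grid). Anomalies are scaled by sqrt(cell_area)
    before the decomposition; the EOFs returned are in this weighted space.
    """
    mean = fields.mean(axis=0)
    anomalies = (fields - mean) * np.sqrt(cell_area)
    _, s, vt = np.linalg.svd(anomalies, full_matrices=False)
    return mean, vt[:m].T, s

# Synthetic stand-in for a mixed layer depth record on a nonuniform grid.
rng = np.random.default_rng(1)
mld = rng.standard_normal((400, 30))            # (time, grid)
area = np.linspace(0.5, 1.5, 30)                # hypothetical cell areas
mld_hp = highpass(mld, window=31)               # 93 days at 3-day sampling
mean, M, s = weighted_eofs(mld_hp, area, m=6)
```

The square root appears because variance scales with the square of the anomalies, so this weighting makes each cell contribute to the covariance in proportion to its area.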

To assess the performance of the analysis system, we calculate EOFs using different time series from the 12-yr model run. Specifically, we compute EOFs from years 1–6, 7–12, and 1–12, giving us 3 realizations of the EOFs, 2 of which are arguably independent. There is likely to be some artificial skill in the analysis system because we are guaranteed that the types of features resolved in the observations are consistent with the features representable by the EOFs. In an attempt to limit this artificial skill, we use a cross-validation approach (Barnston and Ropelewski 1992), by using one dataset to verify the analyses and a different dataset to train the analysis system through the calculation of the EOFs. For the OSSEs presented in section 3, the EOFs used by the analysis system in (2.1) are derived from years 1–6 and the true fields are the model fields from years 7–12. The true fields are intended to represent the real ocean. Analyses are computed every 3 days for years 7–12 by sampling the true fields at observation locations and solving (2.1)–(2.3). Because we are only attempting to assess the relative performance of the analysis system for different configurations and for different observing systems, rather than its absolute performance, we have not contaminated the observations with noise of any kind.
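
The cross-validated experiment design can be sketched as a single function that trains on one period and verifies on another. This is a hedged illustration, not the system used in this study: the name `osse_rmse`, the synthetic two-pattern field, and the assumption that observations sit exactly on grid points are all hypothetical.

```python
import numpy as np

def osse_rmse(fields, train, verify, obs_idx, m):
    """RMSE at each grid point of analyses over the verification period.

    EOFs and the temporal mean come from fields[train]; noise-free
    observations are sampled from fields[verify] at obs_idx.
    """
    mean = fields[train].mean(axis=0)
    _, _, vt = np.linalg.svd(fields[train] - mean, full_matrices=False)
    M = vt[:m].T                                    # leading EOFs as columns
    HM = M[obs_idx]                                 # EOFs at observation sites
    truth = fields[verify]
    w_obs = (truth[:, obs_idx] - mean[obs_idx]).T   # shape (p, time)
    c, *_ = np.linalg.lstsq(HM, w_obs, rcond=None)  # amplitudes, shape (m, time)
    analysis = mean + (M @ c).T
    return np.sqrt(np.mean((truth - analysis) ** 2, axis=0))

# Synthetic "model run": two spatial patterns with random amplitudes.
rng = np.random.default_rng(2)
x = np.linspace(0.0, 2.0 * np.pi, 40, endpoint=False)
amps = rng.standard_normal((120, 2))
fields = amps @ np.vstack([np.sin(x), np.cos(x)])

# Train on the first half, verify on the second half, with three observations.
rmse = osse_rmse(fields, slice(0, 60), slice(60, 120), obs_idx=[3, 13, 27], m=2)
```

Because this synthetic field contains only two patterns, two EOFs reproduce the verification period almost exactly. For model fields the RMSE would be bounded below by the variability that the retained EOFs cannot represent.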

In addition to the OSSEs using a TIO mooring array, we also attempt to simulate an integrated observing system that includes fixed observations from a simulated Argo array. The locations of the idealized Argo array are shown in Fig. 1b. The full Argo array is intended to have nominal horizontal resolution of 3°, with one temperature and salinity profile obtained every 10 days (Argo Science Team 1998). We have chosen to simulate an idealized Argo array with a horizontal resolution of 6°, since we only expect about half of the Argo observations to be available for any single analysis. We also note that, while we have included OSSEs using simulated Argo observations for the intraseasonal MLD, the temporal sampling of one profile every 10 days is inadequate to properly resolve the short time-scale variability associated with intraseasonal oscillations (Schiller et al. 2004). Because we are focused here on determining the locations of a mooring array, rather than an integrated array, we have not focused our attention on experiments that take Argo observations into consideration.

## 3. Results

### a. Model variability

Before proceeding with the results from a series of OSSEs, we examine some salient features of the modeled circulation that are relevant to this study. The mean fields of D20 and MLD from years 1–12 of the model run and from the National Oceanographic Data Center (2001) are shown in Fig. 3. The mean D20 compares well with that from the National Oceanographic Data Center (2001), showing the shallowest D20 fields at about 8°S in the southwest TIO. The mean modeled MLD shows the shallowest MLDs a few degrees south of the equator, in qualitative agreement with the results of the National Oceanographic Data Center (2001).

The cumulative percent variance explained by the dominant EOF modes for D20 and MLD, averaged over three different time series (years 1–6, 7–12, and 1–12), is shown in Fig. 4. This analysis demonstrates that the first few EOFs explain most of the variability, with the first EOF explaining up to 30% of the total variance. The first 6, 12, and 24 EOF modes explain about 72%, 85%, and 94% of the variance of D20 and 68%, 77%, and 87% of the variance of MLD.
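
The cumulative percent variance can be computed directly from the singular values of the anomaly matrix, since the variance carried by each EOF is proportional to its squared singular value. The sketch below is illustrative; the function name and the small test matrix are hypothetical and are unrelated to the values shown in Fig. 4.

```python
import numpy as np

def cumulative_variance_percent(anomalies):
    """Cumulative percent variance explained by successive EOFs.

    `anomalies` has shape (time, grid) with the temporal mean removed.
    """
    s = np.linalg.svd(anomalies, compute_uv=False)  # sorted largest first
    variance = s ** 2
    return 100.0 * np.cumsum(variance) / variance.sum()

# Toy anomaly matrix with singular values 3, 2, and 1.
toy = np.diag([3.0, 2.0, 1.0])
percent = cumulative_variance_percent(toy)
```

For the toy matrix the variances are 9, 4, and 1, so the first mode accounts for 9/14 of the total and the first two modes for 13/14.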

To quantify the degree to which the dominant EOFs represent the total variability in the modeled TIO, we present the standard deviation of D20 and MLD for years 1–12 of the model in Fig. 5. For comparison, we also show the residuals of the reconstructed field using EOFs 1–6 and 1–12. Here, the residual is the original field minus the reconstructed field. The EOFs used for this analysis are the same as those used in the OSSEs described below, from years 1–6. The residual fields in Fig. 5 can therefore be regarded as a lower limit of what we might expect for the RMSE in the OSSEs that use these EOFs and map the observations perfectly.

The analysis presented in Fig. 5 shows that the strongest interannual variability is south of the equator, and particularly south of 8°S (Masumoto and Meyers 1998; Schott and McCreary 2001). The residual fields demonstrate that the reconstruction of D20 using EOFs 1–12 explains virtually all of the variability in the model fields. By contrast, the reconstruction using EOFs 1–6 represents proportionally less of the original signal, with regions of relatively high residual in the Bay of Bengal and south of about 8°S, particularly in the southwest TIO.

The fields in Fig. 5 demonstrate that the reconstructed MLD using EOFs 1–12 typically explains about 70% of the original fields, with relatively high residuals in the northeast TIO, where MLD has its strongest signal, with a local maximum along the equator between 90° and 100°E (Webster et al. 2002). The reconstruction using EOFs 1–6 again explains proportionally less of the original fields. The magnitudes of the residuals using EOFs 1–6 are comparable to the standard deviations in the northeast TIO. The residuals of the reconstructed fields for D20 are quite small relative to the standard deviation of the original field (Fig. 5). This demonstrates that the variations of D20 are typically spatially coherent and are therefore well represented by the high-eigenvalue EOFs. By contrast, the residuals for the reconstructed fields for the intraseasonal MLD are somewhat large relative to the standard deviation of the original field. This demonstrates that the variations of the MLD are less spatially coherent than those of D20, requiring more EOFs to explain the variability across the whole TIO. This can also be seen in Fig. 4, which shows that the first 6 and 12 EOFs of the MLD explain less of the total variance than the first 6 and 12 EOFs of D20. This suggestion is consistent with the findings of Schiller et al. (2004), who showed that fairly dense spatial deployments of Argo floats are required to adequately observe oceanic variations on intraseasonal time scales. We therefore do not expect the analysis system in (2.1)–(2.3) to work as well for the MLD as for D20, using the same number of EOFs, since less of the detail of the MLD is resolved by the dominant EOFs compared to those of D20.

### b. Assessment of the proposed array

An assessment of the suitability of the proposed mooring array (Fig. 1) for the analysis system described above is provided through a series of OSSEs using the experiment design outlined in section 2d. We perform OSSEs for D20 and MLD independently using the first 6 and 12 EOFs and simulated observations from the locations of the proposed mooring array only, without including observations from Argo locations. The root-mean-squared errors (RMSEs) between the true fields and the analyzed fields for years 7–12 are shown in Fig. 6 for this series of OSSEs. Here, the error is defined as the true field minus the analyzed field. In all OSSEs considered, the RMSEs within 5° of the equator are small and are comparable to the residuals of the corresponding reconstructed fields presented in Fig. 5. This demonstrates that the analyses are quite good in this region. However, in the OSSEs using 12 EOFs there are regions of high RMSEs to the south of 8°S and around India (Fig. 6). This is also a region where the standard deviation of D20 is high (Fig. 5a). We conclude that the proposed mooring array does a good job of resolving the interannual variability over most of the TIO, but may undersample south of 8°S and in the Indian Ocean coastal currents.

For all configurations, the intraseasonal MLD is quite poorly represented in the analyses (Fig. 6). Based on the residuals presented in Fig. 5, this is not surprising. In many locations, the RMSE is comparable to, or greater than, the standard deviation of the intraseasonal MLD (Fig. 5d). Furthermore, the RMSEs in the OSSEs are greater than the residuals in Fig. 5 over the entire TIO. We therefore conclude that the proposed array is unlikely to adequately resolve the intraseasonal variability, represented here by the 93-day high-pass-filtered MLD.

### c. Optimal array

We apply a procedure for determining the optimal array of observations for D20 and MLD using the method described in section 2c. We determine an optimal array for D20 and MLD that consists of the best 33 observations (the same number as the proposed array), in the absence of Argo observations, for the analysis system when 6 and 12 EOFs are used. The details of the optimal array for the different configurations of the analysis system are shown in Fig. 7, along with the RMSEs of the analyzed fields. A different optimal array is determined for each configuration of the analysis system and for D20 and MLD independently. The first point to note here is that the optimal arrays for D20 and MLD are very different. The optimal arrays for D20 have similar characteristics, showing observations concentrated south of 8°S and aligned along the Indonesian coast. By contrast, the different optimal arrays for MLD show somewhat different distributions. The case with 6 EOFs shows observations concentrated along the equator, whereas the configuration with 12 EOFs shows observations scattered more evenly over the TIO, perhaps with a tendency for more observations in the central and eastern TIO.

A comparison between the RMSE using an optimal array (Fig. 7) and the proposed array (Fig. 6) demonstrates that the RMSE using an optimal array is much smaller. The most significant improvement in the analyses using an optimal array compared to the proposed array is for D20, particularly for the OSSEs using 12 EOFs. The large RMSEs around India and south of 8°S using the proposed array are not evident in the OSSEs using an optimal array, with very small RMSEs evident in those regions. We note that the RMSEs in Fig. 7 are very similar to the RMSEs of the corresponding residual maps in Fig. 5. This supports the suggestion that the arrays in Fig. 7 are close to optimal for the specific configurations of the analysis system.

For MLD the RMSE is lower for all of the OSSEs using the optimal array (Figs. 7c and 7d) compared to the OSSEs using the proposed array (Figs. 6c and 6d). Similarly, the RMSEs using the optimal arrays (Figs. 7c and 7d) are very similar to the RMSEs of the corresponding residuals of the reconstructed fields (Figs. 5e and 5f). This demonstrates that the analysis system has some skill when an optimal array is used.

### d. Consolidated array

It is clear from the configurations of the optimal arrays presented in Fig. 7 that the details of the optimal array are sensitive to the configuration of the analysis system, particularly to the number of EOF modes included in the analysis system and the variable to which the system is applied. Additionally, we recognize that the unstructured nature of the optimal arrays discussed above would make them extremely difficult to implement and maintain for logistical reasons. As a consequence, no single realization of an optimal array identified in this study is appropriate for the Indian Ocean mooring array. We therefore seek to describe a consolidated array that incorporates the general features of the individual optimal arrays and is more structured. To construct the consolidated array, we generate a map of the relative frequencies (RFs) of the locations identified in the OSSEs described above (Fig. 8). There are 12 OSSEs (using the three time series, using 6 and 12 EOFs, with and without Argo) for each variable, making a total of 24 OSSEs. These RF maps are produced for a grid that has resolutions of 6° and 2.5° in the zonal and meridional directions, respectively. This grid is deliberately chosen to be much coarser than the model grid in order to impose more structure on the consolidated mooring array. The RF map for D20 (Fig. 8a) demonstrates that D20 is well observed with observations around 12°S in the central and western Indian Ocean, and off the coast of Sumatra. By contrast, the RF map for the MLD (Fig. 8b) suggests that the MLD is well observed along the equator in the central Indian Ocean, as well as a couple of degrees south in the western Indian Ocean. Figure 8c shows the RF map of the combined OSSEs for D20 and MLD. This map is used to objectively construct the consolidated array. The 33 locations with the highest RFs in the 24 OSSEs considered here are chosen as locations for the consolidated array. This array reflects the general features of the RF maps for D20 and MLD.
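
The construction of the consolidated array can be sketched as a two-dimensional histogram followed by a ranking of cells. This is an illustration under assumptions, not the procedure used to produce Fig. 8: the function name `consolidated_array`, the domain bounds, the normalization of the RFs by the total count, and the three tiny example arrays are all hypothetical.

```python
import numpy as np

def consolidated_array(arrays, lon_edges, lat_edges, n_sites):
    """Relative-frequency map of mooring sites and the n_sites most frequent cells.

    `arrays` is a list of (k, 2) arrays of (lon, lat) locations, one per OSSE.
    Returns the RF map and the (lon, lat) centers of the selected cells.
    """
    counts = np.zeros((len(lon_edges) - 1, len(lat_edges) - 1))
    for locs in arrays:
        h, _, _ = np.histogram2d(locs[:, 0], locs[:, 1],
                                 bins=[lon_edges, lat_edges])
        counts += h
    rf = counts / counts.sum()                       # assumed normalization
    order = np.argsort(rf, axis=None)[::-1][:n_sites]  # most frequent cells first
    i, j = np.unravel_index(order, rf.shape)
    lon_c = 0.5 * (lon_edges[:-1] + lon_edges[1:])
    lat_c = 0.5 * (lat_edges[:-1] + lat_edges[1:])
    return rf, np.column_stack([lon_c[i], lat_c[j]])

# Coarse grid: 6 degrees in longitude, 2.5 degrees in latitude (bounds assumed).
lon_edges = np.arange(40.0, 113.0, 6.0)
lat_edges = np.arange(-20.0, 22.5, 2.5)

# Three hypothetical optimal arrays with two sites each.
arrays = [np.array([[80.0, 0.0], [60.0, -12.0]]),
          np.array([[81.0, 0.5], [95.0, 5.0]]),
          np.array([[79.0, 1.0], [61.0, -11.0]])]
rf, sites = consolidated_array(arrays, lon_edges, lat_edges, n_sites=2)
```

Nearby sites from different experiments fall into the same coarse cell, which is how the coarse grid pulls the scattered optimal arrays toward a more regular layout.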

The OSSEs described in sections 3b and 3c are repeated using observations from the consolidated array. The RMSEs for these OSSEs are shown in Fig. 9. These results show that for all configurations of the analysis system, the consolidated array outperforms the proposed array, but is not quite as good as the optimal array. For both D20 and MLD, the OSSEs using the consolidated array and 6 or 12 EOFs give RMSEs that are comparable to those of the optimal array (Fig. 7) and are close to the residuals in Fig. 5. This demonstrates that the consolidated array is close to optimal and some features in it could be considered for the TIO mooring array.

### e. Statistical comparison of arrays

The OSSEs presented above demonstrate that the performance of the analysis system is sensitive to the different observation arrays that are used. In all cases, the optimal array outperforms the proposed and consolidated arrays. To further demonstrate this and to relate this to the optimization method described in section 2, the basin-averaged cross correlations and RMSEs are shown in Fig. 10, along with the condition number of $(\mathbf{H}\mathbf{M})^{T}\mathbf{H}\mathbf{M}$ for each configuration.

Comparison of the RMSEs and cross correlations in the OSSEs with the condition number of the array for each configuration (Fig. 10c) demonstrates that the condition number for the proposed array is much greater than that of both the consolidated and optimal arrays, and further that the condition number for the consolidated array is always greater than for the optimal array.

We find that when 6 EOFs are used, the condition number for the optimal array is nearly 1.0, and that this increases for the OSSEs using 12 EOFs. This indicates that 33 observations can discriminate between 6 EOFs better than they can discriminate between 12 EOFs, as we expect. In addition to the OSSEs described above, we have also performed a series of OSSEs using more EOFs. We find that when we use more than 20 EOFs, the condition number for all OSSEs increases by an order of magnitude compared to when 12 EOFs are used. This demonstrates that a 33-station mooring array does not properly resolve, or distinguish between, 20 EOFs or more. As a consequence of this, results from OSSEs using more EOFs give the counterintuitive result that the RMSE actually increases when more information is included through the extra EOFs. This issue is resolved when an alternative analysis system that formally weights the observation and background errors, such as ensemble optimal interpolation (Oke et al. 2002, 2005), is used.

## 4. Discussion

One of the limitations of the approach that we present above is that a derived optimal array is unstructured and is therefore not necessarily practical. With moorings scattered throughout the ocean, routine maintenance of such an array may be prohibitively difficult and expensive. Indeed, both the Tropical Atmosphere Ocean (TAO) and Pilot Research Moored Array in the Tropical Atlantic (PIRATA) arrays are structured with moorings along lines of constant longitude and/or latitude. While we have attempted to address this issue through the consolidated array (Fig. 9), this array still lacks the degree of structure in the TAO, PIRATA, and indeed the proposed array (Fig. 1). It may therefore be more useful to consider only the general trends in the different configurations of the optimal arrays. For example, consider the meridional distribution of the best observation locations (from Fig. 8), with and without Argo observations, represented in the RF histograms in Fig. 11. These distributions do not change much when Argo observations are included. Figure 11 indicates that there is a tendency for the best observations of D20 to be made south of 10°S, where seasonal Rossby waves are known to be prevalent (Masumoto and Meyers 1998; Schouten et al. 2002), and to a lesser extent, 4°–5° north and south of the equator, where equatorial Rossby waves typically propagate. Importantly, the distribution in Fig. 11 suggests that observations directly along the equator are not important for resolving interannual variability. However, we note that this may, in part, be a reflection of the model’s inability to adequately represent all of the dynamical processes that are important along the equator, for example, equatorial Kelvin waves.

The meridional distribution of the best observation locations for intraseasonal MLD (Fig. 11) shows that the best places for monitoring intraseasonal oscillations tend to be along the equator, and perhaps a few degrees south of the equator. These are arguably the latitudes of the maximum amplitudes of intraseasonal oscillations in the TIO (Webster et al. 1998).

Consideration of the zonal distribution of the best observation locations (Fig. 12) shows that the best longitudes for resolving variations of D20 tend to be west of 60°E and at 95°E off Sumatra (Figs. 8 and 12a). The region west of 60°E corresponds to the longitudes where the interannual variations of D20 are greatest (Fig. 5a) and is also where the largest amplitudes in the IOD are known to occur (Saji et al. 1999; Webster et al. 1999). Similarly, the best longitudes for resolving the intraseasonal MLD tend to be around 90°E. Again, this region has been identified by Webster et al. (2002) as the site where intraseasonal oscillations have the largest amplitude.

The RF maps (Fig. 8) and the consolidated array (Fig. 9) demonstrate that the best locations for monitoring seasonal to interannual variations are south of 8°S, at about 4°–5° from the equator and along the coast of Indonesia. These regions correspond to the locations of the maximum amplitude of seasonal Rossby waves (Masumoto and Meyers 1998; Schouten et al. 2002), equatorial Rossby waves, and strong IOD events (Murtugudde et al. 2000), respectively, and were also identified as key observation locations by Ballabrera-Poy et al. (2007).

We find that the best locations for resolving the intraseasonal variability are along the equator, particularly east of 70°E. The consolidated array that we present in Fig. 8c includes a mooring line at about 95°E that extends from 5°S to 10°N. This is similar to the optimal array of Ballabrera-Poy et al. (2007), which includes a mooring line at 95°E that extends from 12°S to 10°N. The proposed array includes a mooring line at 95°E that extends from 14° to 3°S and is supplemented by a mooring line at 90°E that extends from the equator to 15°N (Fig. 1). We also identify the importance of a mooring array south of India at 77°E; Ballabrera-Poy et al. (2007) similarly identify this region as important, at about 76°E.

We note that many realizations of the optimal array (Fig. 8) indicate that an important region for observing both D20 and MLD is along the coast of Indonesia and, particularly, off of Sumatra. This is a region where upwelling is very strong during oscillations of the IOD (e.g., Saji et al. 1999; Webster et al. 1999) and can be regarded as a coastal extension of the tropical waveguide (Sprintall et al. 2000; Wijffels and Meyers 2004). We therefore regard it as a logical place to monitor the TIO circulation. However, we understand that this region is very heavily fished, and that vandalism of these surface moorings may be a serious consideration. Vecchi and Harrison (2007) also note that the region off the coast of Indonesia is important, particularly for interpreting interannual variability. They suggest that weekly XBT surveys or autonomous gliders may be appropriate choices for sampling in this region.

## 5. Conclusions

Through a series of OSSEs we conclude that while the proposed array does a reasonable job of resolving seasonal to interannual variabilities, it may not adequately resolve intraseasonal variability. In general, we suggest that the proposed array probably oversamples the region within 3° of the equator in the western TIO and undersamples the region south of 8°S where seasonal Rossby waves are prevalent. Based on results from a series of OSSEs using a reduced-order Kalman filter, Ballabrera-Poy et al. (2007) also suggest that the proposed array may oversample the region within a few degrees of the equator. However, Vecchi and Harrison (2007) suggest that Argo observations should give good coverage poleward of 5° and that equatorward of 5° moored buoys and XBT lines are essential for completing the integrated observing system in the Indian Ocean.

We apply a procedure for objectively determining the array of observations that is optimal for different configurations of the analysis system presented in section 2. We find that the configuration of the optimal array is sensitive to the details of the analysis system, namely the number of EOFs used and the period over which the EOFs are calculated. The optimal array is also different when fixed observations are included that represent an idealized Argo array, thus simulating an integrated observing system. We therefore compile an RF map of optimal arrays for D20 and MLD for different configurations of the analysis system and for applications with and without Argo observations (24 different configurations in total). We use this RF map to objectively determine a single, albeit less optimal, array that we refer to as the consolidated array. We repeat the series of OSSEs using the consolidated array and show that it outperforms the proposed array for all configurations of the analysis system for both D20 and MLD. Although there are limitations to the method described in this study, we argue that some aspects of the consolidated array should be considered prior to the implementation of the mooring array. This approach could be strengthened by a multimodel application, thus giving more robust and reliable results.
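The compilation of an RF map from many optimal arrays can be illustrated with a minimal sketch. It rests on several assumptions that do not come from this study: the candidate grid size (`n_candidates`), the synthetic random selections standing in for the 24 optimal arrays, and the rule of keeping the most frequently selected locations. The helper names `relative_frequency` and `most_frequent_locations` are hypothetical, and the actual selection rule used to build the consolidated array may differ.

```python
import numpy as np


def relative_frequency(selections, n_candidates):
    """Fraction of configurations in which each candidate location was selected."""
    counts = np.zeros(n_candidates)
    for selected in selections:
        # np.unique guards against a location being counted twice per configuration
        counts[np.unique(selected)] += 1
    return counts / len(selections)


def most_frequent_locations(rf, n_moorings):
    """Indices of the n_moorings candidates with the highest relative frequency.

    Assumed selection rule, for illustration only.
    """
    order = np.argsort(-rf, kind="stable")
    return np.sort(order[:n_moorings])


# Synthetic stand-in for 24 optimal arrays of 33 moorings each (not real results).
rng = np.random.default_rng(0)
n_candidates, n_moorings, n_configs = 200, 33, 24
selections = [
    rng.choice(n_candidates, size=n_moorings, replace=False)
    for _ in range(n_configs)
]

rf = relative_frequency(selections, n_candidates)
consolidated = most_frequent_locations(rf, n_moorings)
print("candidate indices kept:", consolidated)
```

With real optimal arrays in place of the random selections, `rf` reshaped onto the model grid would play the role of the RF map, and binning it by latitude or longitude would give histograms of the kind discussed in section 4.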

## Acknowledgments

This research is funded by Australia’s CSIRO through appropriation funding and by the U.S. Office of Naval Research’s Ocean Modeling Program through Grant N000140410345. The authors thank Pavel Sakov, Stuart Corney, and Gary Brassington for useful discussions during the course of this research.

## REFERENCES

Argo Science Team, 1998: On the design and implementation of Argo: An initial plan for a global array of profiling floats. International CLIVAR Project Office Rep. 21, GODAE Rep. 5, GODAE Project Office, Melbourne, Australia, 32 pp.

Ballabrera-Poy, J., E. Hackert, R. Murtugudde, and A. J. Busalacchi, 2007: An observing system simulation experiment for an optimal moored instrument array in the tropical Indian Ocean. *J. Climate*, **20**, 3284–3299.

Barnston, A. G., and C. F. Ropelewski, 1992: Prediction of ENSO episodes using canonical correlation analysis. *J. Climate*, **5**, 1316–1345.

Barth, N., and C. Wunsch, 1990: Oceanographic experiment design by simulated annealing. *J. Phys. Oceanogr.*, **20**, 1249–1263.

Bishop, C. H., B. J. Etherton, and S. J. Majumdar, 2001: Adaptive sampling with the ensemble transform Kalman filter. Part I: Theoretical aspects. *Mon. Wea. Rev.*, **129**, 420–436.

Chen, D., L. M. Rothstein, and A. J. Busalacchi, 1994: A hybrid vertical mixing scheme and its application to tropical ocean models. *J. Phys. Oceanogr.*, **24**, 2156–2179.

CLIVAR–GOOS Indian Ocean Panel and Coauthors, 2006: Understanding the role of the Indian Ocean in the climate system—Implementation plan for sustained observations. WCRP Informal Rep. 5/2006, ICOP Publ. Series 100, GOOS Rep. 152, 76 pp. [Available online at http://www.ioc-goos.org/index.php?option=com_oe&task=viewDoclistRecord&doclistID=3.]

Evensen, G., 2003: The ensemble Kalman filter: Theoretical formulation and practical implementation. *Ocean Dyn.*, **53**, 343–367.

Farrel, B. F., and P. J. Ioannou, 1996: Generalized stability theory. Part II: Nonautonomous operators. *J. Atmos. Sci.*, **53**, 2041–2053.

Haber, J., F. Zeilfelder, O. Davydov, and H-P. Seidel, 2001: Smooth approximation and rendering of large scattered data sets. *Proc. IEEE Visualization 2001*, 341–347. [Available online at http://doi.ieeecomputersociety.org/10.1109/VISUAL.2001.964530.]

Hackert, E. C., R. N. Miller, and A. J. Busalacchi, 1998: An optimized design for a moored instrument array in the tropical Atlantic Ocean. *J. Geophys. Res.*, **103**, 7491–7509.

Hirschi, J., J. Baehr, J. Marotzke, J. Stark, S. Cunningham, and J-O. Beismann, 2003: A monitoring design for the Atlantic meridional overturning circulation. *Geophys. Res. Lett.*, **30**, 1413, doi:10.1029/2002GL016776.

International Core Steering Committee for THORPEX, 2005: Final report. WMO Tech. Doc. 1268, WWRP/THORPEX 4, CD-ROM.

Kalnay, E., and Coauthors, 1996: The NCEP/NCAR 40-Year Reanalysis Project. *Bull. Amer. Meteor. Soc.*, **77**, 437–471.

Kleeman, R., and S. B. Power, 1995: A simple atmospheric model of surface heat flux for use in ocean modeling studies. *J. Phys. Oceanogr.*, **25**, 92–105.

Kuo, T. H., X. Zou, and W. Huang, 1998: The impact of global positioning system data on the prediction of an extratropical cyclone: An observing system simulation experiment. *Dyn. Atmos. Oceans*, **27**, 439–470.

Legler, D. M., I. M. Navon, and J. J. O’Brien, 1989: Objective analysis of pseudostress over the Indian Ocean using a direct-minimization approach. *Mon. Wea. Rev.*, **117**, 709–720.

Levitus, S., R. Burgett, and T. P. Boyer, 1994: *Salinity*. Vol. 3, *World Ocean Atlas 1994*, NOAA Atlas NESDIS 3, 99 pp.

Masumoto, Y., and G. Meyers, 1998: Forced Rossby waves in the southern tropical Indian Ocean. *J. Geophys. Res.*, **103**, 27589–27602.

McIntosh, P. C., 1987: Systematic design of observational arrays. *J. Phys. Oceanogr.*, **17**, 885–902.

Murtugudde, R., J. P. McCreary, and A. J. Busalacchi, 2000: Oceanic processes associated with anomalous events in the Indian Ocean with relevance to 1997–1998. *J. Geophys. Res.*, **105**, 3295–3306.

National Oceanographic Data Center, cited 2005: *World Ocean Database 2001*. [Available online at http://www.nodc.noaa.gov/OC5/WOD01/pr_wod01.html.]

Oke, P. R., J. S. Allen, R. N. Miller, G. D. Egbert, and P. M. Kosro, 2002: Assimilation of surface velocity data into a primitive equation coastal ocean model. *J. Geophys. Res.*, **107**, 3122, doi:10.1029/2000JC000511.

Oke, P. R., A. Schiller, D. A. Griffin, and G. B. Brassington, 2005: Ensemble data assimilation for an eddy-resolving ocean model of the Australian region. *Quart. J. Roy. Meteor. Soc.*, **131**, 3301–3311.

Pacanowski, R. C., 1995: MOM2 documentation user’s guide and reference manual, version 1.0. GFDL Tech. Rep. 3, 232 pp.

Reynolds, R. W., and T. M. Smith, 1994: Improved global sea surface temperature analyses using optimum interpolation. *J. Climate*, **7**, 929–948.

Saji, N. H., B. N. Goswami, P. N. Vinayachandran, and T. Yamagatta, 1999: A dipole mode in the tropical Indian Ocean. *Nature*, **401**, 360–363.

Schiller, A., 1999: How well does a coarse-resolution circulation model simulate observed interannual variability in the upper Indian Ocean? *Geophys. Res. Lett.*, **26**, 1485–1488.

Schiller, A., and J. S. Godfrey, 2003: Indian Ocean intraseasonal variability in an ocean general circulation model. *J. Climate*, **16**, 21–39.

Schiller, A., J. S. Godfrey, P. C. McIntosh, G. Meyers, and S. E. Wijffels, 1998: Seasonal near-surface dynamics and thermodynamics of the Indian Ocean and Indonesian Throughflow in a global ocean general circulation model. *J. Phys. Oceanogr.*, **28**, 2288–2312.

Schiller, A., J. S. Godfrey, P. C. McIntosh, G. Meyers, and R. Fiedler, 2000: Interannual dynamics and thermodynamics of the Indo-Pacific Oceans. *J. Phys. Oceanogr.*, **30**, 987–1012.

Schiller, A., S. E. Wijffels, and G. A. Meyers, 2004: Design requirements for an Argo float array in the Indian Ocean inferred from observing system simulation experiments. *J. Atmos. Oceanic Technol.*, **21**, 1598–1620.

Schott, F. A., and J. P. McCreary, 2001: The monsoon circulation of the Indian Ocean. *Prog. Oceanogr.*, **51**, 1–123.

Schouten, W. P., M. de Ruijter, P. J. van Leeuwen, and H. A. Dijkstra, 2002: An oceanic teleconnection between the equatorial and southern Indian Ocean. *Geophys. Res. Lett.*, **29**, 1812, doi:10.1029/2001GL014542.

Shinoda, T., and H. H. Hendon, 1998: Mixed layer modeling of intraseasonal variability in the tropical western Pacific and Indian Oceans. *J. Climate*, **11**, 2668–2685.

Sprintall, J., A. Gordon, R. Mutugudde, and D. Susanto, 2000: A semi-annual Indian Ocean forced Kelvin wave observed in the Indonesian Seas. *J. Geophys. Res.*, **105**, 17217–17230.

Stricherz, J., J. O’Brien, and D. Legler, 1992: *Atlas of Florida State University Tropical Pacific Winds for TOGA 1966–1985*. The Florida State University, 256 pp.

Vecchi, G. A., and M. J. Harrison, 2007: An observing system simulation experiment for the Indian Ocean. *J. Climate*, **20**, 3300–3319.

Webster, P. J., O. V. Magana, T. N. Palmer, J. Shukla, R. A. Thomas, M. Tanai, and T. Yasunari, 1998: Monsoons: Processes, predictability, and the prospects for prediction. *J. Geophys. Res.*, **103**, 14451–14510.

Webster, P. J., M. D. Moore, P. J. Loschnigg, and R. R. Leben, 1999: Coupled ocean–atmosphere dynamics in the Indian Ocean during 1997–98. *Nature*, **401**, 356–360.

Webster, P. J., and Coauthors, 2002: The JASMINE pilot study. *Bull. Amer. Meteor. Soc.*, **83**, 1603–1630.

Wijffels, S., and G. Meyers, 2004: An intersection of oceanic waveguides: Variability in the Indonesian Throughflow region. *J. Phys. Oceanogr.*, **34**, 1232–1253.

## APPENDIX

### Array Optimization for a Simple Example

Consider a 5 × 3 matrix $\mathsf{A}_1$, where the columns of $\mathsf{A}_1$ would correspond to the interpolated EOFs at the potential observation locations. Suppose we wish to eliminate two rows of $\mathsf{A}_1$ (two potential observation locations), leaving an optimized array of only three observations. We compute the condition number of $\mathsf{A}'^{\mathrm{T}}_{1}\mathsf{A}'_{1}$, where the prime indicates that a row from the matrix has been left out, for each of the five cases where each row is left out independently. We find that elimination of the first row results in a condition number of ∞, indicating that the observation that corresponds to this row is very important. Conversely, we find that elimination of the third row results in the smallest condition number. We therefore eliminate the third row, renormalize each column, and retain the matrix $\mathsf{A}_2$.

We then compute the condition number of $\mathsf{A}'^{\mathrm{T}}_{2}\mathsf{A}'_{2}$ for each of the four cases where each row is left out independently. We find that elimination of the first or second row results in a condition number of ∞, again indicating that the observations that correspond to these rows are very important. Conversely, we find that elimination of either the third or fourth row results in the same condition number of 2. We therefore eliminate either row and renormalize each column to retain the 3 × 3 identity matrix, an orthonormal matrix with a condition number of 1. So the optimal array is here defined as the observation locations that correspond to rows 1, 2, and either 4 or 5 of $\mathsf{A}_1$.
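The row-elimination procedure described in this appendix can be sketched in a few lines of Python. This is an illustrative sketch only. The matrix `A1` below is a hypothetical example, not the matrix used in the paper. It was chosen so that the elimination sequence matches the one described above: removing the first row gives an infinite condition number, the third row is dropped first, and the last two rows then tie. The function names and the singularity tolerance `tol` are likewise assumptions.

```python
import numpy as np


def normalize_columns(a):
    """Scale each column to unit Euclidean length."""
    return a / np.linalg.norm(a, axis=0)


def gram_condition_number(a, tol=1e-12):
    """Condition number of a.T @ a; inf when it is numerically singular."""
    eig = np.linalg.eigvalsh(a.T @ a)  # eigenvalues in ascending order
    if eig[0] <= tol * eig[-1]:
        return np.inf
    return eig[-1] / eig[0]


def eliminate_rows(a, n_keep):
    """Greedily drop the row whose removal gives the smallest condition number.

    Returns the 0-based indices of the rows of the input that are kept,
    together with the final column-normalized matrix.
    """
    a = normalize_columns(np.asarray(a, dtype=float))
    kept = list(range(a.shape[0]))
    while len(kept) > n_keep:
        conds = [
            gram_condition_number(np.delete(a, i, axis=0))
            for i in range(len(kept))
        ]
        drop = int(np.argmin(conds))  # ties resolve to the lowest index
        a = normalize_columns(np.delete(a, drop, axis=0))
        del kept[drop]
    return kept, a


# Hypothetical 5 x 3 example (rows = candidate locations, columns = modes).
A1 = np.array(
    [
        [1.0, 0.0, 0.0],
        [0.0, 1.0, 0.0],
        [0.0, 1.0, 1.0],
        [0.0, 0.0, 1.0],
        [0.0, 0.0, 1.0],
    ]
)

kept_rows, A_final = eliminate_rows(A1, n_keep=3)
print("rows kept (0-based):", kept_rows)
print(np.round(A_final, 3))
```

For this example the procedure keeps the first two rows and one of the last two, and the final renormalized matrix is the 3 × 3 identity. Which of the two tied rows survives depends on floating-point rounding, consistent with the "either 4 or 5" outcome in the text.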

Plot of two orthonormal functions (sine, thin; cosine, bold) and observation locations, denoted by the black circles and the dotted vertical lines, for an example of (a) optimal locations that perfectly discriminate between the two orthogonal functions and (b) a suboptimal array that does not discriminate between the two orthogonal functions. The matrix 𝗛𝗠 is shown for (a) and (b). (c) A map of the condition number of (^{T}

Citation: Journal of Climate 20, 13; 10.1175/JCLI4170.1

Mean (top) modeled and (bottom) observed (from Levitus et al. 1994) fields of (left) D20 and (right) MLD. Contour intervals (CIs) are 10 m (2.5 m) for D20 (MLD) and contours of less than 120 m (35 m) are halftones.

Cumulative percent variance explained by the dominant EOF modes for D20 (bold) and MLD (thin).

(a),(d) Std dev of (left) D20 and (right) MLD, and the residuals from the reconstructed fields using (b),(e) 6 and (c),(f) 12 EOFs. CIs are 2.5 m (0.5 m) for D20 (MLD) and contours of less than 15 m (5 m) for D20 (MLD) are halftones.

RMSE for (left) D20 and (right) MLD analyses from OSSEs for years 7–12 using (a),(c) 6 and (b),(d) 12 EOFs. The observation locations are denoted by the squares. CIs are 2.5 m (0.5 m) for D20 (MLD) and contours of less than 15 m (5 m) are halftones.

As in Fig. 6 but using an optimal array.

Map of the relative frequency that locations are selected in the OSSEs described in section 3 for (a) D20 and (b) MLD, and (c) both D20 and MLD. The locations of the consolidated array are also plotted in (c).

As in Fig. 6 but using the consolidated array.

Basin-averaged RMSEs (row a) and cross correlations (row b). The thick gray lines denote the RMS residual (row a) and the cross correlation (row b) between the original and reconstructed fields using 6 (left line) and 12 (right line) EOFs for years 1–6 from Fig. 5. Row c shows the condition number of (^{T}

Relative frequency histogram of the meridional distribution of the 33 optimal observation locations for the OSSEs described in section 3 for (left) D20 and (right) MLD.

As in Fig. 11 but for the zonal distributions.
