1. Introduction
The endurance and economy of Argo profiling floats revolutionized how oceanographers practice their craft by providing unprecedented spatially distributed and frequent observations. Since the development of the first Argo prototypes almost 30 years ago (Davis et al. 1992), the global Argo array has grown to include more than 3800 active floats and has collected over 20 years of data (Roemmich et al. 2019). Core Argo floats observe temperature and salinity from 2000 m to the surface every 10 days. The recent integration of miniaturized nitrate, oxygen, pH, and optical sensors has enabled the development of the biogeochemical (BGC) Argo float (Johnson et al. 2017). The Southern Ocean Carbon and Climate Observations and Modeling (SOCCOM) project (Johnson and Claustre 2016; Talley et al. 2019) has successfully deployed an array of BGC Argo floats in the Southern Ocean, using the same observing protocol as core Argo, and recently funded projects in multiple countries are deploying them globally as part of OneArgo.
The substantially higher cost of BGC Argo floats relative to core Argo floats heightens the need for informed strategic decisions about the optimal deployment positions for the BGC array. Any observing array should be optimized to sample the temporal and spatial scales of the phenomena of interest. Each variable measured by BGC floats has distinct scales of spatial and temporal variability. In the past, core Argo array deployment locations were selected to achieve a uniform distribution with a spacing of about 3° × 3° in latitude and longitude (Davis 1991; Roemmich et al. 1998). With increased prior knowledge of the ocean state, recent core Argo array proposals have suggested increasing array density where variability is higher, along the equator and in western boundary regions (Roemmich et al. 2019). The Biogeochemical Argo Implementation Plan (Johnson and Claustre 2016) explored spatial variability of measured BGC variables and concluded that initial deployments should try to achieve a uniform distribution, while suggesting that uniform deployments should “be tested as more experience is obtained” (Johnson and Claustre 2016).
This raises the question: Should we update recommended deployment strategies now that more BGC Argo floats have been deployed? Past studies have quantified the improvement that BGC float observations make to either modeled or calculated fields of an individual BGC variable, for both random sampling (Johnson and Claustre 2016; Majkut et al. 2014) and snapshots of past Argo array distributions (Ford 2021; Kamenkovich et al. 2017). However, optimal design strategies on a global scale for the BGC Argo array deserve dedicated studies.
The work presented here addresses the operational concern of predicting future float locations. It is one piece of a broader effort to develop an optimal ship-based float deployment strategy for the integrated core and BGC Argo arrays. We propose an array design strategy that consists of three innovations:
- The system will statistically predict the future location of currently deployed instruments to recognize the gaps in coverage at the time of deployment.
- The array strategy will account for the global inhomogeneities of BGC variables in spatial covariance and temporal variance by putting greater float density in regions of high temporal variability and low spatial covariance.
- The optimal array strategy will account for the cross covariance of the full BGC Argo float sensor suite by considering the additional constraint imposed by the prior knowledge of covarying properties.
The latter two developments are left for a companion manuscript (Chamberlain et al. 2023). Several successful pilot projects have created regional BGC float arrays in regions that play an outsized role in global biogeochemistry (Morrison et al. 2015). The largest regional array, the SOCCOM program, deploys floats in unique provinces of the Southern Ocean by analyzing observed float trajectories and numerical particle release experiments (Talley et al. 2019). This approach has shortcomings: visual analysis of previous Argo trajectories is subjective, while particle release experiments do not consistently reproduce actual Argo trajectories (Talley et al. 2019).
Indeed, the ocean is a complex system, and Lagrangian trajectories can be challenging to predict deterministically. Figure 1 shows an example of historic float trajectories passing through a region off Cape Agulhas; there exists a time-varying bifurcation of float trajectories in this complex current system (Boebel et al. 2003; Van Sebille et al. 2010). Argo float trajectories depend on the mesoscale eddy field, which models may not resolve. Eddies result from the intrinsic instabilities of the ocean, and even in eddy-resolving assimilating models the positions and timing of observed eddies can differ from modeled eddies due to constraints on dynamical consistency. The rectified effect of these unresolved or omitted processes is typically stochastically parameterized in Lagrangian models (Van Sebille et al. 2018). Argo floats also experience ocean shear during their ascent and descent and are advected by winds, currents, and waves at the surface. Some Argo-derived velocity products do not include these processes (Gray and Riser 2014), or attempt to actively remove them (Sevellec et al. 2017; Ollitrault and Rannou 2013; Gille and Romero 2003).
Since Argo floats are not propelled, Argo managers should carefully choose deployment locations to optimize Argo array distribution throughout the lifetime of the float. To address these challenges, we generate a statistical model, known as a transition matrix, from existing Argo array trajectories to predict the probability density function (PDF) of future float locations. The large number of Argo float trajectories can be used to diagnose the probability that a float transitions from one location to another in a given time step (Fig. 2). Transition matrices represent a potential complement to dynamical models because they contain a probabilistic representation of the complexities of the eddy field, ocean shear, and surface processes that models may miss. Transition matrices are also a way of quantifying the information gained from visually inspecting previous float trajectories.
Transition matrices are an established method (Markov 1906; Van Sebille et al. 2012; Maximenko et al. 2012; Sevellec et al. 2017; Drouin et al. 2022; Miron et al. 2022; Abernathey et al. 2022) to model semi-Lagrangian ocean drifters, and have been primarily applied to surface ocean drifters (Maximenko et al. 2012; Van Sebille et al. 2012) and surface drifter array design (Lumpkin et al. 2016). Sevellec et al. (2017) generated a transition matrix based on processed Argo trajectories obtained from the ANDRO dataset (Ollitrault and Rannou 2013) to study the evolution of deep water masses. The ANDRO dataset estimates the displacements of Argo floats at their drift depth and removes displacements due to ocean shear and surface currents. However, these displacements are important for operational float prediction and should not be removed, so for float deployment analyses a new transition matrix was required.
Here we first broadly explain the relatively simple theory of the construction and use of transition matrices (section 3). The biases and uncertainties of the estimates that transition matrices produce are sensitive to spatial and temporal spacing and are quantified in section 4a. In this section, we also define the criteria for the optimal transition matrix that minimizes these biases and uncertainties, based on the available datasets.
To apply this analysis, in section 4b we use our optimal transition matrix to estimate the future density of the existing core Argo array and the future array health of the core Argo and SOCCOM arrays. In section 4c, we consider the potential future locations of floats deployed from upcoming GO-SHIP cruise tracks over the float life cycle of 5 years. Then, in section 4d, we quantify the different drift patterns of Argos system and Iridium-equipped floats using transition matrices derived separately from these two sets of float trajectories.
BGC Argo floats do not all carry the same sensor suite, and some BGC sensors are more ubiquitous in the ocean than others. Estimating where individual BGC sensors will observe the ocean is important for BGC Argo managers to determine where to deploy BGC floats and where the gaps will be in our BGC observing systems. In section 4e, we quantify sampling probability by BGC sensor type.
Finally, in section 4f, the Argo float transition matrix is compared against a transition matrix derived from modeled particle trajectories in the Southern Ocean State Estimate (SOSE) (Mazloff et al. 2010). Modeled particles were programmed to profile to the surface every 10 days, similar to real Argo floats, and were advected by SOSE currents.
2. Data
a. Argo floats
These results use Argo float trajectories from the October 2022 Argo snapshot (Argo 2022). Data processing excluded trajectories that met any of the following conditions: poor quality flags for position, time, and pressure; floats with problematic file formats; floats that were functioning in a manner outside of core Argo mission parameters of 1000 dbar drift depths and 10-day surface intervals; and floats that had unrealistic velocities or traveled over 500 nautical miles between successive positions. After these quality control procedures, the total dataset comprised 2 167 492 positions collected by 14 331 unique floats. These trajectories were measured from 13 May 1998 to 10 October 2022, and spanned the globe from 77.7°S to 89.7°N.
Floats have used two different means for satellite communications: 8506 floats used the Argos system constellation, and 5825 floats used Iridium (Fig. 3). The Iridium and Argos constellations of satellites are both low-Earth orbiting. Floats transmit their data more efficiently to Iridium satellites than to Argos; therefore, Iridium-enabled floats typically spend less than 1 h at the surface compared to the 8 h that Argos-enabled floats usually spend at the surface.
b. Southern Ocean State Estimate
To validate the results based on Argo float trajectories, we also built an independent transition matrix from a Lagrangian particle release experiment in the SOSE (Mazloff et al. 2010). SOSE is an eddy-permitting 0.16° configuration of the Massachusetts Institute of Technology General Circulation Model (MITgcm), which is fit by constrained least squares to available Southern Ocean satellite and hydrographic observations. The current SOSE version (iteration 100) spans 6 years (2005–10) and is calculated from 24.7° to 78°S. For the particle experiment using Octopus (http://github.com/jinbow/Octopus), we randomly released 10 000 particles over the spatial domain and tracked their motion over the full 6 years of model output. The particles in this release experiment were programmed to drift at 1000 m, dive to 2000 m, then surface once every 10 days with an ascent mission and surface time similar to real Argo floats to simulate the effect of upper-ocean shear on particle trajectories.
3. Methods
First, we spatially and temporally quantize the trajectory data by a defined time step and spatial grid (Fig. 2). These choices define the dimension of the state space S and the nature of the discrete-time Markov chain. For our application, these choices determine the spatial grid of latitude and longitude that Argo floats transition between and the time step in days of these transitions.
A consideration of space–time resolution is the representativeness of positioning: the approximation-induced error of projecting continuous Argo trajectories onto a discrete spatial grid increases when the grid cells are larger. This approximation can lead to biases and uncertainties in estimating future float distribution produced by a transition matrix. For example, if the spatial grid size is too large, some floats may enter along the edge of grid cells and quickly leave; this will cause a bias in transition statistics compared with floats that transit across the entirety of the grid cell. Using large grid cells can also result in underestimating float motion when the grid spacing is larger than coherent structures in the mesoscale eddy field, as floats can recirculate within the grid cell without transitioning to an adjacent grid cell.
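The projection of continuous positions onto a discrete grid can be sketched as follows. This is a minimal illustration, not the processing code used for this study: the function name `cell_index`, the flat indexing convention, and the default 2° × 2° spacing are assumptions made for the example.

```python
import numpy as np

def cell_index(lat, lon, dlat=2.0, dlon=2.0):
    """Map latitude/longitude (degrees) to a flat grid-cell index.

    Hypothetical helper: cells are dlat x dlon degrees, numbered row by
    row starting at 90S, 180W.
    """
    lat = np.asarray(lat, dtype=float)
    # Wrap longitude into [-180, 180) so 180E and 180W share a cell.
    lon = np.mod(np.asarray(lon, dtype=float) + 180.0, 360.0) - 180.0
    n_lon = int(round(360.0 / dlon))
    n_lat = int(round(180.0 / dlat))
    # Clip so that lat = +90 falls in the northernmost row.
    i_lat = np.clip(np.floor((lat + 90.0) / dlat).astype(int), 0, n_lat - 1)
    i_lon = np.floor((lon + 180.0) / dlon).astype(int)
    return i_lat * n_lon + i_lon
```

With this convention, two positions anywhere inside the same cell (for example near its center and near its edge) map to the same state, which is the source of the representativeness error discussed above.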
Decreasing gridcell size reduces position discrepancies between grid centers and edges. However, it can also lead to fewer transition data per grid cell as smaller grid cells will typically have fewer floats pass through them. Transition matrices are uninformed by dynamics and need many data to resolve skillful transition statistics. Section 4a explores the trade-offs between gridcell size, time step, and data density. The distribution of floats relative to the nature of the velocity field is another potential bias; unequally spaced float arrays placed in fields of inhomogeneous diffusivity may infer biased velocity statistics that do not resolve the mean (Freeland et al. 1975; Davis 1991). For example, many floats placed in the middle of a random velocity field with no mean flow may appear divergent due to simple Brownian motion. For these calculations, we assumed the float density to be homogeneous.
Float trajectories incorporated into the transition matrix must be temporally statistically independent. Successful floats carry out extended missions (5 years or more), much longer than any transition matrix considered in this analysis. Therefore, we break these longer float trajectories down into shorter trajectories equal to the time step of the transition matrix. Figure 2 shows the segments corresponding to these shorter trajectories as black circles. Our algorithm tolerated small overlaps between the shorter trajectories to increase data density. The time separation between the start of the smaller trajectories was the greater of either 30 days (Gille and Romero 2003) or one-third of the transition matrix time step; e.g., the time separation for the start of trajectories must be 30 days for a 60-day time step and 60 days for a 180-day time step.
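The separation rule for the start of successive sub-trajectories can be written compactly. The sketch below is illustrative only; the function names and the representation of a trajectory by its total duration in days are assumptions for the example.

```python
def start_separation_days(time_step_days, minimum_days=30.0):
    """Separation between sub-trajectory starts: the greater of 30 days
    or one-third of the transition matrix time step."""
    return max(minimum_days, time_step_days / 3.0)

def segment_starts(trajectory_days, time_step_days):
    """Start times (days since the first position) of every sub-trajectory
    of length time_step_days that fits in one float trajectory."""
    separation = start_separation_days(time_step_days)
    starts = []
    t = 0.0
    while t + time_step_days <= trajectory_days:
        starts.append(t)
        t += separation
    return starts
```

For a 360-day trajectory and a 180-day time step, this yields sub-trajectories starting at days 0, 60, 120, and 180, so consecutive segments overlap, as tolerated by the algorithm described above.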
We now describe how we quantify the transition probability for an arbitrary spatial grid cell (which we call Cellblue in Fig. 2). This process must be repeated for all spatial grid cells in the domain. The number of spatial grid cells is equal to S, the dimension of the state space. First, all of the independent Argo profiles that start in Cellblue are identified (colored circles in Fig. 2).
The probability of a float transitioning from Cellblue (arbitrarily of index k) to Cellred (arbitrarily of index q) is then estimated as the fraction of the trajectories starting in Cellblue that are located in Cellred one time step later.
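A count-based estimate of this probability, repeated over all grid cells, can be sketched as follows. The function name, the column convention (column k holds the transitions out of cell k), and the treatment of cells with no data are assumptions for illustration and are not taken from the study's code.

```python
import numpy as np

def transition_matrix(start_cells, end_cells, n_cells):
    """Estimate P[q, k], the probability of moving from cell k to cell q
    in one time step, from paired start/end cell indices."""
    start_cells = np.asarray(start_cells, dtype=int)
    end_cells = np.asarray(end_cells, dtype=int)
    counts = np.zeros((n_cells, n_cells))
    # Accumulate one count per sub-trajectory; repeated pairs add up.
    np.add.at(counts, (end_cells, start_cells), 1.0)
    totals = counts.sum(axis=0)  # number of trajectories leaving each cell
    # Columns with no trajectories are left as zeros (assumption).
    return np.divide(counts, totals, out=np.zeros_like(counts),
                     where=totals > 0)
```

Each column with data sums to one, so the matrix maps a probability distribution of float positions at one time to the distribution one time step later.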
To quantify the transition matrix’s biases and uncertainties, we considered transition matrix performance at several different spatial and temporal resolutions. Time steps ranged from 30 to 180 days, and grid cells ranged in size from 1° × 1° to 4° × 6°. Table 1 lists these time steps and grid sizes.
Table 1. Spatial resolution of all transition matrices calculated and the temporal resolution (time step) of transition matrices calculated for each spatial resolution.
Low trajectory density or errors in the trajectory dataset can create isolated grid cells disconnected from the rest of the transition matrix. These points have no predictive value and are removed. Eigenvector decomposition and analysis (Miron et al. 2019; Froyland et al. 2014) provides intuition about the connected float distribution modes; regions defined by eigenvectors with eigenvalues close to one tend to have closed circulations and are difficult for floats to leave. To eliminate isolated grid cells, we required each of the eigenvectors associated with the 50 largest eigenvalues to span at least three grid cells, and each grid cell of the transition matrix to contain at least three transitions. The limiting values for minimum grid cells and transitions were chosen empirically based on the data density.
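The link between eigenvalues near one and regions that floats cannot leave can be seen in a toy example. The sketch below is not the filtering procedure used in this study; the four-cell matrix `P_toy` and the tolerance are hypothetical and chosen only to show the idea.

```python
import numpy as np

def count_near_unit_eigenvalues(P, tol=1e-8):
    """Count eigenvalues of P within tol of one."""
    eigenvalues = np.linalg.eigvals(P)
    return int(np.sum(np.abs(eigenvalues - 1.0) < tol))

# Toy matrix with two groups of cells that never exchange floats:
# cells 0-1 form one closed region and cells 2-3 form another.
P_toy = np.zeros((4, 4))
P_toy[:2, :2] = [[0.9, 0.2], [0.1, 0.8]]
P_toy[2:, 2:] = [[0.5, 0.5], [0.5, 0.5]]

n_closed = count_near_unit_eigenvalues(P_toy)  # one per closed region
```

Each disconnected region contributes its own eigenvalue of one, which is why small, isolated groups of cells show up in the leading eigenvectors and can be screened out.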
Ocean dynamics vary in time; therefore, the statistics of where floats are advected must also have seasonal and climate time scale variability. The Argo trajectory dataset is not sufficiently large to resolve seasonal dynamics, and, by necessity, we assumed these statistics are stationary in time. This assumption is a fundamental gap in our analysis, adding uncertainty to our estimates.
4. Results and discussion
Previous BGC Argo array design studies have considered both the actual Argo array at snapshots in time and randomly distributed float arrays (Johnson and Claustre 2016; Majkut et al. 2014; Ford 2021; Kamenkovich et al. 2017). The omission of float displacement has been a major limitation. Accounting for this float motion is critical for planning arrays over time spans long enough for instruments to drift significantly, as is the case with Argo. Ocean currents, and the trajectories of floats carried by these currents, follow predictable patterns. Inspired by this, we consider the construction and assessment of a transition matrix approach for float prediction in several applications: section 4a quantifies the biases and uncertainties of transition matrices of various spatial and temporal resolutions and presents our justification criteria for the optimal transition matrix; section 4b predicts the future distribution of the existing Argo array; section 4c predicts the future distribution of Argo floats deployed from planned GO-SHIP cruises within the next 5 years; section 4d estimates the regions of convergence and divergence for Argos system and Iridium floats; section 4e predicts the future sampling of existing BGC Argo floats broken down by sensor class; section 4f estimates the effective diffusivity of the SOSE model with derived transition matrices.
a. Bias and uncertainty quantification
Although computationally straightforward, a limiting assumption of the transition matrix is that it is a linear approximation to a nonlinear process (Lagrangian Argo trajectories in the ocean) (McAdam and van Sebille 2018). Given a dataset of finite size, the choice of resolution and time step is fundamentally a trade-off of model bias versus model uncertainty. Using short space and long time scales reduces the impact of errors from the linearity assumption (less bias). However, these dimensional choices also reduce the amount of data available to construct the transition matrix (more uncertainty). As such, the time step and grid spacing, which define the transition matrix, are factors that determine the accuracy of transition matrix prediction.
A representation of the difference in estimates derived from approximating a longer time step transition matrix by multiplying many short time step transition matrices together is shown for an example grid cell in the Antarctic Circumpolar Current (ACC) in Figs. 4a and 4b. While this is only one example, it illustrates the uncertainty in the linear approximation made in this method: approximating long-term float behavior with short-term float statistics will result in a smoothed PDF distribution. Indeed, this approximation-induced smoothing can change the first moment [Eq. (4)] of the future float PDF. Over long durations, these differences compound (Figs. 4c,d); in the aggregated, globally calculated statistics of this figure, misfit in mean transition and standard deviation generally decrease with increasing time step. The slope of the misfit decrease is similar among two groups of resolution: resolutions of 1° × 1°, 1° × 2°, 2° × 2°, and 2° × 3° of latitude and longitude; and resolutions of 3° × 3°, 4° × 4°, and 4° × 6° of latitude and longitude. The former had a lower misfit at the shortest time step and a shallower slope of misfit decrease with increasing time step. In comparison, the latter had a higher misfit at the shortest time step and showed a steeper misfit decrease with increasing time step. The misfit of the higher-resolution group also plateaued at the 90-day time step.
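The comparison described above, between a long time step matrix and the product of several short time step matrices, can be sketched as follows. The function names, the use of cell-center coordinates to compute a mean end position, and the neglect of longitude wrapping are simplifying assumptions for illustration.

```python
import numpy as np

def compose_short_steps(P_short, short_days, long_days):
    """Approximate a long time step matrix by repeated short steps."""
    if long_days % short_days != 0:
        raise ValueError("long_days must be a multiple of short_days")
    return np.linalg.matrix_power(P_short, long_days // short_days)

def mean_position_misfit(P_long, P_composed, cell_coords):
    """Distance, per start cell, between the mean end positions implied
    by two matrices. cell_coords has shape (n_cells, 2)."""
    mean_long = cell_coords.T @ P_long      # shape (2, n_cells)
    mean_comp = cell_coords.T @ P_composed
    return np.linalg.norm(mean_long - mean_comp, axis=0)
```

If float motion were exactly Markovian at the short time step, the two matrices would agree and the misfit would vanish; a nonzero misfit measures the smoothing introduced by the approximation.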
An example of the differences introduced by choices in spatial resolution is shown for the same example grid cell in the ACC (Figs. 5a,b). These differences have also been quantified for transition matrices of different grid resolutions and time steps (Fig. 5c). Mean misfit is proportional to time step and inversely proportional to resolution, and the misfit slope is generally lowest between the 60- and 90-day time steps.
Figure 6 shows standard error [Eq. (6)] for two matrices of differing resolution. Spatial area decreases geometrically with increasing resolution, and, broadly speaking, because the core Argo distribution is homogeneous, the number of Argo trajectories through a grid cell will be proportional to the gridcell size. Unsurprisingly, we see from this example that the lower-resolution transition matrix has a lower mean standard error and will produce estimates of the expected value and variance [Eqs. (4) and (5)] with less uncertainty.
For all transition matrices considered, the mean standard error is proportional to resolution and time step except for the 1° × 1° 30-day transition matrix, which has a lower mean standard error than its 1° × 2° 30-day counterpart (Fig. 5d). This is due to the specific criteria for transition matrix construction that exclude certain high-variance regions from the 1° × 1° transition matrix (viz., the southern ACC where trajectory variance is high and data density is low).
Comparing specific matrices and time steps, we notice that Fig. 4c shows substantial misfit improvement between the 1° × 2° and the 2° × 2° resolutions, but only slight improvement between the 2° × 2° and the 2° × 3° resolutions, as well as a curvature minimum for the 2° × 2° transition matrix at the 90-day time step. Figure 4d shows a misfit minimum for the 2° × 2° resolution transition matrix. Figure 5c shows a misfit plateau between the 60- and 90-day time steps in the 2° × 2° resolution. For these reasons, the 2° × 2° spatial resolution and 90-day time step is considered the optimal transition matrix configuration and was used in several subsequent calculations. The sensitivity of this transition matrix to data density was tested with a data withholding experiment. Those results are shown in appendix B.
The transition density of each grid cell (
b. Argo array prediction
Starting from the actual Argo float distribution of 10 October 2022, with 3262 floats, the transition matrix projected the float array density forward for 1 and 2 years. We show the current and projected float spacing of the array in Fig. 10 and resulting projected density maps for the Pacific, Atlantic, and Southern Ocean in Figs. 11–13, respectively. Argo floats older than 4 years were removed from the array estimate due to the high likelihood of poor sensor performance or float failure, as is a common practice by Argo managers. From these projections, large and growing holes in array distribution exist in the north-central and eastern equatorial Pacific; sparse distributions exist in the Benguela Current, and the middle of the North Atlantic subtropical gyre; and the Pacific sector of the Southern Ocean will become sparsely observed. Based on this October 2022 example, we would then recommend that Argo deployments prioritize ships transiting or conducting operations in these areas. Such a projection could be performed during each year’s Argo planning process.
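Projecting an array forward amounts to applying the transition matrix repeatedly to the present float count in each cell. The sketch below assumes a matrix whose columns hold the transitions out of each cell and a 90-day time step, so that four steps approximate 1 year; the function name is hypothetical.

```python
import numpy as np

def project_density(P, floats_per_cell, n_steps):
    """Expected number of floats per cell after n_steps time steps."""
    floats_per_cell = np.asarray(floats_per_cell, dtype=float)
    return np.linalg.matrix_power(P, n_steps) @ floats_per_cell

# Toy two-cell example: ten floats start in cell 0.
P_example = np.array([[0.9, 0.2], [0.1, 0.8]])
after_two_steps = project_density(P_example, [10.0, 0.0], 2)
```

Because each column of the matrix sums to one, the expected total number of floats is conserved; float failures would have to be handled separately, for example by removing older floats before projecting as described above.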
As another example of the utility of transition matrices, we demonstrate an improvement of the density/age map currently calculated and used by Argo managers (https://www.ocean-ops.org/board) as a metric of core Argo array health. Repopulating old or sparse regions of the network is a goal of Argo managers; the density/age map displays the density of Argo floats within a grid cell divided by the average age of those Argo floats. Array health maps use a present snapshot of float distribution and do not estimate the future density/age map. Procurement and cruise organization occur many months before putting a float in the water, and estimates of how Argo array health will change in the future could improve planning. The optimal transition matrix can propagate the density/age map forward in time to assess the future distribution of the array, which has been done for the core Argo and SOCCOM arrays (Fig. 14). This example shows core Argo array health deficits in the Southern Ocean and off the east coast of Africa.
c. Estimating future array density from float deployments along set ship tracks
Research ships of opportunity often deploy Argo floats, with the ship tracks set by other projects. BGC Argo floats are preferably deployed from projects such as GO-SHIP that provide high-quality biogeochemical data that can be used as a reference to validate BGC sensor calibrations. To determine what fraction of the ocean future GO-SHIP cruises may populate with floats, grid cells containing GO-SHIP cruise track lines were initialized with floats at the time these GO-SHIP cruises are scheduled to sail (Table 2 and red lines in Fig. 15). The optimal transition matrix estimated ocean sampling in the next 5 years based on these deployments. Typically, only a handful of BGC floats are deployed on any cruise; by initializing all grid cells along the track line, we show the greatest possible extent of sampling.
Table 2. Planned GO-SHIP cruises.
The resultant sampling densities and mean Lagrangian pathways (Fig. 15) show that in many of the regions where floats could be deployed, they do not travel very far during their lifetimes. Based on these projections, GO-SHIP alone cannot populate the world with floats. Holes will exist in the eastern and western equatorial Pacific, the Gulf of Mexico, the Gulf Stream, and the western tropical Indian Ocean. Moreover, the decadal GO-SHIP transects are not occupied frequently enough to retain optimal sampling density. To achieve uniform distributions of the core Argo and BGC Argo arrays, these areas will need additional ships of opportunity for deployments.
d. Iridium versus Argos system communications
Long-term differences between Argos system and Iridium-equipped float trajectories are well known and arise from the difference in surface transmission times (Wong et al. 2020). For the first time, the transition matrix methodology allows us to quantify the implications of the increased surface time of Argos-enabled floats distributed over many profiles. Argo floats transmit data through two distinct satellite constellations: Iridium and the Argos system. Floats have the hardware to transmit via one system or the other, but not both. Data transmission is much faster via the Iridium constellation. Consequently, Iridium-enabled floats spend about 15 min at the surface compared to their Argos counterparts, which can take up to 12 h. Surface velocities are also different from velocities at 1000 m depth (the typical Argo drift depth), and Argo floats are undrogued and advected by winds and waves while transmitting.
Dividing the full trajectory dataset into Argos-enabled and Iridium-enabled trajectories results in significantly less data density for both; we accommodate this by reducing spatial resolution. Transition matrices were constructed using 2° × 3° grid cells of latitude and longitude and a 180-day time step. The statistical difference between the Argos- and Iridium-enabled transition matrices is subtle and could not be distinguished from the null hypothesis by a Z test [Eq. (7)].
Transition matrices were multiplied by themselves 15 times to estimate the transition statistics after 8 years—the upper range of current Argo float lifetimes—to highlight the differences in transition statistics. We then uniformly seeded the World Ocean with theoretical Argo floats and considered the differences in resultant future float densities predicted with the transition matrix (Fig. 16). In the long-term estimates, the relative density of Iridium-enabled floats stays relatively uniform, and the regional differences in float density do not have a spatial structure consistent with known circulation. In contrast, the Argos system–derived prediction shows strong aggregation in the middle of the subtropical gyres and relative divergence of floats along the equator. This corresponds to divergence in surface currents similar to transition matrices derived from surface drifters (Van Sebille et al. 2012).
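The arithmetic of this long-term estimate can be made explicit. Multiplying a 180-day matrix by itself 15 times gives its 16th power, or 2880 days (about 7.9 years). The sketch below assumes a matrix whose columns hold the transitions out of each cell; the function name and the normalization by the mean density are choices made for illustration.

```python
import numpy as np

def long_term_relative_density(P, n_self_multiplications=15):
    """Relative float density after uniformly seeding every cell and
    applying P (n_self_multiplications + 1) times."""
    P_long = np.linalg.matrix_power(P, n_self_multiplications + 1)
    density = P_long @ np.ones(P.shape[0])
    return density / density.mean()

# 15 self-multiplications of a 180-day matrix span 16 time steps.
elapsed_years = 16 * 180 / 365.25
```

Values above one mark cells where floats are expected to aggregate, and values below one mark cells that floats tend to leave.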
This analysis has several implications. First, Argos-enabled floats do not stay on the equator because they are more susceptible to the divergent Ekman transport caused by easterly trade winds. In the Roemmich et al. (2019) vision of the future Argo array, the equator is a prioritized region for increased float density. This analysis supports the decision made by Argo managers that floats deployed near the equator be equipped with Iridium communications to prevent them from being advected off the equator during their time at the surface. Second, Argos- and Iridium-enabled floats move differently, especially on long time scales: the performance of hybrid transition matrices derived from both types of floats may be degraded, primarily in equatorial regions.
For this reason, our data products provide Argos system and Iridium transition matrices separately. However, improved resolution and data density in hybrid transition matrices may offer enhanced performance for shorter-duration predictions. Because the current Argo fleet is composed of Iridium- and Argos-enabled floats, the hybrid matrix has general utility for predicting Argo fleet dynamics. Further, the innovation of new sensors has increased the quantity of data that BGC floats transmit; the increased data require an average time of an hour at the surface (S. Riser 2022, personal communication), and, depending on the sensor suite, traditional core Argo Iridium-enabled float statistics may underrepresent surface advection.
e. BGC Argo sampling predictions
Temperature and salinity sensors are ubiquitous within the Argo fleet, but recently developed and more expensive BGC sensors are not. BGC float managers need to know where specific BGC sensors will be when planning deployment cruises.
Motivated by the spatial inhomogeneities of BGC sensors, we estimate the future probability of sampling by sensor class. For these calculations, we used a transition matrix with 3° × 3° grid cells of latitude and longitude and a 90-day time step to match the designed separation of the core Argo array (Roemmich et al. 1998). This is a departure from the optimal transition matrix of section 4a and is used to match maps like Fig. 10 commonly used by Argo managers.
Using Eq. (2), we find that the probability of current Argo sensors sampling a given region in the next year varies substantially by sensor class (Figs. 17 and 18). Temperature and salinity sensors achieve global sampling over the course of a year (Fig. 17). Oxygen sensors are the second most widely deployed, with a mean chance of any annual sampling of 35.5% over the spatial domain and no ocean regions omitted (Fig. 18a). Chlorophyll is the third most widely deployed sensor, with a mean chance of any annual sampling of 21.8% over the spatial domain and a potential hole in the northeast Atlantic Ocean (Fig. 18b). Finally, pH is the most sparsely deployed sensor, with a mean chance of any annual sampling of 15.8% over the spatial domain and holes in the Indian and northwest Pacific Oceans (Fig. 18c).
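One way to turn a transition matrix into a probability of any sampling is to combine the position probabilities of all floats carrying a given sensor. The sketch below is an illustrative approximation and not necessarily the form of Eq. (2): it assumes that floats, and successive time steps of the same float, can be treated as independent, and the function name is hypothetical.

```python
import numpy as np

def prob_any_sampling(P, start_cells, n_steps=4):
    """Approximate probability that each cell is sampled at least once
    within n_steps time steps by floats starting in start_cells.

    Assumes independence between floats and between time steps.
    """
    n_cells = P.shape[0]
    prob_none = np.ones(n_cells)
    for k in start_cells:
        state = np.zeros(n_cells)
        state[k] = 1.0
        for _ in range(n_steps):
            state = P @ state            # position PDF one step later
            prob_none *= 1.0 - state     # chance this float misses each cell
    return 1.0 - prob_none
```

With a 90-day time step, four steps cover roughly 1 year; averaging the result over all cells gives a single summary number comparable in spirit to the mean chances quoted above.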
Another important metric to consider is the regions of the ocean that will be sampled year-round. Historically, BGC variables have only been sampled during hydrographic cruises with follow-up cruises years or decades later. Indeed, BGC float observations in the Southern Ocean have led to discoveries about the seasonal variability of BGC variables following fully resolved seasonal observations (Gray et al. 2018). Equation (3) was used to calculate the chance of year-round observation for temperature and salinity observations (Fig. 17). Temperature and salinity have the highest probability of year-round sampling, with a mean chance of 44.9% of the domain covered. The BGC array is not yet a fully developed network, and year-round sampling has thus far rarely been achieved. The year-round oxygen sampling has a chance of 1.2% of the domain covered. Chlorophyll and pH have a substantially smaller than 1% chance of year-round sampling in the domain.
Strong currents, such as the ACC, require a uniform density of float coverage to achieve year-round sampling (Davis 1991), and creative methods such as creating regional composites of observations to resolve seasonal signals (Gray et al. 2018) may be necessary for some time.
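The year-round metric discussed above differs from the annual metric in that a cell must be sampled at every time step rather than at least once. The sketch below shows one way this could be computed; it is not a reproduction of Eq. (3). The column-stochastic convention, the independence of floats and of time steps, the function names, and the toy matrix are assumptions for illustration.

```python
def step(T, p):
    """Advance a probability vector one time step (matrix-vector product T p)."""
    n = len(T)
    return [sum(T[i][j] * p[j] for j in range(n)) for i in range(n)]


def chance_of_year_round_sampling(T, start_cells, n_steps):
    """Chance that each cell holds at least one float at every time step.

    T[i][j] is assumed to be the probability of moving from cell j to cell i.
    Independence between floats and between time steps is assumed.
    """
    n = len(T)
    states = []
    for cell in start_cells:
        p = [0.0] * n
        p[cell] = 1.0
        states.append(p)
    every_step = [1.0] * n
    for _ in range(n_steps):
        states = [step(T, p) for p in states]
        for i in range(n):
            miss = 1.0  # probability that no float is in cell i at this step
            for p in states:
                miss *= 1.0 - p[i]
            every_step[i] *= 1.0 - miss
    return every_step


# Hypothetical three-cell domain (toy numbers, not Argo data).
T = [
    [0.5, 0.0, 0.0],
    [0.5, 0.5, 0.0],
    [0.0, 0.5, 1.0],
]
year_round = chance_of_year_round_sampling(T, start_cells=[0, 2], n_steps=2)
print(year_round)
```

In this toy example only the cell that retains its float has a high chance of continuous sampling, which illustrates why advective regions need a sustained density of floats to be observed in every season.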
f. SOSE comparison
Models, including the SOSE, have been used to predict Lagrangian trajectories for both operational (Talley et al. 2019) and scientific (Tamsitt et al. 2017) applications. However, models generally do not reproduce Argo float dispersion well, even when the Lagrangian particles simulate the full Argo 10-day cycle (Talley et al. 2019). As a validation for both the SOSE model and the transition matrices, we compared transition matrices derived from SOSE model-based trajectories with the Argo-derived transition matrices.
Transition matrices composed of 1° × 2° and 4° × 6° grid cells of latitude and longitude with a 180-day time step recreated the upper and lower limits of available grid resolution for this region. Across both resolution levels, the zonal mean transitions in the ACC were greater in the SOSE-derived matrices. The 1° × 2° matrix comparison had a 1 cm s⁻¹ increase in the ACC, while the 4° × 6° comparison had a 0.5 cm s⁻¹ increase. The second moment was also compared [Eq. (5), Fig. 19] and shows that SOSE consistently underrepresents ACC Lagrangian diffusion in the high-resolution case, with a mean difference of −6.8 × 10⁻¹ ± 6.2 × 10⁻¹ cm² s⁻², and resolves ACC diffusion well in the low-resolution case, with a mean difference of 0.00 ± 5.8 × 10⁻¹ cm² s⁻².
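To make the comparison of first and second moments concrete, the following sketch computes the mean displacement and the spread about that mean for one column of a transition matrix and converts the mean to a speed. The definitions are a plausible one-dimensional reading and are not taken from Eqs. (4) and (5); the function names, the cell centers, and both toy matrices are hypothetical.

```python
SECONDS_PER_DAY = 86400.0


def column_moments(T, centers_km, j):
    """Mean displacement (km) and variance (km^2) of transitions out of cell j.

    T[i][j] is assumed to be the probability of moving from cell j to cell i,
    and centers_km[i] is the along-stream position of the center of cell i.
    """
    n = len(T)
    mean = sum(T[i][j] * (centers_km[i] - centers_km[j]) for i in range(n))
    var = sum(
        T[i][j] * (centers_km[i] - centers_km[j] - mean) ** 2 for i in range(n)
    )
    return mean, var


def km_to_cm_per_s(displacement_km, step_days):
    """Convert a displacement over one time step to a speed in cm/s."""
    return displacement_km * 1.0e5 / (step_days * SECONDS_PER_DAY)


# Toy matrices standing in for float-derived and model-derived transitions.
centers_km = [0.0, 100.0, 200.0]
float_like = [
    [0.5, 0.0, 0.0],
    [0.5, 0.5, 0.0],
    [0.0, 0.5, 1.0],
]
model_like = [
    [0.2, 0.0, 0.0],
    [0.8, 0.5, 0.0],
    [0.0, 0.5, 1.0],
]
float_mean, float_var = column_moments(float_like, centers_km, 0)
model_mean, model_var = column_moments(model_like, centers_km, 0)
speed_difference = km_to_cm_per_s(model_mean - float_mean, step_days=180)
variance_difference = model_var - float_var
print(speed_difference, variance_difference)
```

The toy numbers are chosen so that the model-like column has a larger mean displacement and a smaller spread, the same sign pattern as the high-resolution comparison described above; they are not results. Dividing the variance by the square of the time step would give a quantity with units of velocity squared.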
SOSE is an eddy-permitting model but does not have sufficient resolution to fully resolve high-latitude eddies. This analysis suggests that SOSE ACC kinetic energy, at high resolution, is concentrated in the mean flow and does not sufficiently cascade into smaller-scale features; this manifests in lower diffusivity. The low-resolution case seems to have the effect of smoothing these differences. From this analysis, scientific conclusions derived from SOSE Lagrangian particle statistics should only be considered accurate for coarse-resolution studies. Changes in parameterized diffusivities in the offline Lagrangian model Octopus could potentially address these problems, but these changes have not been studied. A higher-resolution 1/12° gridcell solution is now also available and may improve mesoscale statistics, but this has not been tested.
5. Conclusions
In this paper, we have explained, justified, and tested the construction of a transition matrix for Argo float location prediction, following the surface drifter work by van Sebille et al. (2012). Our work is in the broader context of BGC Argo global array design, and a companion paper will describe an optimal float deployment algorithm.
After quantifying a wide range of temporal and spatial biases and uncertainties, we have concluded that, given the available Argo trajectory data, the transition matrix constructed at 2° × 2° spatial resolution with a 90-day time step is optimal. This transition matrix is used for core Argo predictions, GO-SHIP deployment predictions, and array health products. A description of publicly available web applications and code repositories to predict future Argo float locations with the transition matrix can be found in appendix A.
We will update the transition matrix as more trajectory data are made available. The present transition matrix is available in appendix A. This transition matrix is a hybrid of Argos- and Iridium-enabled float trajectories.
We have shown a significant difference in transition matrices derived from floats equipped with different communication systems. We recommend that floats deployed in equatorial waters use Iridium communications. We additionally provide the separate Argos and Iridium transition matrices through appendix A for investigators who wish to make specific predictions based on communication type.
Finally, we compared the Argo transition matrices to transition matrices derived from a particle release experiment in the SOSE model. We found that the overall mean particle transition in the ACC was greater in the SOSE transition matrix, and particle diffusion was too low in the SOSE transition matrix at high resolution but consistent with the Argo-derived transition matrix at low resolution. We hypothesize that SOSE does not fully resolve the mesoscale eddy field.
The ever-growing Argo float dataset will continue to improve both the statistical accuracy and resolution of transition matrices. However, we find that the array is already of sufficient size for transition matrix construction, enabling significant insights into difficult questions that BGC Argo managers face now. BGC Argo floats offer new technology to answer questions of critical societal importance. We hope the transition matrix tools presented here will contribute to the ongoing community conversation regarding optimal array design.
Acknowledgments.
This work was supported by the SOCCOM project under NSF Award PLR-1425989, the Global Ocean Biogeochemical Array (GO-BGC) under NSF Award OCE-1946578, and the Hypernav project under NASA Award 80GSFC20C0101. Coauthors acknowledge NSF EarthCube Award 1928305. EvS was supported by the Netherlands Organization for Scientific Research (NWO), Earth and Life Sciences, through project OCENW.KLEIN.085. We thank Dr. Isa Rosso for providing SOSE Lagrangian particle trajectories. Prof. Donata Giglio (PI of the Argovis project) and Dr. William Mills (Argovis software engineer) at University of Colorado Boulder worked on including the product ARGONE (described in this paper) on the web app and database Argovis, including designing a new schema and API for the data and creating a demo notebook (https://github.com/argovis/demo_notebooks) to create Fig. 1b of the paper. Prof. Donata Giglio, Dr. William Mills, and Tyler Tucker were supported by NSF Awards 1928305 and 2026954. We also thank two anonymous reviewers for their insightful comments and suggestions.
Data availability statement.
The Argo Program (https://argo.ucsd.edu, http://argo.jcommops.org) is part of the Global Ocean Observing System. Data were collected and made publicly available by the International Global Ship-based Hydrographic Investigations Program (GO-SHIP; http://www.go-ship.org/) and the national programs that contribute to it. GO-SHIP cruise waypoints were collected from the CCHDO website (http://cchdo.ucsd.edu/). All code used in this publication is publicly available (https://github.com/Chamberpain/TransitionMatrix; Chamberlain 2023b).
APPENDIX A
Transition Matrix Web Applications and Repositories
Argo float location prediction is an observing system priority. Because of this, we worked with the Argovis team at University of Colorado Boulder to add the product ARGONE to the Argovis web app and database [https://github.com/argovis/demo_notebooks, recently upgraded from the app version described in Tucker et al. (2020)]. A demo notebook leveraging the new Argovis API to access the product ARGONE (described in this paper) and predict Argo float locations is available at https://github.com/argovis/demo_notebooks and the product will also be featured on the web app front end in the future.
The Argovis web app serves the statistical prediction of Argo float locations (using ARGONE) up to about 5 years in the future (the target float lifetime). Figure 1 shows an example of accessing ARGONE through Argovis.
The ARGONE GitHub repository (https://github.com/Chamberpain/ARGONE; Chamberlain 2023a) is publicly available and produces future probabilities of a float array. Figure A1 shows an example of these results.
APPENDIX B
Data Withholding Experiment
To test the data sensitivity of the recommended 2° × 2° grid cell at 90-day time step transition matrix, we performed a data withholding experiment. The data withholding experiment compared transition matrices created from subsets of the Argo trajectory dataset with the transition matrix made from the full Argo trajectory dataset. The subsets of the Argo trajectory database were generated by randomly withholding floats from the full Argo database. The dependence on data density was tested by increasing the number of withheld floats from 5% to 30% of the total number of Argo floats in 5% increments. The 30-, 60-, and 90-day transition matrices generated from these randomly generated subsets were compared to the original transition matrices 50 times at each data density. The difference in mean transition [Eq. (4)] between each data withheld matrix and the original transition matrix was calculated for every grid cell (Fig. B1).
Two distributions appear in the mean differences, one at 5%–15% withheld and one at 15%–30% withheld. These distributions have mean differences of 0.37 and 0.11 km, respectively, for the 90-day transition matrix. The small difference when 85%–95% of the data are retained (5%–15% withheld) suggests that the mean of the full transition matrix may not be significantly sensitive to new data. The standard deviation of the difference is inversely proportional to the data density and is larger than the mean difference for all cases considered.
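The withholding procedure described in this appendix can be sketched in a few lines of Python. This is a minimal illustration with synthetic trajectories, not the code used for Fig. B1: the one-dimensional domain, the random-walk floats, the function names, and the definition of mean transition used here are assumptions, and the repository cited in the data availability statement should be consulted for the actual implementation.

```python
import random


def build_transition_matrix(pairs, n_cells):
    """Column-stochastic matrix from (start_cell, end_cell) displacement pairs."""
    counts = [[0.0] * n_cells for _ in range(n_cells)]
    for start, end in pairs:
        counts[end][start] += 1.0
    for j in range(n_cells):
        total = sum(counts[i][j] for i in range(n_cells))
        if total > 0:
            for i in range(n_cells):
                counts[i][j] /= total
    return counts


def mean_transition(T, centers_km):
    """Mean displacement (km) out of each cell; None where a cell has no data."""
    n = len(T)
    means = []
    for j in range(n):
        if sum(T[i][j] for i in range(n)) == 0:
            means.append(None)
        else:
            means.append(
                sum(T[i][j] * (centers_km[i] - centers_km[j]) for i in range(n))
            )
    return means


def withholding_trial(floats, n_cells, centers_km, fraction, rng):
    """Absolute difference in mean transition after withholding whole floats."""
    ids = sorted(floats)
    withheld = set(rng.sample(ids, round(fraction * len(ids))))
    all_pairs = [p for f in ids for p in floats[f]]
    kept_pairs = [p for f in ids if f not in withheld for p in floats[f]]
    full = mean_transition(build_transition_matrix(all_pairs, n_cells), centers_km)
    sub = mean_transition(build_transition_matrix(kept_pairs, n_cells), centers_km)
    return [abs(a - b) for a, b in zip(full, sub) if a is not None and b is not None]


# Synthetic random-walk floats on a five-cell line (illustrative only).
rng = random.Random(0)
n_cells = 5
centers_km = [200.0 * i for i in range(n_cells)]
floats = {}
for float_id in range(20):
    cell = rng.randrange(n_cells)
    pairs = []
    for _ in range(10):
        nxt = min(n_cells - 1, max(0, cell + rng.choice([-1, 0, 1])))
        pairs.append((cell, nxt))
        cell = nxt
    floats[float_id] = pairs

# Repeat the trial 50 times at each withheld fraction from 5% to 30%.
results = {}
for percent in range(5, 35, 5):
    diffs = []
    for _ in range(50):
        diffs.extend(withholding_trial(floats, n_cells, centers_km, percent / 100.0, rng))
    results[percent] = sum(diffs) / len(diffs) if diffs else None
print(results)
```

Withholding entire floats rather than individual displacements mirrors the description above and keeps each trajectory intact; the synthetic numbers printed by this sketch have no bearing on the values reported in this appendix.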
REFERENCES
Abernathey, R., C. Bladwell, G. Froyland, and K. Sakellariou, 2022: Deep Lagrangian connectivity in the global ocean inferred from Argo floats. J. Phys. Oceanogr., 52, 951–963, https://doi.org/10.1175/JPO-D-21-0156.1.
Argo, 2022: Argo float data and metadata from Global Data Assembly Centre (Argo GDAC)—Snapshot of Argo GDAC of October 10st 2022. SEANOE, accessed 10 October 2022, https://doi.org/10.17882/42182#96550.
Boebel, O., J. Lutjeharms, C. Schmid, W. Zenk, T. Rossby, and C. Barron, 2003: The Cape Cauldron: A regime of turbulent inter-ocean exchange. Deep-Sea Res. II, 50, 57–86, https://doi.org/10.1016/S0967-0645(02)00379-X.
Chamberlain, P., 2023a: Chamberpain/Argone, version 1.0.0. Zenodo, https://doi.org/10.5281/zenodo.7623074.
Chamberlain, P., 2023b: Chamberpain/TransitionMatrix, version 1.0.0. Zenodo, https://doi.org/10.5281/zenodo.7623067.
Chamberlain, P., L. Talley, B. Cornuelle, M. Mazloff, and S. Gille, 2023: Optimizing the biogeochemical Argo float distribution. J. Atmos. Oceanic Technol., https://doi.org/10.1175/JTECH-D-22-0093.1.
Davis, R. E., 1991: Observing the general circulation with floats. Deep-Sea Res., 38A, (Suppl.), S531–S571, https://doi.org/10.1016/S0198-0149(12)80023-9.
Davis, R. E., L. A. Regier, J. Dufour, and D. C. Webb, 1992: The Autonomous Lagrangian Circulation Explorer (ALACE). J. Atmos. Oceanic Technol., 9, 264–285, https://doi.org/10.1175/1520-0426(1992)009<0264:TALCE>2.0.CO;2.
Drouin, K. L., M. S. Lozier, F. J. Beron-Vera, P. Miron, and M. J. Olascoaga, 2022: Surface pathways connecting the South and North Atlantic Oceans. Geophys. Res. Lett., 49, e2021GL096646, https://doi.org/10.1029/2021GL096646.
Ford, D., 2021: Assimilating synthetic biogeochemical-Argo and ocean colour observations into a global ocean model to inform observing system design. Biogeosciences, 18, 509–534, https://doi.org/10.5194/bg-18-509-2021.
Freeland, H. J., P. B. Rhines, and T. Rossby, 1975: Statistical observations of the trajectories of neutrally buoyant floats in the North Atlantic. J. Mar. Res., 33, 383–404.
Froyland, G., R. M. Stuart, and E. van Sebille, 2014: How well-connected is the surface of the global ocean? Chaos, 24, 033126, https://doi.org/10.1063/1.4892530.
Gille, S. T., and L. Romero, 2003: Statistical behavior of ALACE floats at the surface of the Southern Ocean. J. Atmos. Oceanic Technol., 20, 1633–1640, https://doi.org/10.1175/1520-0426(2003)020<1633:SBOAFA>2.0.CO;2.
Gray, A. R., and S. C. Riser, 2014: A global analysis of Sverdrup balance using absolute geostrophic velocities from Argo. J. Phys. Oceanogr., 44, 1213–1229, https://doi.org/10.1175/JPO-D-12-0206.1.
Gray, A. R., and Coauthors, 2018: Autonomous biogeochemical floats detect significant carbon dioxide outgassing in the high-latitude Southern Ocean. Geophys. Res. Lett., 45, 9049–9057, https://doi.org/10.1029/2018GL078013.
Johnson, K., and H. Claustre, Eds., 2016: The scientific rationale, design, and implementation plan for a biogeochemical-Argo float array. Argo Rep., 65 pp., https://archimer.ifremer.fr/doc/00355/46601/46508.pdf.
Johnson, K., and Coauthors, 2017: Biogeochemical sensor performance in the SOCCOM profiling float array. J. Geophys. Res. Oceans, 122, 6416–6436, https://doi.org/10.1002/2017JC012838.
Kamenkovich, I., A. Haza, A. R. Gray, C. O. Dufour, and Z. Garraffo, 2017: Observing system simulation experiments for an array of autonomous biogeochemical profiling floats in the Southern Ocean. J. Geophys. Res. Oceans, 122, 7595–7611, https://doi.org/10.1002/2017JC012819.
Lumpkin, R., L. Centurioni, and R. C. Perez, 2016: Fulfilling observing system implementation requirements with the global drifter array. J. Atmos. Oceanic Technol., 33, 685–695, https://doi.org/10.1175/JTECH-D-15-0255.1.
Majkut, J. D., B. R. Carter, T. L. Frölicher, C. O. Dufour, K. B. Rodgers, and J. L. Sarmiento, 2014: An observing system simulation for Southern Ocean carbon dioxide uptake. Philos. Trans. Roy. Soc., A372, 20130046, https://doi.org/10.1098/rsta.2013.0046.
Markov, A. A., 1906: Rasprostranenie zakona bol’shih chisel na velichiny, zavisyaschie drug ot druga. Izv. Fiz.-Mat. Obschestva Kazan. Univ., 15, 135–156.
Maximenko, N., J. Hafner, and P. Niiler, 2012: Pathways of marine debris derived from trajectories of Lagrangian drifters. Mar. Pollut. Bull., 65, 51–62, https://doi.org/10.1016/j.marpolbul.2011.04.016.
Mazloff, M. R., P. Heimbach, and C. Wunsch, 2010: An eddy-permitting Southern Ocean state estimate. J. Phys. Oceanogr., 40, 880–899, https://doi.org/10.1175/2009JPO4236.1.
McAdam, R., and E. van Sebille, 2018: Surface connectivity and interocean exchanges from drifter-based transition matrices. J. Geophys. Res. Oceans, 123, 514–532, https://doi.org/10.1002/2017JC013363.
Miron, P., F. J. Beron-Vera, M. J. Olascoaga, G. Froyland, P. Pérez-Brunius, and J. Sheinbaum, 2019: Lagrangian geography of the deep Gulf of Mexico. J. Phys. Oceanogr., 49, 269–290, https://doi.org/10.1175/JPO-D-18-0073.1.
Miron, P., F. J. Beron-Vera, and M. J. Olascoaga, 2022: Transition paths of North Atlantic Deep Water. J. Atmos. Oceanic Technol., 39, 959–971, https://doi.org/10.1175/JTECH-D-22-0022.1.
Morrison, A. K., T. L. Frölicher, and J. L. Sarmiento, 2015: Upwelling in the Southern Ocean. Phys. Today, 68, 27–32, https://doi.org/10.1063/PT.3.2654.
Ollitrault, M., and J.-P. Rannou, 2013: ANDRO: An Argo-based deep displacement dataset. J. Atmos. Oceanic Technol., 30, 759–788, https://doi.org/10.1175/JTECH-D-12-00073.1.
Roemmich, D., and Coauthors, 1998: On the design and implementation of Argo: An initial plan for a global array of profiling floats. International CLIVAR Project Office Rep., 32 pp.
Roemmich, D., and Coauthors, 2019: On the future of Argo: A global, full-depth, multi-disciplinary array. Front. Mar. Sci., 6, 439, https://doi.org/10.3389/fmars.2019.00439.
Sévellec, F., A. C. de Verdière, and M. Ollitrault, 2017: Evolution of intermediate water masses based on Argo float displacements. J. Phys. Oceanogr., 47, 1569–1586, https://doi.org/10.1175/JPO-D-16-0182.1.
Talley, L. D., and Coauthors, 2019: Southern Ocean biogeochemical float deployment strategy, with example from the Greenwich meridian line (GO-SHIP A12). J. Geophys. Res. Oceans, 124, 403–431, https://doi.org/10.1029/2018JC014059.
Tamsitt, V., and Coauthors, 2017: Spiraling pathways of global deep waters to the surface of the Southern Ocean. Nat. Commun., 8, 172, https://doi.org/10.1038/s41467-017-00197-0.
Tucker, T., D. Giglio, M. Scanderbeg, and S. S. P. Shen, 2020: Argovis: A web application for fast delivery, visualization, and analysis of Argo data. J. Atmos. Oceanic Technol., 37, 401–416, https://doi.org/10.1175/JTECH-D-19-0041.1.
van Sebille, E., P. J. Van Leeuwen, A. Biastoch, and W. P. de Ruijter, 2010: On the fast decay of Agulhas rings. J. Geophys. Res., 115, C03010, https://doi.org/10.1029/2009JC005585.
van Sebille, E., M. H. England, and G. Froyland, 2012: Origin, dynamics and evolution of ocean garbage patches from observed surface drifters. Environ. Res. Lett., 7, 044040, https://doi.org/10.1088/1748-9326/7/4/044040.
van Sebille, E., and Coauthors, 2018: Lagrangian ocean analysis: Fundamentals and practices. Ocean Modell., 121, 49–75, https://doi.org/10.1016/j.ocemod.2017.11.008.
Wong, A. P. S., and Coauthors, 2020: Argo data 1999–2019: Two million temperature-salinity profiles and subsurface velocity observations from a global array of profiling floats. Front. Mar. Sci., 7, 700, https://doi.org/10.3389/fmars.2020.00700.