Bathymetry-Aware Mesoscale Eddy Parameterizations across Upwelling Slope Fronts: A Machine Learning–Augmented Approach

Chenyue Xie (a), Huaiyu Wei (a), and Yan Wang (a, b)

a Department of Ocean Science, The Hong Kong University of Science and Technology, Hong Kong, China
b Center for Ocean Research in Hong Kong and Macau, The Hong Kong University of Science and Technology, Hong Kong, China

Yan Wang ORCID: https://orcid.org/0000-0001-8064-2908
Open access

Abstract

Mesoscale eddy buoyancy fluxes across continental slopes profoundly modulate the boundary current dynamics and shelf–ocean exchanges but have yet to be appropriately parameterized via the Gent–McWilliams (GM) scheme in predictive ocean models. In this work, we test the prognostic performance of multiple GM variants in noneddying simulations of upwelling slope fronts that are commonly found along the subtropical continental margins. The tested GM variants range from a set of constant eddy buoyancy diffusivities to recently developed energetically constrained, bathymetry-aware diffusivities, whose implementation is augmented by an artificial neural network (ANN) that predicts the mesoscale eddy energy online from the topographic and mean flow quantities. In addition, an ANN is employed to parameterize the cross-slope eddy momentum flux (EMF) that maintains a barotropic flow field analogous to that in an eddy-resolving model. Our tests reveal that noneddying simulations employing the bathymetry-aware forms of the Rhines scale–based scheme and GEOMETRIC scheme most accurately reproduce the heat contents and along-slope baroclinic transports of the eddy-resolving simulations. Further analyses reveal a degree of physical consistency in the ANN-inferred eddy energy, which tends to grow (decay) as isopycnal slopes are steepened (flattened), and in the parameterized EMF, which shapes the flow baroclinicity with the correct strength if a bathymetry-aware GM variant is jointly used. These findings provide a recipe of GM variants for use in noneddying simulations with continental slopes and highlight the potential of machine learning techniques to augment physics-based mesoscale eddy parameterization schemes.

Significance Statement

This study evaluates the predictive skill of parameterization schemes of water mass transports induced by ocean mesoscale eddies across continental slopes. Correctly parameterizing these transports in noneddying ocean models (e.g., ocean climate models) is crucial for predicting the ocean circulation and shelf–ocean exchanges. This work highlights the importance of bathymetric effects on eddy transports, as parameterization schemes that account for the influence of a sloping seafloor outperform those developed specifically for a flat-bottomed ocean. This work also highlights the efficacy of machine learning techniques to augment physics-based mesoscale eddy parameterization schemes, for instance, by estimating the mesoscale eddy energy online to realize energy-dependent parameterization schemes in noneddying simulations.

© 2023 American Meteorological Society. This published article is licensed under the terms of the default AMS reuse license. For information regarding reuse of this content and general copyright information, consult the AMS Copyright Policy (www.ametsoc.org/PUBSReuseLicenses).

Chenyue Xie and Huaiyu Wei contributed equally.

Corresponding author: Yan Wang, yanwang@ust.hk

1. Introduction

Mesoscale eddies play a central role in the oceanic transport and mixing processes (Lee et al. 2007; Gruber et al. 2011; Abernathey et al. 2013; Gnanadesikan et al. 2015; Busecke and Abernathey 2019; Jones and Abernathey 2019), and interact strongly with the large-scale ocean circulation (Hallberg and Gnanadesikan 2006; Waterman et al. 2011; Jansen and Held 2014; Vallis 2017; Jansen et al. 2019; Holmes et al. 2022). Because of their relatively small size (10–100 km), these eddies cannot be adequately resolved in predictive ocean climate models. Most ocean climate models are still adopting a lateral grid spacing of ∼1° to accommodate large-ensemble runs and lengthy integrations between centuries and millennia (e.g., Farneti et al. 2015; Mak et al. 2022). While an increasing number of global ocean models are adopting the so-called “eddy-permitting” horizontal grid resolutions (∼1/4°), these models can only resolve the ocean mesoscale in the tropical open ocean (Hallberg 2013; Griffies et al. 2015; Yankovsky et al. 2022), where the first Rossby deformation radii are the largest (Chelton et al. 1998). Consequently, there remains a pressing need to parameterize ocean mesoscale eddies in today’s ocean climate models.

Among the various mesoscale eddy processes capable of shaping the large-scale ocean circulation and tracer budgets, eddy buoyancy fluxes, which rearrange fluid parcels by flattening isopycnal surfaces adiabatically and thus extracting the large-scale potential energy, are routinely parameterized in ocean climate models via the Gent and McWilliams (1990) parameterization scheme (GM scheme hereafter). This scheme specifies an eddy buoyancy diffusivity to regulate the parameterized eddy buoyancy fluxes based on the large-scale buoyancy gradients resolved in ocean climate models. To date, substantial efforts have been made to improve the efficacy of the GM scheme by refining the parameter dependency of the eddy buoyancy diffusivity using theoretical tools (e.g., Ferrari et al. 2010; Marshall et al. 2012) and eddy-resolving simulations (e.g., Jansen et al. 2015; Bachman et al. 2017; Wei et al. 2022).

Established understanding about the mesoscale energy cycle has motivated recent works to couple the GM scheme with a subgrid-scale mesoscale eddy kinetic energy (EKE) budget (Eden and Greatbatch 2008; Cessi 2008; Marshall and Adcroft 2010; Jansen et al. 2015, 2019; Kong and Jansen 2021), which provides an eddy velocity scale that can be combined with a prescribed eddy length scale to formulate the eddy buoyancy diffusivity based upon the mixing length theory (Prandtl 1925). In parallel, the eddy total (potential plus kinetic) energy was shown to serve as a mathematical upper bound for the norm of the Eliassen–Palm eddy stress tensor in quasigeostrophic (QG) turbulence, from which the GEOMETRIC framework of eddy parameterization was developed (Marshall et al. 2012). Implementation of the GEOMETRIC framework for parameterizing eddy buoyancy fluxes relies on the combination of the subgrid-scale eddy total energy with the Eady (1949) time scale subject to a nondimensional prefactor, whose magnitude is bounded by unity (Mak et al. 2017, 2018). Employing these energetically constrained GM schemes led to substantially improved fidelity of coarse-resolution ocean models in forecasting the large-scale ocean stratification and circulation in response to the changing atmospheric forcing, as exemplified by studies of the Southern Ocean (Mak et al. 2017, 2018; Jansen et al. 2019; Kong and Jansen 2021).

Ideally, an energetically constrained GM scheme would maintain its efficacy in regions where mesoscale eddies are vigorous but the Rossby deformation radii become small. Such regions are found primarily over continental margins or at high latitudes (Hallberg 2013; LaCasce and Groeskamp 2020). Notably, key processes that drive the global ocean circulation, such as the deep convection in the Nordic seas and the onshore heat transport near the Antarctic margin, are mediated by mesoscale eddies across high-latitude continental slopes (Spall 2004, 2010; Thompson et al. 2018). In the subtropics, mesoscale eddies are found to regulate the marine ecosystems by transporting the upwelled nutrients offshore across the eastern boundary upwelling systems (Gruber et al. 2011; Nagai et al. 2015), a process yet to be appropriately represented in noneddying ocean simulations (Moscoso et al. 2021). Given that the upcoming generation of ocean climate models will represent coastal and shelf seas (Holt et al. 2017), it is imperative to accurately parameterize the eddy-induced transports across continental slopes in noneddying ocean simulations.

In line with the need for parameterizing the cross-slope eddy-induced transports, Wang and Stewart (2020) proposed several scalings for the eddy buoyancy diffusivity across continental slopes under upwelling-favorable winds. These scalings were directly adapted from the aforementioned GM variants of the EKE-based mixing length theory (Jansen et al. 2015) and the GEOMETRIC framework (Marshall et al. 2012) by retaining the key parameter dependencies of these GM variants but recasting the otherwise constant scaling prefactors as analytical functions of a slope parameter, defined as the ratio between the topographic slope and the depth-averaged isopycnal slope (Isachsen 2011). These scalings were validated against a suite of eddy-resolving simulations of continental slope flows driven by upwelling-favorable winds and shown to reproduce the depth-averaged eddy buoyancy diffusivities diagnosed from these simulations.

To convert the scalings of Wang and Stewart (2020) into actual parameterizations, two major challenges must be addressed. First, the mesoscale eddy energy, which enters the formulation of these scalings, must be parameterized. The aforementioned subgrid-scale eddy energy budgets (Mak et al. 2018; Jansen et al. 2019) have been constructed based on open ocean eddy properties. Yet for upwelling slope fronts, Wang and Stewart (2018) identified strong onshore EKE fluxes in the upper ocean and topographically rectified flows associated with the conversion of EKE into the large-scale potential energy. Both processes are yet to be well constrained by existing subgrid-scale eddy energy budgets. Second, cross-slope eddy fluxes of along-slope momentum were found to displace the along-slope currents toward the open ocean in simulations of upwelling slope fronts (Wang and Stewart 2018; Manucharyan and Isachsen 2019). If the cross-slope eddy momentum fluxes (EMF hereafter) are neglected, then even with accurately parameterized eddy buoyancy fluxes, which are typically suppressed over steep slopes (Isachsen 2011), a noneddying simulation may produce overly tilted isopycnals and thus inaccurate baroclinicity of the large-scale flow. The conventional treatment of mesoscale EMF using a gridscale viscosity (e.g., Ferreira et al. 2005; Ferreira and Marshall 2006) is unlikely to be useful over steep slopes, as the momentum fluxes were found to be directed offshore in upwelling slope fronts, accelerating the onshore flanks but decelerating the offshore flanks of along-slope currents (Manucharyan and Isachsen 2019), and to make a negligible contribution to the energy conversion between the meso- and large-scale flows (Wang and Stewart 2018).

To circumvent the challenge of implementing an eddy energy budget across steep continental slopes, this study employs a data-driven approach to aid in the numerical implementation and prognostic testing of energetically constrained scalings for cross-slope eddy buoyancy diffusivity. Data-driven approaches have been increasingly exploited to develop subgrid-scale models of turbulence (Papale and Valentini 2003; Ling et al. 2016; Xiao et al. 2016; Duraisamy et al. 2019), oceanic processes (Bolton and Zanna 2019; Zanna and Bolton 2020; Salehipour and Peltier 2019; Guan et al. 2022), and atmospheric processes (e.g., Rasp et al. 2018; Yuval and O’Gorman 2020). These approaches have also shown great potential to augment the information extracted from spatially and/or temporally confined ocean observational records, for instance, by inferring the subsurface flows and eddy fluxes based on sea surface flow information (Manucharyan et al. 2021; George et al. 2021).

In this work, we specifically employ the fully connected artificial neural network (ANN; Gamahara and Hattori 2017; Xie et al. 2020, 2021; Maulik et al. 2021) to construct the nonlinear relationship of the eddy quantities of interest (i.e., mesoscale eddy energy and cross-slope momentum fluxes) with the mean flow and topographic quantities using the output dataset of the eddy-resolving simulations of Wang and Stewart (2018). Similar ideas were proposed by Partee et al. (2022), who adopted machine learning approaches to reconstruct the surface mesoscale eddy kinetic energy produced by an eddy-resolving global ocean model. These ANN-learnt eddy quantities are then used to formulate the scalings of the cross-slope eddy buoyancy diffusivity proposed by Wang and Stewart (2020) and to parameterize the eddy Reynolds stress due to the cross-slope EMF. Upon implementation into noneddying ocean simulations, the trained neural networks facilitate the realization of a “hybrid” eddy parameterization scheme that aims to reproduce the baroclinicity of the large-scale slope flow field as in the eddy-resolving simulations. Our proposed scheme is hybrid in that it integrates machine learning techniques into physics-based scalings of eddy buoyancy fluxes, and it therefore supplements existing parameterization schemes that use subgrid-scale eddy energy budgets to formulate eddy buoyancy diffusivities (Eden and Greatbatch 2008; Marshall and Adcroft 2010; Mak et al. 2017, 2018, 2022; Jansen et al. 2019; Kong and Jansen 2021).

The rest of this article is organized as follows. Section 2 documents the model setup employed in this work. In section 3, we review the theoretical background for the GM-based eddy parameterization schemes and describe the approaches to implementing different GM variants into our prognostic simulations. In section 4, we quantify the predictive skill and online characteristics of selected GM variants, analyze the physical consistency of the ANN-inferred eddy energy, and assess the influence of ANN-based mesoscale eddy Reynolds stress on the predicted slope flow baroclinicity. In section 5, we discuss the utility and limitations of our proposed eddy parameterization approaches. Concluding remarks follow in section 6.

2. Model setup

In this section, we overview the model setup of the three-dimensional (3D) eddy-resolving simulations of continental slope flows (see also Wang and Stewart 2018, 2020), whose output dataset facilitated the development of the scalings for eddy buoyancy fluxes across upwelling slope fronts and will be used for training the artificial neural networks in this work. Table 1 summarizes the values of physical parameters adopted by our reference model run.

Table 1.

List of parameters used in the reference continental slope model experiment. Italics indicate parameters that are independently varied among different model experiments.

a. Model domain and grid spacing

All simulations in this work build upon the MIT general circulation model (MITgcm hereafter; Marshall et al. 1997), which integrates the Boussinesq hydrostatic momentum equations coupled with a linear equation of state that depends on potential temperature only.

The reference 3D run is constructed with an f-plane (with a Coriolis frequency of f0 = 1 × 10−4 s−1), zonally symmetric channel, which includes a continental shelf and slope located on the southern half of the model domain (Fig. 1a). However, the shelf and slope can be viewed as being orientated toward any direction in the absence of the planetary vorticity gradient. More precisely, a hyperbolic tangent function is used to prescribe the model bathymetry,
$$H(y) = -Z_s - \frac{1}{2} H_s \tanh\left(\frac{y - Y_s}{W_s}\right), \tag{1}$$
where y ∈ [0, Ly] denotes the latitude (offshore distance), Zs = 2250 m represents the slope middepth, Hs = 3500 m measures the shelf height, and Ws = 50 km stands for the slope half-width. The model domain covers Ly = 500 km in the cross-slope direction and Lx = 800 km in the along-slope direction. As in Wang and Stewart (2018), the terms “zonal” and “along-slope” will be used interchangeably throughout this work, and similarly for the terms “meridional” with “cross-slope.”
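As a quick check on the bathymetry parameters, the sketch below evaluates the profile in Python, assuming the form H(y) = −Zs − (Hs/2) tanh[(y − Ys)/Ws] with the midslope position Ys = 200 km quoted later in the text. The function names are illustrative and are not taken from the model code.

```python
import math

# Parameters quoted in the text, in metres.
Z_S = 2250.0   # slope middepth
H_S = 3500.0   # shelf height
W_S = 50.0e3   # slope half-width
Y_S = 200.0e3  # midslope position
L_Y = 500.0e3  # cross-slope domain width


def bathymetry(y):
    """Seafloor elevation H(y) in metres (negative below the sea surface)."""
    return -Z_S - 0.5 * H_S * math.tanh((y - Y_S) / W_S)


def bathymetry_slope(y):
    """Analytical derivative dH/dy (dimensionless) of the tanh profile."""
    return -0.5 * H_S / W_S / math.cosh((y - Y_S) / W_S) ** 2


for y_km in (0, 150, 200, 250, 500):
    y = y_km * 1.0e3
    print(f"y = {y_km:3d} km: H = {bathymetry(y):8.1f} m, "
          f"H_y = {bathymetry_slope(y):+.4f}")
```

Under these assumptions the shelf sits near 500 m depth, the open ocean near 4000 m, and the steepest bottom slope, Hs/(2Ws) = 0.035, occurs at midslope.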
Fig. 1.

Illustration of the reference model configuration [see also Fig. 1 of Wei and Wang (2021)]. (a) A snapshot of sea surface potential temperature θ|z=0 (colors), the instantaneous location of the isotherm θ = 1°C (blue sheet), and the bathymetry (gray sheet). (b) Logarithm of eddy kinetic energy per unit mass (colors) superposed by mean alongshore velocity profiles (solid contours with a selected interval of 0.1 m s−1). Dashed black lines indicate the latitudes, y = 150 and 250 km, at which the shelf/slope and slope/open ocean are delineated. The northern sponge layer is shadowed and indicated by text. Blue curve in the upper panel indicates the surface wind stress profile across the model domain, with the negative sign corresponding to the upwelling-favorable wind direction.

Citation: Journal of Physical Oceanography 53, 12; 10.1175/JPO-D-23-0017.1

The horizontal model grid resolution is 2 km. In the vertical, 70 geopotential levels with grid spacing ranging from 10 m at the sea surface to over 100 m at the seafloor are used. To improve the representation of near-bottom flows over the shelf and slope, partial grid cells are enabled with a minimum nondimensional fraction of 0.1 (Griffies et al. 2000).

b. Forcing and boundary conditions

A steady zonal wind stress is imposed at the sea surface, with the profile prescribed as
$$\tau = -\tau_0 \sin^2\left(\frac{\pi y}{L_\tau}\right), \quad 0 < y < L_\tau, \tag{2}$$
where τ0 = 0.05 N m−2 denotes the maximum wind stress located at the midslope position y = Ys = 200 km [the negative sign in Eq. (2) corresponds to a westward or upwelling-favorable wind direction], and Lτ = 400 km measures the width of wind stress in the cross-slope direction. A quadratic bottom drag with a coefficient of Cd = 2.5 × 10−3 is imposed at the seafloor, which allows for momentum and energy extraction.
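The wind forcing can be sketched in a few lines of Python, assuming the profile τ = −τ0 sin²(πy/Lτ) inside 0 < y < Lτ and zero stress elsewhere. The zero-stress extension beyond Lτ is an assumption made here for illustration.

```python
import math

TAU_0 = 0.05     # peak wind stress magnitude (N m^-2), from the text
L_TAU = 400.0e3  # cross-slope width of the wind stress (m), from the text


def wind_stress(y):
    """Along-slope wind stress (N m^-2); negative means upwelling-favorable."""
    if 0.0 < y < L_TAU:
        return -TAU_0 * math.sin(math.pi * y / L_TAU) ** 2
    # Assumption: no wind stress outside the forced band.
    return 0.0


for y_km in (0, 100, 200, 300, 400, 450):
    print(f"y = {y_km:3d} km: tau = {wind_stress(y_km * 1.0e3):+.4f} N m^-2")
```

The stress peaks in magnitude at y = Lτ/2 = 200 km, which coincides with the midslope position.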
Periodic boundary conditions are used in the along-slope direction. The northern and southern lateral boundaries are constrained by no-normal-flow conditions. To maintain the vertical stratification in the deep open ocean, the potential temperature is restored toward a reference exponential profile across a 50-km-wide sponge layer at the northern boundary, with a minimum relaxation time scale of 7 days. This yields a first baroclinic Rossby deformation radius,
$$L_d = \frac{1}{\pi f_0} \int_{-|H|}^{0} N \, dz, \tag{3}$$
of approximately 18 km in the deep open ocean with N denoting the buoyancy frequency.
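To illustrate how the deformation radius follows from the stratification, the Python sketch below integrates N over depth with the trapezoidal rule, assuming Ld = (1/(π f0)) ∫ N dz from z = −|H| to 0. The uniform buoyancy frequency used in the example (1.4 × 10−3 s−1) is a hypothetical value chosen only for illustration; it is not the exponential reference profile used in the simulations.

```python
import math

F0 = 1.0e-4  # Coriolis frequency (s^-1), from the text


def deformation_radius(n_profile, depth, n_levels=400):
    """Trapezoidal estimate of L_d = (1 / (pi f0)) * int_{-|H|}^{0} N dz."""
    dz = depth / n_levels
    z = [-depth + k * dz for k in range(n_levels + 1)]
    n = [n_profile(zk) for zk in z]
    integral = sum(0.5 * (n[k] + n[k + 1]) * dz for k in range(n_levels))
    return integral / (math.pi * F0)


def uniform_n(z, n0=1.4e-3):
    """Hypothetical depth-independent buoyancy frequency (s^-1)."""
    return n0


ld = deformation_radius(uniform_n, depth=4000.0)
print(f"L_d = {ld / 1.0e3:.1f} km")
```

Any callable returning N(z) can be passed in place of `uniform_n`, for instance a profile derived from a temperature restoring curve.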

c. Advection scheme and subgrid parameterizations

The momentum equations are solved via a finite volume formulation in the vector-invariant form (Vallis 2017), and the potential temperature is advected using the second-order-moment scheme of Prather (1986) to minimize the spurious mixing resulting from numerical truncation errors (Hill et al. 2012). A horizontal biharmonic viscosity with the Courant–Friedrichs–Lewy number of 0.1 is employed to maintain the numerical stability (Stewart and Thompson 2015). Last, a large implicit vertical diffusivity (100 m2 s−1) is imposed to parameterize convective instabilities.

d. Parameter perturbation experiments and noneddying simulations

Following Wang and Stewart (2018), our analyses account for a suite of parameter perturbation experiments, as listed in Table 2, which provide ample output data to train the neural networks and cover a broad range of parameter space for assessing the predictive skill of different GM variants. Each of these parameter perturbation experiments is constructed with one key physical parameter (i.e., the peak wind strength, the thermal expansion coefficient, or the slope half-width) “perturbed” relative to that of the reference model run.

Table 2.

Physical parameters varied among the continental slope model experiments. Italic values of parameters deviate from their reference values. For parameter definitions, the reader is referred to Table 1.

To test the scalings of Wang and Stewart (2020) prognostically, we further construct a suite of two-dimensional (2D) noneddying simulations. These 2D simulations inherit the configurations of the 3D runs (Tables 1 and 2) but use only one grid cell in the along-slope direction, which shuts off all mesoscale processes (Vallis 2017) and eases the comparison with the 3D eddy-resolving simulations. Adopting a coarsened lateral grid spacing of 16 km yields results quantitatively consistent with those documented below (not shown). In addition, the minimum fraction of the partial grid cell in these 2D runs is adjusted to 0.3 to avoid excessively thin vertical grid cells that are prone to numerical instabilities.

In section 5c, we further construct a set of supplementary, flat-bottomed simulations by replacing the sloping bathymetry in the reference model setup with a range of planetary vorticity gradients. These flat-bottomed simulations are divided into the 3D eddy-resolving and 2D noneddying categories as for the continental slope simulations, and are used to test the utility of our proposed and existing parameterization approaches across an even broader range of physical parameter space.

Similar paired 2D and 3D runs have been constructed to examine parameterization schemes for eddy buoyancy fluxes (Visbeck et al. 1997), eddy potential vorticity (PV) fluxes (Wardle and Marshall 2000), and eddy momentum fluxes (Eden 2010), and to investigate the submesoscale eddy effects under sea ice leads (Cohanim et al. 2021).

e. Model integration

For the 3D model setup, each simulation was spun up with a coarse 4-km resolution from a resting state for 35 model years to arrive at a statistically steady state, as verified from the time series of domain-integrated total kinetic energy and heat content. The 4-km model solutions were then interpolated onto a 2-km grid and rerun for another 20 years until a dynamic equilibrium is reestablished. Daily outputs from the final 5 model years are adopted for analyses throughout this work.

For the 2D model setup, each simulation was spun up with a constant eddy buoyancy diffusivity (70 m2 s−1) but no parameterization of EMF for 120 years, after which the constant diffusivity was substituted by one of our selected GM variants (cf. section 3a) and rerun (with or without the parameterized EMF) for another 80 years to reestablish dynamic equilibrium. Spinning up these simulations with our selected GM variants (Table 3) from a resting state yielded negligible changes in the final equilibrated model solutions (not shown). Since the flow in a 2D simulation eventually evolved into a fully steady state after the spinup phase, the model outputs of the final time step are used for analyses in this work.

Table 3.

List of the GM variants tested in this study. The constant eddy buoyancy diffusivity is given by Eq. (5); the Visbeck et al. (1997) GM variant is defined by Eq. (6); the mixing length theory-based GM variant is defined by Eq. (7); the GEOMETRIC GM variant is defined by Eq. (9); the slope-aware mixing length theory-based GM variant is defined by Eq. (12); the slope-aware GEOMETRIC GM variant is defined in Eq. (13).

3. Theoretical background and parameterizations

a. Eddy buoyancy diffusivity and GM variants

This work focuses primarily on the prognostic performance of the GM scheme that builds upon a downgradient diffusive closure to parameterize the depth-averaged, cross-slope eddy buoyancy fluxes, expressed as (Wang and Stewart 2020; Kong and Jansen 2021),
$$F_b = -K_b \langle \bar{b}_y \rangle, \tag{4}$$
where b = −g(ρ − ρ0)/ρ0 represents the buoyancy (ρ0 is the Boussinesq reference density), Kb denotes the eddy buoyancy diffusivity, ⟨·⟩ is the depth-average operator, and the overbar stands for an ensemble-mean operator, which was taken as the time and zonal mean in the 3D eddy-resolving simulations (Wang and Stewart 2020) but should be viewed as a zonal-mean operator for the 2D noneddying simulations (e.g., Wardle and Marshall 2000). Parameterization of Fb in a noneddying simulation then hinges upon the parameterization of the eddy buoyancy diffusivity Kb. Following previous assessments of GM-based parameterization schemes (Visbeck et al. 1997; Mak et al. 2017, 2018; Jansen et al. 2019; Kong and Jansen 2021), we consider a set of GM variants characterized by increasingly sophisticated parameter dependencies.
The simplest GM variant in our test is a set of constant eddy buoyancy diffusivities,
$$K_{\mathrm{const}} \in [30, 50, 70, 90, 110, 130, 200]\ \mathrm{m^2\,s^{-1}}. \tag{5}$$
The chosen magnitudes of Kconst are smaller than those tested in previous works [e.g., 1500 m2 s−1 in Mak et al. (2018)], and are motivated by the diagnosed depth-averaged eddy buoyancy diffusivity in the 3D reference simulation, whose value ranges across [3, 26] m2 s−1 over the continental slope (y ∈ [150, 250] km) and grows to ∼294 m2 s−1 at y = 350 km (cf. black dots in Fig. 7). The weakened eddy buoyancy diffusivity across steep slopes agrees with previously reported suppression effects of a sloping ocean bed on baroclinic eddy buoyancy fluxes (e.g., Isachsen 2011; Chen and Kamenkovich 2013; Manucharyan and Isachsen 2019).
Our tested GM variants also include the scheme of Visbeck et al. (1997), who formulated the eddy buoyancy diffusivity as
$$K_{\mathrm{visb}} = \alpha_{\mathrm{visb}}\, l^2\, \frac{\langle \bar{b}_y \rangle}{\langle N \rangle}, \tag{6}$$
where αvisb = 0.015 denotes a nondimensional constant prefactor, and l is defined as the width of the baroclinic zone, taken empirically to be 30 km in our 2D simulations. The chosen width of the baroclinic zone yields a similar magnitude of Kvisb to the set of constant eddy buoyancy diffusivities Kconst (e.g., the meridionally averaged value of Kvisb is diagnosed as ∼68 m2 s−1 in the 3D reference experiment). Notice that Kvisb depends only on the large-scale flow quantities resolved in noneddying simulations and can thus be tested prognostically without any additional closure for the mesoscale eddy energy.
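A back-of-the-envelope Python sketch of the Visbeck-type diffusivity is given below, assuming it is the product of αvisb, l², and the ratio of the cross-slope buoyancy gradient to the buoyancy frequency. The input ratio of 5 × 10−6 s−1 is a hypothetical value, chosen only because it lands near the ∼68 m2 s−1 quoted above.

```python
ALPHA_VISB = 0.015       # nondimensional prefactor, from the text
L_BAROCLINIC = 30.0e3    # width of the baroclinic zone (m), from the text


def k_visbeck(by_over_n):
    """Visbeck-type diffusivity (m^2 s^-1) from the ratio b_y / N (s^-1)."""
    return ALPHA_VISB * L_BAROCLINIC ** 2 * by_over_n


# Hypothetical depth-averaged ratio of buoyancy gradient to buoyancy frequency.
by_over_n_example = 5.0e-6
print(f"K_visb = {k_visbeck(by_over_n_example):.1f} m^2 s^-1")
```

Because αvisb l² is fixed at 1.35 × 10^7 m², the diffusivity in this form responds only to changes in the resolved stratification and lateral buoyancy gradient.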

Two more sophisticated GM variants than the Visbeck et al. (1997) scheme follow from the EKE-based mixing length theory (Jansen et al. 2015) and the GEOMETRIC framework (Marshall et al. 2012) (cf. section 1). Both parameterizations, alongside their bathymetry-aware forms described below, depend on the mesoscale eddy energy solved prognostically in the 2D noneddying (e.g., Mak et al. 2017) or 3D coarse-grid (e.g., Jansen et al. 2019; Kong and Jansen 2021; Mak et al. 2018, 2022) ocean models.

The mixing length theory-based GM variant tested by Jansen et al. (2015) can be expressed as (see also Jansen et al. 2019)
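$$K_{\mathrm{MLT}} = \alpha_{\mathrm{mlt}} \sqrt{2\langle \mathrm{EKE} \rangle}\, L_{\mathrm{Rh}}, \tag{7a}$$
$$L_{\mathrm{Rh}} = \sqrt{\left(2\langle \mathrm{EKE} \rangle\right)^{1/2} / \beta_t}, \tag{7b}$$
$$\beta_t = \left| \frac{f_0}{H} H_y \right|, \tag{7c}$$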
where αmlt denotes a nondimensional tunable parameter of O(0.1), and LRh is the topographic Rhines scale quantified by the depth-averaged EKE per unit mass and the topographic PV gradient βt. Two representative values of αmlt are tested in this work: αmlt = 0.12 as in the prognostic simulations of Jansen et al. (2019) and Kong and Jansen (2021) (both works accounted for coarse-grid idealized simulations with sloping ocean bottoms), and αmlt = 0.33 based on the eddy-resolving model diagnostics of Jansen et al. (2015) and Wang and Stewart (2020).
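The mixing length closure can be sketched in Python as follows, assuming an eddy velocity scale (2 EKE)^(1/2), a topographic Rhines scale LRh = [(2 EKE)^(1/2)/βt]^(1/2), and βt = |f0 Hy/H|. The depth-averaged EKE of 10−3 m2 s−2 in the example is a hypothetical input; the midslope depth and bottom slope follow from the bathymetry parameters in the text.

```python
import math

F0 = 1.0e-4  # Coriolis frequency (s^-1), from the text


def topographic_beta(depth, bottom_slope):
    """Topographic PV gradient beta_t = |f0 * H_y / H| (m^-1 s^-1)."""
    return abs(F0 * bottom_slope / depth)


def rhines_scale(eke, beta_t):
    """Topographic Rhines scale (m) from depth-averaged EKE (m^2 s^-2)."""
    return math.sqrt(math.sqrt(2.0 * eke) / beta_t)


def k_mlt(eke, beta_t, alpha_mlt):
    """Mixing-length diffusivity: prefactor * eddy velocity * Rhines scale."""
    return alpha_mlt * math.sqrt(2.0 * eke) * rhines_scale(eke, beta_t)


beta_mid = topographic_beta(depth=2250.0, bottom_slope=-0.035)
eke_example = 1.0e-3  # hypothetical depth-averaged EKE (m^2 s^-2)
for alpha in (0.12, 0.33):
    k = k_mlt(eke_example, beta_mid, alpha)
    print(f"alpha_mlt = {alpha:.2f}: K_MLT = {k:.1f} m^2 s^-1")
```

In this form the diffusivity scales as EKE^(3/4) and as βt^(−1/2), so a steeper slope shortens the mixing length even before any slope-dependent prefactor is applied.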
In the 3D eddy-resolving simulations, the EKE per unit mass is calculated as (Wang and Stewart 2018)
$$\mathrm{EKE} = \frac{1}{2}\left(\overline{u'^2} + \overline{\upsilon'^2}\right), \tag{8}$$
where the prime denotes the deviation from the time and zonal mean (e.g., u′ = u − ū), and u and υ are the along-slope and cross-slope velocity components, respectively. The EKE defined in (8) constitutes a key eddy quantity to be learnt by the ANN.
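The EKE diagnostic amounts to half the summed variance of the two velocity components about their time and zonal mean. A minimal Python sketch is given below, assuming the samples at a fixed cross-slope position and depth have been flattened over time and the along-slope direction into plain lists; the velocity values are made up for illustration.

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def eddy_kinetic_energy(u, v):
    """EKE per unit mass: 0.5 * (mean(u'^2) + mean(v'^2))."""
    u_bar, v_bar = mean(u), mean(v)
    u_var = mean([(x - u_bar) ** 2 for x in u])
    v_var = mean([(x - v_bar) ** 2 for x in v])
    return 0.5 * (u_var + v_var)


# Hypothetical velocity samples (m s^-1) over time and the along-slope axis.
u_samples = [0.1, 0.3, 0.2, 0.2]
v_samples = [0.0, 0.1, -0.1, 0.0]
print(f"EKE = {eddy_kinetic_energy(u_samples, v_samples):.4f} m^2 s^-2")
```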
The GEOMETRIC GM variant takes the form
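$$K_{\mathrm{GEOM}} = \alpha_{\mathrm{geom}} \frac{\sqrt{\langle N^2 \rangle}}{\left| \langle \bar{b}_y \rangle \right|} \langle E \rangle, \tag{9}$$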
where αgeom denotes a nondimensional prefactor with its magnitude bounded by unity, and E stands for the total (kinetic plus potential) eddy energy per unit mass. Two specific values of αgeom are examined in this work: αgeom = 0.04 as used in the coarse-grid idealized simulation of the Southern Ocean by Mak et al. (2018) and in the coarse-grid global ocean simulation of Mak et al. (2022) (this value of αgeom is also close to that in the Southern Ocean as diagnosed from an eddy-permitting global ocean model; see Poulsen et al. 2019), and αgeom = 0.08 based on the eddy-resolving model diagnostics of Wang and Stewart (2020).
Bachman et al. (2017) and Wang and Stewart (2020) found that the skill of KGEOM in quantifying the eddy buoyancy fluxes across a flat-bottomed ocean or a continental slope region is insensitive to the eddy energy type employed. That is, the total eddy energy E in (9) can be readily replaced by the EKE subject to a constant of proportionality. Based on the eddy-resolving model diagnostics of Wang and Stewart (2020, see their Figs. 4c,d), we set
$$E \approx 3\,\mathrm{EKE} \tag{10}$$
to ease the training of the neural network.
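The GEOMETRIC closure with the EKE-based energy substitution can be sketched as below, assuming the diffusivity is αgeom times the ratio of the buoyancy frequency to the magnitude of the cross-slope buoyancy gradient (an Eady-type time scale), times the total eddy energy taken as three times the EKE. The stratification, buoyancy gradient, and EKE in the example are hypothetical inputs.

```python
import math


def k_geometric(eke, n2, by, alpha_geom, energy_ratio=3.0):
    """GEOMETRIC-type diffusivity (m^2 s^-1).

    eke: depth-averaged EKE (m^2 s^-2); n2: squared buoyancy frequency (s^-2);
    by: cross-slope buoyancy gradient (s^-2); energy_ratio: E / EKE.
    """
    total_energy = energy_ratio * eke
    return alpha_geom * math.sqrt(n2) / abs(by) * total_energy


# Hypothetical depth-averaged inputs.
eke_example = 1.0e-3   # m^2 s^-2
n2_example = 4.0e-6    # s^-2
by_example = 8.0e-9    # s^-2
for alpha in (0.04, 0.08):
    k = k_geometric(eke_example, n2_example, by_example, alpha)
    print(f"alpha_geom = {alpha:.2f}: K_GEOM = {k:.1f} m^2 s^-1")
```

Unlike the mixing length form, this expression is linear in the eddy energy and contains no explicit topographic length scale.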
To adapt KMLT and KGEOM to continental slopes under upwelling-favorable winds, Wang and Stewart (2020) proposed to integrate the topographic impact on eddy buoyancy fluxes into the prefactors of both GM variants via analytical functions of a slope parameter (cf. section 1), calculated as
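$$\delta=\left|H_y/S_{\mathrm{iso}}^{z}\right|, \tag{11a}$$
$$S_{\mathrm{iso}}^{z}=-\langle\overline{b}_y\rangle/\langle N^{2}\rangle, \tag{11b}$$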
where Sisoz denotes the depth-averaged isopycnal slope, and the absolute sign in (11a) corresponds to the case of an upwelling slope front featured by isopycnals tilted in the same direction as the seafloor.
The mixing length theory-based GM variant is recast as
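$$K_{\mathrm{MLT}}^{\mathrm{slope}}=\gamma_{\mathrm{mlt}}F_{\mathrm{mlt}}(\delta)\sqrt{2\,\mathrm{EKE}}\,L_{\mathrm{Rh}}, \tag{12a}$$
$$F_{\mathrm{mlt}}(\delta)=\frac{\delta+1}{\delta+\Gamma_m}, \tag{12b}$$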
where γmlt = 3.3 × 10⁻³ and Γm = 0.01 are two empirically derived nondimensional constants. In the case of a flat bottom (i.e., δ = 0), the formulation of the scaling (12) reduces to that of (7), with the prefactor coefficient fixed at 0.33 (i.e., the selected upper bound of αmlt in our prognostic simulations).
In parallel, the GEOMETRIC GM variant is reformulated as
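$$K_{\mathrm{GEOM}}^{\mathrm{slope}}=\gamma_{\mathrm{geom}}F_{\mathrm{geom}}(\delta)\frac{\sqrt{\langle N^{2}\rangle}}{|\langle\overline{b}_y\rangle|}E, \tag{13a}$$
$$F_{\mathrm{geom}}(\delta)=\frac{\Psi\tanh(\Gamma_g\delta)+1}{\delta+\Gamma_g}, \tag{13b}$$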
where γgeom = 8.0 × 10⁻³, Ψ = 1.35, and Γg = 0.10 are three nondimensional constants derived from the 3D MITgcm eddy-resolving simulations (Wang and Stewart 2020). In the limit of a flat-bottomed ocean (i.e., δ = 0), the formulation of the scaling relation (13) reduces to that of (9), with the prefactor coefficient fixed at 0.08 (i.e., the selected upper bound of αgeom in our prognostic simulations).
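The flat-bottom limits quoted above can be checked numerically. The Python sketch below assumes that each slope function is a single fraction, i.e., Fmlt(δ) = (δ + 1)/(δ + Γm) and Fgeom(δ) = [Ψ tanh(Γgδ) + 1]/(δ + Γg); this reading, like the function names, is an assumption made for illustration. Both limits at δ = 0 depend only on the ratios γmlt/Γm and γgeom/Γg.

```python
import math

# Constants quoted in the text (Wang and Stewart 2020)
GAMMA_MLT, GAMMA_M = 3.3e-3, 0.01
GAMMA_GEOM, PSI, GAMMA_G = 8.0e-3, 1.35, 0.10


def slope_parameter(h_y, s_iso):
    """delta = |H_y / S_iso|: bottom slope over depth-averaged isopycnal slope."""
    return abs(h_y / s_iso)


def prefactor_mlt(delta):
    """Effective mixing-length prefactor, assuming F = (delta + 1)/(delta + Gamma_m)."""
    return GAMMA_MLT * (delta + 1.0) / (delta + GAMMA_M)


def prefactor_geom(delta):
    """Effective GEOMETRIC prefactor, assuming F = (Psi*tanh(Gamma_g*delta) + 1)/(delta + Gamma_g)."""
    return GAMMA_GEOM * (PSI * math.tanh(GAMMA_G * delta) + 1.0) / (delta + GAMMA_G)


flat_mlt = prefactor_mlt(0.0)    # flat-bottom limit of the mixing-length prefactor
flat_geom = prefactor_geom(0.0)  # flat-bottom limit of the GEOMETRIC prefactor
```

Under these assumed forms, both prefactors decrease monotonically as δ grows, so a steeper seafloor relative to the isopycnals yields a smaller effective prefactor than the flat-bottom values of 0.33 and 0.08.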

Table 3 summarizes the formulations of all GM variants considered in this study, among which KGEOMslope and KMLTslope are specifically termed “bathymetry-aware” or “slope-aware” GM variants.

b. ANN-augmented eddy closures

To convert the aforementioned energetically constrained GM variants, (7), (9), (12), and (13), into numerically implementable eddy closures, the EKE must be parameterized in noneddying simulations. To this end, we train an ANN to learn the EKE [Eq. (8)] from the output dataset of the 3D eddy-resolving runs.

To account for the full eddy stresses exerted on the simulated flow, an additional ANN is trained to learn the cross-slope EMF per unit mass (Bolton and Zanna 2019), calculated as
$$\mathrm{EMF}=\overline{u'\upsilon'}, \tag{14}$$
in our 3D eddy-resolving simulations.
The structure of the ANN is presented in Fig. 2 and is largely similar to that employed by Xie et al. (2023). The network constructs the nonlinear relationship between an eddy-related quantity of interest (i.e., EKE or EMF in this work) and selected mean flow and topographic quantities. The neurons of the input layer are assigned to the selected mean flow and topographic quantities, and the final output should converge toward the eddy-related quantity of interest. Within the neural network, inputs $X_i^{l-1}$ from layer l − 1 arrive at layer l, which then constructs outputs $X_i^{l}$ via a combination of linear and nonlinear operators (Maulik and San 2017; Xie et al. 2021). More precisely, the transfer function linking the ANN layers follows
$$X_i^{l}=\sigma\left(s_i^{l}+b_i^{l}\right), \tag{15a}$$
$$s_i^{l}=\sum_j W_{ij}^{l}X_j^{l-1}, \tag{15b}$$
where σ denotes the activation function, $W_{ij}^{l}$ represents the weight parameter, and $b_i^{l}$ stands for the bias parameter. Throughout the training procedure, the ANN updates $W_{ij}^{l}$ and $b_i^{l}$ until the targeted eddy-related quantity can be accurately approximated by the final output.
Fig. 2. Schematic diagram of the structure of the ANN.

Citation: Journal of Physical Oceanography 53, 12; 10.1175/JPO-D-23-0017.1

In this work, two ANNs are trained to learn the EKE and EMF separately. This is motivated by the distinct characteristics of both quantities: EKE must be positive definite, whereas EMF may not. Each neural network includes an input layer, two hidden layers, and an output layer, with the corresponding neuron numbers set as (M: 64: 32: 1). Here M represents the neuron numbers of the input layer, determined by the number of input variables. The leaky rectified linear unit (ReLU) activation function,
$$\sigma(a)=\begin{cases}a, & \text{if } a>0,\\ 0.2a, & \text{if } a\le 0,\end{cases} \tag{16}$$
is used to activate the two hidden layers, with a denoting the functional argument. The linear operation, σ(a) = a, is used to activate the output layer.
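The layer-to-layer transfer described above can be sketched in a few lines of plain Python. This is an illustrative forward pass only: the random initial weights, the placeholder input count M = 6, and the helper names (`dense`, `init_layer`, `forward`) are assumptions and do not correspond to the trained networks or to the MITgcm implementation.

```python
import random


def leaky_relu(a, slope=0.2):
    """Leaky ReLU with the negative-side slope of 0.2 quoted in the text."""
    return a if a > 0 else slope * a


def dense(x, weights, biases, activation):
    """One layer: X_i = activation(sum_j W_ij * x_j + b_i)."""
    return [activation(sum(w * xj for w, xj in zip(row, x)) + b)
            for row, b in zip(weights, biases)]


def init_layer(n_out, n_in, rng):
    """Hypothetical initialization: small random weights, zero biases."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def forward(x, layers):
    """Hidden layers use leaky ReLU; the output layer is linear."""
    *hidden, output = layers
    for weights, biases in hidden:
        x = dense(x, weights, biases, leaky_relu)
    weights, biases = output
    return dense(x, weights, biases, lambda a: a)


rng = random.Random(0)
M = 6  # placeholder number of input variables (assumption)
sizes = [M, 64, 32, 1]  # the (M: 64: 32: 1) architecture
layers = [init_layer(n_out, n_in, rng) for n_in, n_out in zip(sizes[:-1], sizes[1:])]
prediction = forward([0.5] * M, layers)  # list holding the single output neuron
```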

During the training of the ANN, the loss function is defined as the mean-squared error of the ANN output relative to the eddy quantity of interest diagnosed from the 3D simulations across the depth/latitude plane. The loss function is constrained further by the L2 regularization (the regularization strength is set to 10⁻⁴) (Hoerl and Kennard 1970), which introduces a penalty term and thus limits the squared norm of the ANN weight parameters to avoid overfitting (Gamahara and Hattori 2017; Duraisamy et al. 2019).
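A minimal sketch of such a regularized loss is given below. The exact form of the penalty (e.g., whether it carries a factor of 1/2 or is applied as weight decay in the optimizer) is not specified in the text, so the plain sum of squared weights used here, along with the function name, is an assumption.

```python
def regularized_mse(predictions, targets, weight_matrices, strength=1e-4):
    """Mean-squared error plus an L2 penalty on the network weights.

    weight_matrices: list of 2D lists (one per layer); biases are not
    penalized in this sketch (an assumption).
    """
    n = len(targets)
    mse = sum((p - t) ** 2 for p, t in zip(predictions, targets)) / n
    # Squared norm of all weights across all layers
    penalty = sum(w * w for matrix in weight_matrices for row in matrix for w in row)
    return mse + strength * penalty


# Illustrative numbers: MSE = 2.0, squared weight norm = 5.0
loss = regularized_mse([1.0, 2.0], [1.0, 4.0], [[[1.0, -2.0]]])
```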

To accurately model the eddy-related quantities of interest (i.e., the EKE and EMF), properly chosen mean flow and topographic quantities are essential as inputs to the neurons of the input layer. In this work, these quantities are chosen based on our understanding of mesoscale turbulent features over continental slopes. Specifically, mesoscale turbulence can derive its energy from the lateral and vertical shears of the along-slope flow. These shears are calculated as $\partial\overline{u}/\partial y$ and $\partial\overline{u}/\partial z$ in our simulations. The mean flow shears have also been widely used to reconstruct the Reynolds stress in turbulence models via machine learning techniques (Duraisamy et al. 2019). The vertical shear of the along-slope flow is directly linked to the cross-slope buoyancy gradient via the thermal wind balance, which, together with the vertical buoyancy gradient, controls the depth-averaged isopycnal slope $S_{\mathrm{iso}}^{z}$ at each latitude. The depth-averaged isopycnal slope is a useful metric for the flow baroclinicity and is thus assigned as one input variable. A sloping seafloor can suppress mesoscale eddy fluxes (e.g., Isachsen 2011; Chen and Kamenkovich 2013; Manucharyan and Isachsen 2019) and decouple surface eddy processes from those at depth (e.g., LaCasce 2017; LaCasce and Groeskamp 2020), and should therefore be accounted for by the neural network. Here we pick the topographic PV gradient βt as the metric for the seafloor steepness. To capture the vertical variations of EKE and EMF, we further define a normalized height, $|(H_{\max}-|z|)/H_{\max}|$, as an input quantity, where Hmax = 4000 m is the maximum ocean depth in our model domain (Table 1). Last, the first Rossby deformation radius Ld is selected as an input variable, as it indicates the length scale at which baroclinically unstable waves are excited (Vallis 2017) and the shelf-to-ocean transition of the nonlinear eddy length scale (Stewart and Thompson 2016).

To maintain the robustness of the ANN training procedure, the training targets of the ANN are cast as the square roots of the magnitudes of EKE and EMF per unit mass (with the sign of EMF retained), which are closer to O(1) than EKE and EMF themselves. Moreover, the input and output variables are empirically nondimensionalized (Ling and Templeton 2015; Xiao et al. 2016; Maulik and San 2017; Wang et al. 2017; Vollant et al. 2017; Xie et al. 2020, 2021) by a reference length scale $L_{\mathrm{Ref}}=(1/L_y)\int_0^{L_y}L_d(y)\,dy$, a reference velocity scale $U_{\mathrm{Ref}}=(1/L_y)\int_0^{L_y}|U_{\mathrm{tw}}(y)|\,dy$ (here $U_{\mathrm{tw}}=\int_{-|H|}^{0}\partial_y\overline{b}/f_0\,dz$ denotes the thermal wind velocity), the Coriolis frequency f0, and a reference buoyancy frequency N0 = 10⁻³ s⁻¹. In Table 4, we summarize the selected input and output quantities and their nondimensionalized counterparts. In appendix A, we present the input variables and targeted eddy-related quantities (EKE and EMF) as functions of depth and offshore distance in the reference experiment, and examine the utility of linear models in predicting the eddy-related quantities.
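The signed square-root transform of the training targets can be sketched as follows. Scaling the result by the reference velocity URef is an assumption about how Table 4 normalizes the outputs, and the function names and numerical values are illustrative only.

```python
import math


def encode_target(value, u_ref):
    """Signed square root of a flux-like quantity (m^2 s^-2), scaled by u_ref (m/s).

    Returns sgn(value) * sqrt(|value|) / u_ref, which is nondimensional.
    """
    return math.copysign(math.sqrt(abs(value)), value) / u_ref


def decode_target(encoded, u_ref):
    """Invert encode_target: recover the dimensional quantity with its sign."""
    dimensional = encoded * u_ref
    return math.copysign(dimensional ** 2, dimensional)


u_ref = 0.2                           # hypothetical reference velocity (m/s)
emf = -4.0e-4                         # hypothetical EMF value (m^2 s^-2)
encoded = encode_target(emf, u_ref)   # negative, with magnitude of order 0.1
recovered = decode_target(encoded, u_ref)
```

The same transform applies to the EKE, for which the sign is always positive.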

Table 4. The raw (first row) and normalized (second row) forms of input and output variables for the ANN. The normalization is made via the empirically selected length scale LRef, velocity scale URef, and buoyancy frequency N0; sgn() represents the signum function.

In this work, the time- and zonal-mean solutions of the 3D MITgcm simulations serve as the dataset to formulate the input and output variables of the ANN. Following Bolton and Zanna (2019), we select a subset of 3D simulations (i.e., Ref., 0.5τ0, 2τ0, 0.5Ws, 1.5Ws, 2Ws) for providing the training dataset, of which 70% is used for directly training the neural network and 30% is used for validation. The rest of the 3D simulations (i.e., 1.5τ0, 0.66Ws, 0.5αθ, 1.5αθ, 2αθ) will be used to examine the generality of the ANN in inferring the EKE or EMF across upwelling slope fronts whose physical parameters deviate from those in the training dataset. As in Xie et al. (2021, 2023), the ANN learning rate and batch size are set to 10⁻³ and 10³, respectively, for updating $W_{ij}^{l}$ and $b_i^{l}$. Last, early stopping is enabled (i.e., the training procedure would exit once the validation error ceases to drop for 50 epochs) for constraining the network training via the Adam algorithm (Kingma and Ba 2014; Maulik et al. 2021). The training and testing losses of the EKE and EMF exhibit analogous converging behavior and correlate closely after 200 global iterations (not shown). Therefore, the ANN learning procedure can be deemed reasonable.
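The 70%/30% split and the early-stopping rule can be sketched as below. The random shuffling, the helper names, and the synthetic loss history are assumptions made for illustration; the text does not state how samples were assigned to the two subsets.

```python
import random


def split_dataset(samples, train_fraction=0.7, seed=0):
    """Shuffle and split samples into training and validation subsets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def early_stopping_epoch(validation_losses, patience=50):
    """Return (stop_epoch, best_epoch) for a sequence of validation losses.

    Training stops once the validation loss has not improved for
    `patience` consecutive epochs.
    """
    best_loss = float("inf")
    best_epoch = -1
    for epoch, loss in enumerate(validation_losses):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            return epoch, best_epoch
    return len(validation_losses) - 1, best_epoch


train, validation = split_dataset(range(10))
# Synthetic history: loss falls for 10 epochs, then plateaus at a higher value
history = [1.0 / (e + 1) for e in range(10)] + [0.2] * 100
stop_epoch, best_epoch = early_stopping_epoch(history)
```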

c. Implementation of ANN-augmented eddy closures

With the ANN for inferring the EKE from the large-scale flow and topographic quantities readied, the energetically constrained GM variants are implemented in MITgcm (source code checkpoint 68i) by taking the following coding steps:

  1. Within the GMREDI package of MITgcm, modify the initialization subroutine (gmredi_readparms.F) to read in the neural network [i.e., $W_{ij}^{l}$ and $b_i^{l}$ following (15)], which is saved as arrays upon completion of the training procedure.

  2. Within the GMREDI package, code up a new subroutine (gmredi_calc_ml.F) that calculates the selected input variables (Table 4) and infers the EKE in the 2D runs at each model time step.

  3. Within the subroutine of gmredi_calc_ml.F, formulate the GM variants, (7), (9), (12), and (13), using the ANN-inferred EKE and the MITgcm-simulated quantities in 2D runs (Table 3) at each model time step.

  4. Call the subroutine of gmredi_calc_ml.F within gmredi_calc_tensor.F to formulate the eddy diffusivity tensor (e.g., Griffies 1998) associated with the energetically constrained GM variants, which then informs the parameterized eddy fluxes incorporated into the prognostic equation of potential temperature in MITgcm (note that the potential temperature in our simulations is equivalent to the buoyancy).

To ensure numerical stability, all prognostically calculated GM variants are capped at 1000 m² s⁻¹. In addition, the “slope clipping” tapering scheme of Cox (1987) is enabled with the maximum effective slope set to 0.1.

To incorporate the effect of mesoscale eddy Reynolds stress on the simulated along-slope flow, we further implement the neural network for inferring the EMF following three major steps:

  1. Read in the neural network for inferring the EMF via the subroutine of gmredi_readparms.F, and infer the EMF within the subroutine of gmredi_calc_ml.F (note that this neural network employs identical input quantities as that for inferring the EKE) at each model time step. The ANN-inferred EMF is then stored as a global variable in MITgcm.

  2. Modify the subroutine of calc_eddy_stress.F to calculate the depth-averaged divergence of EMF (EMF divergence hereafter, diagnosed as $-\partial_y\overline{u'\upsilon'}$ in the 3D runs; the negative sign here emphasizes the flux divergence of westward momentum), and call calc_eddy_stress.F within the subroutine of dynamics.F at each model time step.

  3. Within the subroutine of dynamics.F, incorporate the depth-averaged EMF divergence into the prognostic momentum equation of MITgcm as an external along-slope body forcing, realized via the subroutines of timestep.F and apply_forcing.F.

Since both the input quantities of the ANN (Table 4) and the predicted eddy Reynolds stress involve calculating the spatial gradients of flow quantities, from which large numerical errors can arise, we set the EMF divergence to be identically zero within two wet grid points of the seafloor. We further replace any extreme value of the EMF divergence (i.e., with parameterized $|\partial_y\overline{u'\upsilon'}|>10^{-7}$ m s⁻², which can destabilize the model run) with the value on its nearest shoreward grid point at each time step. To impose the global zonal momentum constraint (Bretherton 1966; Wardle and Marshall 2000; Marshall et al. 2012), we subtract the domain-average of the EMF divergence from the prognostic momentum equation at each grid point and each time step (see also Eden 2010). Additional tests revealed that removing this global momentum constraint did not destabilize or qualitatively alter the solutions of our 2D simulations in the reference experiment (not shown).
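Two of these safeguards, the replacement of extreme values and the removal of the domain average, can be sketched on a one-dimensional cross-slope profile. The uniform grid, the shore-to-offshore index ordering, the zeroing of an extreme value at the shoreward-most point, and the function name are all assumptions made for this illustration; the sketch is not the Fortran code used in MITgcm.

```python
def condition_emf_divergence(divergence, limit=1e-7):
    """Clean a cross-slope profile of EMF divergence (m s^-2).

    divergence: values ordered from the shore (index 0) toward the open ocean
    on a uniform grid (assumed).
    """
    cleaned = list(divergence)
    # The shoreward-most point has no shoreward neighbour: zero it if extreme (assumption)
    if abs(cleaned[0]) > limit:
        cleaned[0] = 0.0
    # Replace each extreme value with its (already cleaned) shoreward neighbour
    for j in range(1, len(cleaned)):
        if abs(cleaned[j]) > limit:
            cleaned[j] = cleaned[j - 1]
    # Remove the domain average so that the forcing adds no net momentum
    mean = sum(cleaned) / len(cleaned)
    return [value - mean for value in cleaned]


profile = [1e-8, 5e-7, -2e-8, 3e-8]  # illustrative values; the second one is extreme
forcing = condition_emf_divergence(profile)
```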

We stress that both the eddy momentum forcing and eddy buoyancy fluxes parameterized in our 2D runs are set to be depth-independent for simulations presented in this work, which agrees with previous studies of eddy parameterizations (e.g., Mak et al. 2018; Bachman 2019). Concerns about parameterizing vertically varying eddy stresses across upwelling slope fronts arise from (i) the finding that eddy buoyancy fluxes can be dominantly upgradient near the seafloor (Wang and Stewart 2020), which, upon implementation into a noneddying simulation, would cause numerical instabilities, and (ii) the lack of a theoretical basis upon which the vertical structures of mesoscale eddy stresses over sloping topography can be prescribed. In section 6, we further discuss the efficacy of existing approaches to parameterizing vertically varying eddy transports (Danabasoglu and Marshall 2007; Ferrari et al. 2010) in our experiments.

4. Results

a. Joint implementations of the GM closure and ANN-based EMF

Prior to quantifying the impact of the selected GM variants on the simulated flow, we briefly discuss the necessity for jointly parameterizing the cross-slope eddy buoyancy and momentum fluxes in our 2D simulations.

Figures 3a–c present the mean along-slope velocity $\overline{u}$ produced by the 3D simulation, a 2D simulation implemented with a constant diffusivity of $K_{\mathrm{const}}^{70}=70$ m² s⁻¹ but no EMF, and a 2D simulation incorporating both $K_{\mathrm{const}}^{70}$ and parameterized EMF with the reference model setup (superscripts of Kconst denote the specific values of eddy buoyancy diffusivity). This simple test case reveals the role played by the parameterized EMF divergence in shaping the full-depth along-slope velocity, characterized by an offshore displacement of the westward flow (cf. Figs. 3b,c). Consequently, the near-bottom flow over the steep slope diminishes and attains a minimum westward velocity of ∼0.03 m s⁻¹ in Fig. 3c, a state close to the emergence of the near-bottom eastward undercurrent shown in Fig. 3a. By contrast, the 2D simulation lacking parameterized EMF produces a relatively strong westward flow near the sloping bottom, with its velocity magnitude exceeding 0.1 m s⁻¹.

Fig. 3. Mean alongshore velocity as a function of depth and offshore distance in the reference experiment simulated by (a) the 3D eddy-resolving simulation, (b) the 2D noneddying simulation implemented with $K_{\mathrm{const}}^{70}$ but no EMF, and (c) the noneddying simulation implemented with $K_{\mathrm{const}}^{70}$ and ANN-inferred EMF. Solid gray contours indicate selected isopleths of $\overline{u}\in[0,0.5]$ m s⁻¹ with an interval of 0.1 m s⁻¹ (the isopleth of $\overline{u}=0$ m s⁻¹ is highlighted by the bold contour). The northern sponge layer is shaded. (d)–(f) The cross-slope profiles of the surface along-slope velocity $\overline{u}_{\mathrm{surf}}$ (blue curves), depth-averaged alongshore velocity $\overline{u}_{\mathrm{zavg}}$ (black curves), and the bottom alongshore velocity $\overline{u}_{\mathrm{bot}}$ (yellow curves) corresponding to the flow fields shown in (a)–(c), respectively. Dashed black lines indicate the latitudes, y = 150 and 250 km, at which the shelf/slope and slope/open ocean are delineated.


Figures 3d–f present the surface velocity $\overline{u}_{\mathrm{surf}}=\overline{u}(z=0)$ (blue curve), bottom velocity $\overline{u}_{\mathrm{bot}}=\overline{u}(z=-|H|)$ (yellow curve), and depth-averaged velocity $\overline{u}_{\mathrm{zavg}}=\langle\overline{u}\rangle$ (black curve) as functions of offshore distance in these simulations. In both the 2D run incorporating EMF and the 3D run, the maxima of $|\overline{u}_{\mathrm{surf}}|$ (highlighted by red circles) are located in the open ocean region (y ∈ [284, 292] km), reaching 0.47 and 0.52 m s⁻¹, respectively, whereas $|\overline{u}_{\mathrm{surf}}|$ peaks at 0.56 m s⁻¹ over the continental slope (y = 243 km) in the 2D run without EMF. In the absence of parameterized EMF, a 2D simulation produces nearly meridionally symmetric profiles of $\overline{u}_{\mathrm{bot}}$ and $\overline{u}_{\mathrm{zavg}}$ centered at the midslope location (Fig. 3b), resembling the wind stress profile defined by Eq. (2). However, by parameterizing the eddy momentum forcing, the 2D run produces two barotropic jets, as manifested by the number of local maxima of $\overline{u}_{\mathrm{bot}}$ and $\overline{u}_{\mathrm{zavg}}$, similar to the jet flow structures in the 3D run (cf. Figs. 3d and 3f). Analogous results for cases of the slope-aware GM variants KMLTslope and KGEOMslope are documented in appendix B.

To maintain a comparable barotropic flow field to that in the eddy-resolving model, the parameterization of EMF is included by default in subsequent analyses, the focus of which is on the predictive skill of selected GM variants. We note in passing that adding the parameterized EMF (even with artificially tuned magnitudes) but turning off the GM closure made our 2D simulations numerically unstable (not shown). In section 4d, we discuss in detail the impact of the parameterized EMF on the flow baroclinicity.

b. Impact of GM variants on the simulated flow

The flow baroclinicity resulting from the GM variants listed in Table 3 is highlighted in Fig. 4, which shows selected isopycnals upon flow equilibria in the 2D (red dashed lines) and 3D (black solid lines) reference simulations.

Fig. 4. Selected mean isotherms (starting from 1°C with an interval of 2°C) in the reference experiment simulated by the 3D eddy-resolving simulation (black solid contours) and 2D simulations (red dashed contours) forced by (a) Kconst70, (b) Kvisb, (c) KMLT, (d) KGEOM, (e) KMLTslope, and (f) KGEOMslope. Red shades in (c) and (d) indicate the range of isopycnal depths resulting from the parameterized eddy diffusivity with the upper and lower bounds of constant prefactors defined in section 3a. ANN-inferred EMF has been enabled in all noneddying simulations. The northern sponge layer is shaded.


For simulations incorporating Kconst70 (Fig. 4a) and Kvisb (Fig. 4b), the predicted isopycnals are flatter shoreward but steeper seaward of the latitude y ≃ 320 km than those in the 3D run. In parallel, in simulations implemented with KMLT (Fig. 4c) and KGEOM (Fig. 4d) the isopycnals are overly flattened across the continental slope and open ocean regions (red shades indicate the range of isopycnal depths resulting from the parameterized diffusivity with the upper and lower bounds of constant prefactors defined in section 3a). Last, the simulations prescribed with the slope-aware forms of the mixing length-based and GEOMETRIC GM variants, KMLTslope (Fig. 4e) and KGEOMslope (Fig. 4f), can most accurately predict the flow stratification, as manifested by the approximate collocations of isopycnals in the 2D run to those in the 3D run across the model domain.

To quantitatively measure the impact of selected GM variants on the simulated flow stratification, we define the bulk relative error of a mean flow quantity,
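$$\mathrm{Err}(\overline{\bullet})\equiv\frac{\{|\overline{\bullet}_{\mathrm{2D}}-\overline{\bullet}_{\mathrm{3D}}|\}}{\{|\overline{\bullet}_{\mathrm{3D}}|\}}, \tag{17}$$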
where {•} denotes a volume-mean operator across the model domain outside the sponge layer (i.e., y ∈ [0, 450] km), and the subscript 2D (3D) represents the quantity in the 2D (3D) simulation.
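The bulk relative error defined above can be sketched as a ratio of two volume-weighted means. The flattened list layout, the use of cell volumes as weights, and the argument names are assumptions for illustration.

```python
def bulk_relative_error(noneddying, eddy_resolving, cell_volumes):
    """Volume-mean absolute difference divided by the volume-mean magnitude.

    noneddying, eddy_resolving: mean-flow values on the same grid cells
    (sponge-layer cells excluded beforehand); cell_volumes: matching weights.
    """
    total = sum(cell_volumes)
    numerator = sum(abs(a - b) * vol
                    for a, b, vol in zip(noneddying, eddy_resolving, cell_volumes)) / total
    denominator = sum(abs(b) * vol
                      for b, vol in zip(eddy_resolving, cell_volumes)) / total
    return numerator / denominator


# Illustrative two-cell example (potential temperature in degrees C)
error = bulk_relative_error([2.0, 4.0], [2.5, 4.0], [1.0, 3.0])
```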

The calculated bulk relative errors of mean potential temperature in the reference experiment with different GM variants are plotted using gray bars in Fig. 5. Simulations incorporating KMLT and KGEOM (superscripts of KMLT and KGEOM denote the corresponding values of prefactors) produce large relative errors, ranging between 34.5% and 48.5%, consistent with the overly flattened isopycnals shown in Figs. 4c and 4d. For simulations integrating Kconst70 and Kvisb into the GM scheme, the relative errors decrease to 11.3% and 16.7%, respectively. By integrating topographic modifications, the GM variants KMLTslope and KGEOMslope enable a 2D simulation to accurately predict the domain-wide heat content relative to the 3D run, with $\mathrm{Err}(\overline{\theta})$ dropping below 7.0%. In addition to the above calculations, Fig. 5 also shows that a constant eddy buoyancy diffusivity deviating further from 70 m² s⁻¹ produces a greater relative error of $\overline{\theta}$ in the reference experiment.

Fig. 5. Bulk relative errors of mean potential temperature $\overline{\theta}$ produced by 2D noneddying simulations forced by selected GM variants (see labels on the abscissa) against that produced by the 3D eddy-resolving simulations in the reference experiment. ANN-inferred EMF has been enabled in all noneddying simulations.


To examine the skill of selected GM variants on the simulated flow across a broad range of physical parameter space, we plot in Figs. 6a and 6b the domain averages (across y ∈ [0, 450] km) of potential temperature $\{\overline{\theta}\}$ and baroclinic velocity $\{\overline{u}_{\mathrm{BC}}\}\equiv\{\langle\overline{u}\rangle-\overline{u}(z=-|H|)\}$, respectively, for all parameter perturbation experiments listed in Table 2. This is equivalent to quantifying the domain-integrated heat contents and baroclinic along-slope transports for each group of parameter perturbation experiments. Additional comparisons of the bulk relative errors of $\overline{\theta}$ and $\overline{u}_{\mathrm{BC}}$ for all parameter perturbation experiments are presented in appendix C.

Fig. 6. The volume-mean (a) potential temperature $\{\overline{\theta}\}$ and (b) baroclinic velocity $\{\overline{u}_{\mathrm{BC}}\}$ produced by the eddy-resolving simulations (black dots) against those by the noneddying simulations forced by Kconst70 (red diamonds), Kvisb (gray squares), KMLT0.33 (orange triangles), KGEOM0.08 (blue triangles), KMLT0.12 (orange circles), KGEOM0.04 (blue circles), KMLTslope (green stars), and KGEOMslope (purple stars) in all parameter perturbation experiments (see labels on the abscissa and Table 2). Gray shading highlights simulations with the reference model setup. ANN-inferred EMF has been enabled in all noneddying simulations.


For the reference experiment (highlighted via light gray shades in Fig. 6), both metrics, $\{\overline{\theta}\}$ and $\{\overline{u}_{\mathrm{BC}}\}$, in the 2D simulations prescribed with Kconst70 (red diamond), Kvisb (gray square), KMLTslope (green star), and KGEOMslope (purple star) resemble those in the 3D run. This results from our tuning of Kconst and Kvisb to match the domain-averaged eddy buoyancy diffusivity diagnosed from the 3D run (cf. section 3a) and agrees with the closeness of bulk relative errors of $\overline{\theta}$ among these simulations shown in Fig. 5. By contrast, without topographic modifications via the corresponding δ functions, KMLT (orange triangle or circle) and KGEOM (blue triangle or circle) lead to overestimated $\{\overline{\theta}\}$ by 0.65°–0.93°C and underestimated $\{\overline{u}_{\mathrm{BC}}\}$ by 0.03–0.04 m s⁻¹ westward, consistent with the overrestratification by both GM variants shown in Figs. 4c and 4d.

Though Kconst70 and Kvisb can reasonably predict the domain-wide flow baroclinicity in the reference experiment, both GM variants fail to generalize to all 2D parameter perturbation experiments. For instance, the domain-averaged potential temperature is overestimated (underestimated) as the surface wind is weakened (strengthened), indicating a too high (low) magnitude of eddy buoyancy diffusivity and thus further restratification (destratification) of the flow compared to the case of the 3D run. In parallel, as the topographic slope is steepened or the thermal expansion coefficient is elevated, the domain-averaged potential temperature in the 3D simulations increases ($\{\overline{\theta}\}$ increases from 1.44°C in the 2Ws simulation to 2.39°C in the 0.5Ws simulation, and from 1.84°C in the 0.5αθ simulation to 2.17°C in the 2αθ simulation), but remains nearly unchanged across the 2D simulations that incorporate Kconst70 or Kvisb.

Figure 6b shows the domain-averaged baroclinic velocity in all simulations. For experiments with varied winds or half slope widths but a fixed thermal expansion coefficient, the trends of simulated baroclinic velocity relative to that of the 3D runs agree with the trends of the predicted domain-averaged potential temperature shown in Fig. 6a, controlled directly by the thermal wind relation (i.e., a more stratified flow is warmer and yields weakened westward baroclinic transport). For other experiments, however, the baroclinic velocity is further controlled by the specific values of the thermal expansion coefficient, which dictate the buoyancy itself (i.e., $\partial_y b\propto\alpha_\theta\,\partial_y\theta$). The simulated baroclinic velocity thereby shows a distinct trend compared to that of the predicted potential temperature shown in Fig. 6a. Indeed, with Kconst70 or Kvisb serving as the GM closure, the westward baroclinic transports are overestimated in the 2αθ experiment but reasonably estimated in the 0.5αθ and 1.5αθ experiments.

With commonly adopted constant prefactors (e.g., Mak et al. 2018; Jansen et al. 2019), both KGEOM and KMLT lead to overly stratified flows (see also Figs. 4c,d) and thus underestimated westward baroclinic transports across all 2D simulations. By replacing the constant prefactors with the δ functions defined in section 3a, the slope-aware forms of both GM closures, KGEOMslope and KMLTslope, accurately predict the domain-averaged potential temperature (see also Figs. 4e,f) and baroclinic velocity across all 2D simulations.

Several remarks need to be made on the simulation results so far:

  1. The quantitatively similar performance of the Visbeck et al. (1997) scheme to that of the constant diffusivity Kconst70 in the 2D simulations (Figs. 4 and 6) indicates that the cross-slope profiles of Kvisb tend to become uniform in the prognostic calculations. This would disqualify the Visbeck et al. (1997) scheme and other parametrically analogous GM variants (e.g., GM variants that are proportional to the thermal wind velocity; Stone 1972) as useful closures for the cross-slope eddy buoyancy diffusivity.

  2. The raw forms of KMLT (e.g., Jansen et al. 2019) and KGEOM (e.g., Mak et al. 2022) with commonly adopted prefactors are ineffective in predicting the flow baroclinicity across upwelling slope fronts. Yet this may result either from inadequate parametric formulations of both GM variants (i.e., due to the lack of topographic modifications via the δ functions), or from uncertainties associated with the ANN-inferred EKE and EMF in prognostic calculations (e.g., the EKE might have been overestimated by ANN).

  3. Following the preceding point, it is crucial to assess the extent to which the ANN-inferred EKE may maintain physical consistency, given that no numerically or physically motivated constraint has been imposed on the EKE itself (possible deficiencies in the prognostically estimated eddy energy can be partly “amended” by the imposed upper bound of the eddy buoyancy diffusivity).

  4. In parallel, the relative contributions of parameterized EMF and eddy buoyancy fluxes to the slope flow baroclinicity in our 2D simulations remain to be investigated. Such an assessment is particularly relevant to interpreting the efficacy of selected GM variants objectively, since the flow baroclinicity relies both on the eddy buoyancy fluxes and the cross-slope EMF (e.g., Manucharyan and Isachsen 2019) under the transformed Eulerian mean (TEM) framework (Plumb and Ferrari 2005).

We elaborate on these remarks in the following sections.

c. Characteristics of selected GM variants

To further understand the diverging performance of different GM closures presented in the preceding section, we plot in Fig. 7 the cross-slope profiles of tested GM variants (the constant GM variant is fixed at a representative value of 70 m² s⁻¹, indicated by the gray solid curve; profiles of KMLT and KGEOM formulated with only their lower-bound prefactors are shown) diagnosed directly from our 3D simulation (solid curves) and produced in the 2D simulation (dashed curves) upon flow equilibria for the reference model setup. The discrepancy between the diagnosed and prognostic values of each GM variant is indicated by the shading. The diagnosed eddy buoyancy diffusivity Kb from the 3D run following Eq. (4), with the eddy buoyancy flux calculated as $F_b=\overline{\upsilon'b'}$, is plotted using black dots as a reference.

Fig. 7. Selected GM variants diagnosed from the 3D eddy-resolving simulations (solid curves) and produced by the 2D noneddying simulations (dashed curves) upon flow equilibria as functions of offshore distance in the reference experiment. Blue curves indicate the cross-slope profiles of Kvisb defined by Eq. (6); orange curves indicate the cross-slope profiles of KMLT defined by Eq. (7) with the prefactor of αmlt = 0.12; green curves indicate the cross-slope profiles of KGEOM defined by Eq. (9) with the prefactor of αgeom = 0.04; purple curves indicate the cross-slope profiles of KMLTslope defined by Eq. (12); brown curves indicate the cross-slope profiles of KGEOMslope defined by Eq. (13). The diagnosed eddy buoyancy diffusivity Kb from the 3D simulation following Eq. (4), with the eddy buoyancy flux calculated as $F_b=\overline{\upsilon'b'}$, is plotted using black dots as a reference. Dashed black lines indicate the latitudes, y = 150 and 250 km, at which the shelf/slope and slope/open ocean are delineated. The northern sponge layer is shaded.


1) Diagnosed eddy buoyancy diffusivity from the 3D simulation

The diagnosed eddy buoyancy diffusivity Kb drops below 30 m² s⁻¹ over the continental slope (y ∈ [150, 250] km), reaches a global minimum of 3.3 m² s⁻¹ near the shelf break (y = 165 km), and grows monotonically from ∼30 m² s⁻¹ at the slope-ocean junction (y = 250 km) to 1000 m² s⁻¹ at y = 380 km. Such cross-slope variations in Kb indicate an overall suppression effect of the sloping bottom on the eddy buoyancy fluxes (e.g., Blumsack and Gierasch 1972; Mechoso 1980; Isachsen 2011; Chen and Kamenkovich 2013; Manucharyan and Isachsen 2019). Nevertheless, the diagnosed eddy buoyancy diffusivity over the continental shelf should be viewed with caution [see discussions by Wang and Stewart (2020) and Wei and Wang (2021)], given that the shelf in our model is unrealistically deep and that the stratification of the shelf region deviates from that of realistic shelf seas regulated by tidal and buoyancy forcing (Stewart et al. 2018; Thomsen et al. 2021; Si et al. 2022). We thus focus primarily on analyzing the slope and open ocean regions in this work.

2) Limitation of the Visbeck et al. (1997) scheme

In contrast to the diagnosed eddy buoyancy diffusivity Kb, the diagnosed Visbeck et al. (1997) GM variant (solid blue curve in Fig. 7) exhibits a global maximum of Kvisb = 257 m² s⁻¹ at the shelf break, decays toward the open ocean, and reaches a global minimum of ∼2 m² s⁻¹ at y = 394 km. This cross-slope profile of Kvisb is controlled solely by the Eady (1949) growth rate $\overline{b}_y/N$ following (6), which hinges upon the magnitude of the depth-averaged isopycnal slope and is proportional to the thermal wind velocity. In our 3D reference simulation, isopycnals are most tilted over the upper continental slope and increasingly flattened toward the open ocean (see Fig. 1a and black contours in Fig. 4), leading to an enhanced magnitude of Kvisb over the steep slope compared to that in the relatively flat-bottomed region.

When serving as an eddy closure, the cross-slope variation of Kvisb is substantially diminished (dashed blue curve in Fig. 7). This is because the initially overestimated (underestimated) eddy buoyancy diffusivity by Kvisb over continental slope (in the open ocean) tends to restratify (destratify) the flow there, which in turn yields a reduced (enhanced) magnitude of Kvisb upon flow equilibrium. Thus, the performance of the Visbeck et al. (1997) scheme becomes analogous to that of a constant eddy buoyancy diffusivity, as shown in Figs. 4a, 4b, and 6. These analyses are consistent with the assertion of Isachsen (2011) that a traditional GM closure proportional to the thermal wind velocity (e.g., Stone 1972) is unlikely to properly parameterize the eddy buoyancy fluxes across a sharp slope front.

3) Online characteristics of energetically constrained GM variants

The overall suppression of the cross-slope eddy buoyancy diffusivity across steep slopes can be qualitatively captured by the energetically constrained GM variants, KMLT and KGEOM, whose cross-slope profiles are highlighted by orange and green curves, respectively, in Fig. 7. Yet both GM variants overestimate the eddy buoyancy diffusivity by at least a factor of 5.7 across the continental slope and much of the open ocean (y ≤ 350 km) based on the 3D model solutions.

The overestimation of eddy buoyancy diffusivity by KMLT is particularly exacerbated in the open ocean, where the seafloor gets flattened and the topographic Rhines scale “explodes.” This issue is remedied numerically in the 2D simulations via the prescribed upper bound of eddy buoyancy diffusivity (1000 m2 s−1; see section 3c). For comparison with previous works, we note that an eddy buoyancy diffusivity of ∼1000 m2 s−1 can be obtained for y ≥ 350 km in our reference simulation by replacing the topographic Rhines scale with the Rossby deformation radius and resetting αMLT ≃ 0.6 in (7), an approach resembling that of Eden and Greatbatch (2008).

When serving as eddy closures in the 2D simulations, both KMLT and KGEOM deviate from their counterparts as diagnosed from the 3D simulations. In the reference experiment, the overall magnitude of KMLT decreases prognostically across the continental slope and much of the open ocean (e.g., the value of KMLT over the slope ranges across 129–373 m2 s−1 with an average of 180 m2 s−1 in the 3D run but 92–232 m2 s−1 with an average of 153 m2 s−1 in the 2D run); by contrast, the GEOMETRIC GM variant increases in magnitude when serving as an eddy closure in the 2D run (e.g., the value of KGEOM over the slope ranges across 48–199 m2 s−1 with an average of 113 m2 s−1 in the 3D run but 81–245 m2 s−1 with an average of 183 m2 s−1 in the 2D run).

To understand the adjustments of KMLT and KGEOM in prognostic calculations, we plot in Fig. 8 the logarithm of depth-averaged EKE inferred online by the ANN in the 2D reference simulation (color curves) against that produced by the 3D reference run (black dots). The cross-slope profiles of 〈EKE〉 in the 2D simulation resemble those in the 3D run. However, in the 2D simulation prescribed with KMLT (orange curve) or KGEOM (green curve), the EKE is primarily underestimated across the continental slope and open ocean regions. Following Eq. (7), the magnitude of KMLT must then decrease. By contrast, the magnitude of KGEOM depends both on the eddy energy and the flow stratification, the latter of which is measured by the bulk Eady time scale N2/|b¯y| in Eq. (9). An overall enhancement of KGEOM in the 2D simulation thus indicates that the restratifying effect of the GEOMETRIC GM variant on the predicted flow overwhelms the effect of the underestimated eddy energy.

Fig. 8.

(a) The depth-averaged eddy kinetic energy 〈EKE〉 as functions of offshore distance produced by the 3D simulation (black dots) and 2D simulations incorporating KMLT defined by Eq. (7) with the prefactor of αmlt = 0.12 (orange curve), KGEOM defined by Eq. (9) with the prefactor of αgeom = 0.04 (green curve), KMLTslope defined by Eq. (12) (purple curve), and KGEOMslope defined by Eq. (13) (brown curve) in the reference experiment. Dashed black lines indicate the latitudes, y = 150 and 250 km, at which the shelf/slope and slope/open ocean are delineated. The northern sponge layer is shadowed. (b) Scatterplot of the eddy kinetic energy against the bulk isopycnal slope magnitude |Sisoz| averaged across the latitudinal range of y ∈ [150, 350] km of the 3D simulation (black circle) and 2D simulations incorporating KMLT defined by Eq. (7) with prefactors of αmlt = 0.33 (orange triangle) and 0.12 (orange circle), KGEOM defined by Eq. (9) with the prefactors of αgeom = 0.08 (blue triangle) and 0.04 (blue circle), KMLTslope defined by Eq. (12) (green star), and KGEOMslope defined by Eq. (13) (purple star) with the reference model setup.

Citation: Journal of Physical Oceanography 53, 12; 10.1175/JPO-D-23-0017.1

The above results suggest that the ANN-inferred EKE tends to decrease with increasingly stratified slope flow, the latter of which stems from an overestimated diffusivity for parameterizing the cross-slope eddy buoyancy fluxes.

4) On the physical consistencies of ANN-inferred eddy energy and energetically constrained GM closures

As shown in Fig. 7, by incorporating the δ functions, both KMLTslope (purple solid curve) and KGEOMslope (brown solid curve) diagnosed from the 3D run are capable of quantifying the cross-slope eddy buoyancy diffusivity seaward of the shelf break (y ≥ 150 km; see also Fig. 5 of Wang and Stewart 2020). Specifically, the magnitudes of both GM variants grow almost monotonically from ∼10 m2 s−1 at the shelf break to over 200 m2 s−1 in the open ocean (y ≥ 350 km), thus resembling the cross-slope variation of Kb. The model diagnostics drawn from all 3D simulations across the latitudinal range of y ∈ [150, 350] km yield a correlation of 0.71 (0.91) between Kb and KMLTslope (KGEOMslope), similar to the findings of Wang and Stewart (2020).

When serving as eddy closures in the 2D simulation, both KMLTslope and KGEOMslope nearly reproduce their cross-slope profiles as in the 3D run. Specifically, the correlation of KMLTslope (KGEOMslope) in the 2D simulations upon flow equilibria with its counterpart in the 3D runs reaches 0.92 (0.95) across y ∈ [150, 350] km for all parameter perturbation experiments. Because both GM variants depend on the ANN-inferred eddy energy and the flow baroclinicity (i.e., the depth-averaged isopycnal slope embedded in the δ functions and the bulk Eady (1949) time scale for the slope-aware GEOMETRIC scheme), the latter of which is accurately predicted in all 2D simulations with KMLTslope and KGEOMslope (Figs. 4–6), the online EKE must also have been well approximated via the ANN.

Figure 8a shows that the depth-averaged EKE in simulations with KMLTslope and KGEOMslope is enhanced by a factor of up to 5.0 across the continental slope and open ocean regions relative to the case of an overly stratified flow by KMLT or KGEOM. Notice also that the ANN-inferred EKE is now overestimated by a factor of 2.0 at y ≃ 250 km relative to the 3D run, but this slight overestimation is adequately compensated by the slope-aware modification to KMLTslope and KGEOMslope via the δ functions.

The results shown in Figs. 4c–f and 8a suggest that the implemented neural network tends to “inject” more EKE into the energetically constrained GM closures as the flow baroclinicity gets strengthened. To verify this, we plot in Fig. 8b the volume-average of the EKE against that of the bulk isopycnal slope magnitude |Sisoz| across the latitudinal range of y ∈ [150, 350] km in the 2D simulations implemented with energetically constrained GM variants (colored scatters) as well as the 3D simulation (black circle) of the reference experiment. A linear relationship between the eddy energy and the bulk isopycnal slope magnitude arises with a correlation of r = 0.99. Similar linear relationships are identified for other parameter perturbation experiments (Fig. S1 in the online supplemental material).

The tendency of ANN to supply more EKE into the GM closure following the steepening of isopycnals agrees with the characteristics of physically based (e.g., Eden and Greatbatch 2008; Marshall and Adcroft 2010; Jansen et al. 2019) and data-driven (e.g., Partee et al. 2022) estimates of the subgrid-scale eddy energy, which tends to grow in regions with strong flow baroclinicity, such as the western boundary currents and the Southern Ocean. The ANN thus captures the fundamental behaviors of baroclinic eddies, which feed upon the potential energy of the background flow.

d. Impact of ANN-inferred EMF on the flow baroclinicity

Thus far we have focused mainly on the characteristics and effects of different GM variants across upwelling slope fronts. However, cross-slope EMF can be as important as the cross-slope eddy buoyancy fluxes in shaping the large-scale flow baroclinicity. In this section, we quantify the influence of the ANN-inferred EMF on the slope flow baroclinicity in our parameterized simulations.

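Under the TEM framework (Plumb and Ferrari 2005), Manucharyan and Isachsen (2019) derived an approximate relation of ageostrophic circulations at any interior level of $z = z^{*}$ across an upwelling slope front:

$$
\begin{aligned}
\psi_{\mathrm{EMF}} + \psi_{\mathrm{GM}} &\approx \psi_{\mathrm{Ekman}} + \psi_{\mathrm{Res}}, \\
\psi_{\mathrm{Ekman}} &= \frac{\tau}{\rho_0 f_0}, \\
\psi_{\mathrm{EMF}} &= \frac{1}{f_0} \int_{z^{*}}^{0} \partial_y \mathrm{EMF}\, dz, \\
\psi_{\mathrm{GM}} &= -\left. \frac{K_b \bar{b}_y}{N^2} \right|_{z=z^{*}} = \left. K_b S_{\mathrm{iso}} \right|_{z=z^{*}},
\end{aligned}
\tag{18}
$$

where ψEkman, ψEMF, ψGM, and ψRes represent the wind-induced Ekman streamfunction, the EMF-induced streamfunction, the eddy streamfunction resulting from the cross-slope eddy buoyancy fluxes, and the residual streamfunction, respectively. In the adiabatic limit, the residual streamfunction can be neglected (Marshall and Radko 2003), yielding an expression for the equilibrated isopycnal slope:

$$
\left. S_{\mathrm{iso}} \right|_{z=z^{*}} \approx \left. S_{\mathrm{theory}} \right|_{z=z^{*}} \equiv \frac{\tau}{\rho_0 f_0 \left. K_b \right|_{z=z^{*}}} - \frac{\int_{z^{*}}^{0} \partial_y \mathrm{EMF}\, dz}{f_0 \left. K_b \right|_{z=z^{*}}},
\tag{19}
$$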
where Stheory denotes the theoretically estimated isopycnal slope resulting from the balance between wind stress, eddy buoyancy forcing, and eddy momentum forcing upon flow equilibrium. The impact of EMF on the flow baroclinicity is integrated into the second term on the right-hand side of Eq. (19), which states that the wind-input momentum is transferred both downward via the transient eddy form stress represented by the GM closure and across the continental slope via the EMF divergence [see also discussions by Wang and Stewart (2018)].
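The step from the streamfunction balance to Eq. (19) can be spelled out as follows. This is a sketch only: it assumes the sign convention in which ψGM = KbSiso and ψEkman = τ/(ρ0f0) at the interior level, written here as $z^{*}$ (a label chosen for illustration), and it drops ψRes in the adiabatic limit.

$$
\begin{aligned}
\psi_{\mathrm{GM}} &\approx \psi_{\mathrm{Ekman}} - \psi_{\mathrm{EMF}} \qquad (\psi_{\mathrm{Res}} \to 0), \\
\left. K_b S_{\mathrm{iso}} \right|_{z=z^{*}} &\approx \frac{\tau}{\rho_0 f_0} - \frac{1}{f_0} \int_{z^{*}}^{0} \partial_y \mathrm{EMF}\, dz, \\
\left. S_{\mathrm{iso}} \right|_{z=z^{*}} &\approx \frac{\tau}{\rho_0 f_0 \left. K_b \right|_{z=z^{*}}} - \frac{\int_{z^{*}}^{0} \partial_y \mathrm{EMF}\, dz}{f_0 \left. K_b \right|_{z=z^{*}}}.
\end{aligned}
$$

Under these assumptions, setting the EMF divergence to zero leaves only the balance between wind stress and eddy buoyancy forcing, and a larger Kb yields a flatter isopycnal for the same forcing.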

In our prognostic simulations, the ANN-inferred EMF and thus its divergence are coupled with the simulated flow properties, which in turn are coupled with the implemented GM closure. It is therefore difficult to disentangle the effect of the ANN-inferred EMF divergence on the flow baroclinicity from those of other processes in these simulations. However, Eq. (19) allows us to reconstruct the domain-wide equilibrated potential temperature by optionally adjusting the EMF divergence while “freezing” the parameterized eddy buoyancy diffusivity. The frozen eddy buoyancy diffusivity in this case should resemble Kb (e.g., black dots in Fig. 7) to serve as a “correct” GM closure for the equilibrated flow. The thus reconstructed temperature, though being “artificial” by muting all coupled processes in the predicted flow (e.g., any adjustment of the isopycnal slope by EMF should have simultaneously altered the parameterized eddy buoyancy diffusivity), allows us to isolate the influence of eddy momentum forcing on the predicted flow baroclinicity.
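As an illustration of this "frozen diffusivity" reconstruction, the following minimal sketch evaluates Stheory from Eq. (19) on a cross-slope grid while scaling the EMF divergence by an amplification factor. The function name `theoretical_slope`, the argument layout, the default values of `rho0` and `f0`, and all numbers in the usage lines are assumptions made for this example and are not taken from the model configuration.

```python
import numpy as np

def theoretical_slope(tau, K_b, emf, y, z, rho0=1000.0, f0=1.0e-4, amp=1.0):
    """Evaluate S_theory of Eq. (19) on a (ny, nz) grid (illustrative sketch).

    tau : (ny,) wind stress [N m^-2]
    K_b : (ny, nz) frozen eddy buoyancy diffusivity [m^2 s^-1]
    emf : (ny, nz) eddy momentum flux [m^2 s^-2]
    y   : (ny,) offshore distance [m]
    z   : (nz,) depth levels [m], ascending, with z[-1] = 0 at the surface
    amp : factor applied to the EMF divergence (0 removes it, 1 keeps it)
    rho0, f0 : assumed reference density and Coriolis parameter
    """
    # Cross-slope derivative of the EMF at every depth level.
    demf_dy = np.gradient(emf, y, axis=0)
    # Trapezoidal contribution of each layer between z[k] and z[k+1].
    layer = 0.5 * (demf_dy[:, 1:] + demf_dy[:, :-1]) * np.diff(z)
    # Integral from each level z[k] up to the surface (zero at the surface).
    integral = np.zeros_like(demf_dy)
    integral[:, :-1] = np.cumsum(layer[:, ::-1], axis=1)[:, ::-1]
    wind_term = tau[:, None] / (rho0 * f0 * K_b)
    emf_term = amp * integral / (f0 * K_b)
    return wind_term - emf_term

# Hypothetical inputs, chosen only to exercise the function.
y = np.linspace(150e3, 350e3, 41)
z = np.linspace(-1000.0, 0.0, 51)
tau = np.full(y.size, -0.05)
K_b = np.full((y.size, z.size), 100.0)
emf = 1.0e-9 * y[:, None] * np.ones(z.size)

S_with_emf = theoretical_slope(tau, K_b, emf, y, z)
S_no_emf = theoretical_slope(tau, K_b, emf, y, z, amp=0.0)
```

Keeping `K_b` fixed while varying `amp` mirrors the idea of isolating the eddy momentum forcing: only the second term of Eq. (19) changes between calls.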

The reconstructed potential temperature T in equilibrium is governed by the first-order hyperbolic equation,
$$
\partial_y T + S_{\mathrm{theory}}\, \partial_z T = 0,
\tag{20}
$$
which is solved numerically using the Courant–Isaacson–Rees scheme (Courant et al. 1952) constrained by the prescribed temperature profile at the northern sponge layer and the Neumann boundary conditions at the surface/bottom of the model domain. To maintain numerical stability and accuracy, the cross-slope and vertical directions of the model domain are rediscretized using 15 000 and 400 uniform grid points, respectively, with the variables constituting Stheory [e.g., the prognostically derived eddy buoyancy diffusivity and EMF; see Eq. (19)] smoothed through a running-mean operator and then interpolated linearly from the 2D MITgcm grid onto the refined new grid.
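A minimal sketch of such a first-order upwind (Courant–Isaacson–Rees) march is given below, assuming that y is treated as the time-like coordinate and that the solution is advanced southward from the prescribed northern profile. The function name `march_temperature`, the array layout, the zero-gradient ghost cells, and the grid sizes in the usage lines are assumptions for illustration; the sketch omits the smoothing and interpolation steps described above.

```python
import numpy as np

def march_temperature(T_north, S, dy, dz):
    """Solve dT/dy + S dT/dz = 0 by upwind marching toward smaller y.

    T_north : (nz,) prescribed profile at the northernmost column,
              index 0 = bottom, index nz-1 = surface
    S       : (ny, nz) slope field, row 0 = southernmost column
    dy, dz  : uniform grid spacings [m]
    """
    ny, nz = S.shape
    T = np.empty((ny, nz))
    T[-1] = T_north
    for j in range(ny - 1, 0, -1):
        # Marching toward smaller y turns the equation into
        # dT/d(eta) + c dT/dz = 0 with eta = -y and c = -S.
        nu = -S[j] * dy / dz
        if np.max(np.abs(nu)) > 1.0:
            raise ValueError("CFL condition violated: reduce dy or increase dz")
        # Zero-gradient (Neumann) ghost cells at the bottom and the surface.
        Tp = np.pad(T[j], 1, mode="edge")
        backward = Tp[1:-1] - Tp[:-2]   # T_k - T_{k-1}
        forward = Tp[2:] - Tp[1:-1]     # T_{k+1} - T_k
        # Upwind choice: information arrives from below when nu > 0.
        T[j - 1] = T[j] - np.where(nu > 0.0, nu * backward, nu * forward)
    return T

# Hypothetical test case: uniform slope and a linear northern profile.
z = np.linspace(-1000.0, 0.0, 101)
T_north = 20.0 + 0.01 * z
S = np.full((101, z.size), -1.0e-3)
T = march_temperature(T_north, S, dy=1000.0, dz=10.0)
```

For a uniform slope, temperature is constant along the characteristics dz/dy = S, so away from the bottom boundary the southernmost column should equal the northern profile shifted vertically by S times the marching distance, which offers a quick sanity check.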

1) The theoretically reconstructed isopycnals

Figures 9a–c present the reconstructed isopycnals (green solid curves) against those produced in the 2D simulation (red dashed curves) of the reference experiment with the eddy buoyancy diffusivity parameterized by KGEOMslope for three test cases: (i) the EMF divergence exactly follows that in the 2D simulation (Fig. 9a), (ii) the EMF divergence is completely removed (Fig. 9b), and (iii) the EMF divergence extracted from the 2D run is amplified by a factor of 10 (Fig. 9c). To aid in the interpretation of the results, the domain averages of potential temperature resulting from the 3D run, 2D run, and theoretical reconstruction are annotated, and the latitude at which the EMF divergence changes sign is highlighted using gray dashed lines (the westward mean flow is decelerated by EMF shoreward of the gray lines).

Fig. 9.

Selected mean isotherms (starting from 1°C with an interval of 2°C) produced by the reference 2D simulation incorporating KGEOMslope and ANN-inferred EMF (red dashed contours), reconstructed theoretically via Eqs. (19) and (20) (green solid contours), and reconstructed numerically by the 2D simulations incorporating KGEOMslope but ANN-inferred EMF divergence that is further adjusted online (blue solid contours). (a) The theoretically reconstructed isotherms build upon the equilibrated eddy buoyancy diffusivity and EMF divergence in the reference 2D run. (b) The theoretically reconstructed isotherms build upon the equilibrated eddy buoyancy diffusivity in the reference 2D run but no EMF divergence, and the numerically reconstructed isotherms result from KGEOMslope but no EMF divergence online. (c) The theoretically reconstructed isotherms build upon the equilibrated eddy buoyancy diffusivity but tenfold amplified EMF divergence relative to the case of the reference 2D run, and the numerically reconstructed isotherms result from KGEOMslope but tenfold amplified ANN-inferred EMF divergence online. Dashed gray lines indicate the location at which the EMF divergence in the reference 2D run changes sign. Domain averages of potential temperature in the reference 3D run ({θ¯3D}), in the reference 2D run ({θ¯2D}), and from theoretical ({T}) and numerical ({θ¯2DEMF}) reconstructions are annotated. (d) Bulk relative errors of potential temperature from theoretical (blue diamonds) and numerical (orange squares) reconstructions relative to the case of the 2D reference run as functions of the amplification factor of EMF divergence (an amplification factor of 0 indicates that the EMF divergence is excluded; an amplification factor of 1 indicates that the ANN-inferred EMF divergence is not further adjusted online; an amplification factor of n (n > 1) indicates that the ANN-inferred EMF divergence is n-fold amplified).
