## Abstract

Subgrid-scale (SGS) velocity variations result in gridscale sea surface flux enhancements that must be parameterized in weather and climate models. Traditional parameterizations are deterministic in that they assign a unique value of the SGS velocity flux enhancement to any given configuration of the resolved state. In this study, we assess the statistics of SGS velocity flux enhancement over a range of averaging scales (as a proxy for varying model resolution) through systematic coarse-graining of a convection-permitting atmospheric model simulation over the Indian Ocean and west Pacific warm pool. Conditioning the statistics of the SGS velocity flux enhancement on 1) the fluxes associated with the resolved winds and 2) the precipitation rate, we find that the lack of a separation between “resolved” and “unresolved” scales results in a distribution of flux enhancements for each configuration of the resolved state. That is, the SGS velocity flux enhancement should be represented stochastically rather than deterministically. The spatial and temporal statistics of the SGS velocity flux enhancement are investigated by using basic descriptive statistics and through a fit to an anisotropic space–time covariance structure. Potential spatial inhomogeneities of the statistics of the SGS velocity flux enhancement are investigated through regional analysis, although because of the relatively short duration of the simulation (9 days) distinguishing true inhomogeneity from sampling variability is difficult. Perspectives for the implementation of such a stochastic parameterization in weather and climate models are discussed.

## 1. Introduction

Near-surface winds exert an important control on exchanges of mass, energy, and momentum between the atmosphere and the underlying surface. In weather and climate models, air–sea exchanges are generally expressed as a combination of the concentration difference between the atmosphere and the sea surface and a function of the near-surface wind speed *s* (conventionally at the anemometer height of 10 m):

In Eq. (1), and are respectively the “concentrations” of quantity *X* (in units of *X* per unit mass of air) at the surface and at the anemometer height, is the surface air density, and is a nondimensional function of the wind speed (and potentially other variables such as near-surface stratification). The exchange coefficient depends on wind speed through, for instance, changes in surface roughness, or bubble injection/spray production by breaking surface waves (e.g., Drennan 2006; Edson 2008; Garbe et al. 2014). The overbars in Eq. (1) denote time averaging (typically over windows of ~10 min) separating the turbulent and Reynolds-averaged variations. Although based on theoretical foundations, these parameterizations are generally largely empirical. Furthermore, although they are averaged in time, the expressions relate fluxes at a single point in space to the atmospheric state (and specifically the wind speed) at that location.

Numerical weather and climate models have finite spatial resolution, and require surface fluxes averaged over model grid boxes. Through the dependence of on , the bulk flux parameterization is generally a nonlinear function of wind speed. Thus, the flux averaged over a region of space (such as a grid box) does not equal the flux that would be computed from the averaged wind speed. Furthermore, the gridbox-averaged wind speed itself is not available from the weather or climate model. Rather, the models directly simulate the gridbox averages of the horizontal wind components. Denoting spatial averaging by angle brackets and the wind vector field by we have

The inequality in Eq. (2), which follows mathematically from Jensen’s inequality and the fact that wind speed is a convex function of the wind components, results physically from the existence of subgrid-scale (SGS) velocity variations and is generally most important under conditions of weak mean winds. A similar inequality holds for the time averaging used to separate the turbulent and Reynolds-averaged parts of the flow (e.g., Beljaars 1995; Mahrt and Sun 1995).

For many fluxes (e.g., momentum, gases, and particles), the function that sets the dependency of flux on wind speed, , is found to be convex (). It follows that

where the first inequality follows again from Jensen’s inequality applied to the function *h*, while the second follows from inequality Eq. (2). The spatially and temporally averaged flux [the left-hand quantity in inequality Eq. (3)] is what is desired, while the flux computed from the norm of the space–time mean of the wind vector [the quantity on the right of Eq. (3)] is what is directly available from the resolved state in models.

The fact that space–time–averaged fluxes exceed the fluxes computed from the space–time–averaged wind vector has been recognized for many years, and a number of studies have considered ways of parameterizing this difference (e.g., Godfrey and Beljaars 1991; Mahrt and Sun 1995; Vickers and Esbensen 1998; Redelsperger et al. 2000; Williams 2001; Zeng et al. 2002). A standard approach accounts for the difference between and due to SGS velocity variations through a SGS velocity flux enhancement term:

Standard parameterizations of account for surface flux enhancement due to disorganized near-surface SGS flow associated with shallow and deep convection:

where is the free convective velocity scale determined by the resolved surface buoyancy flux and is the gridbox-averaged precipitation rate. The coefficient and the function have typically been determined empirically from field measurements and cloud-resolving model simulations. Mahrt and Sun (1995) replaced with a term that represented all mesoscale contributions to the area-mean flux (not just those associated with convective precipitation). The observationally based study of Vickers and Esbensen (1998) parameterized from observations taken under fair weather conditions. Redelsperger et al. (2000) and Williams (2001) demonstrated the importance of the contribution of deep convection (represented by precipitation) to observed subgrid-scale velocity variations. Parameterization of the contribution of boundary layer eddies and deep convection to surface flux enhancement was also considered by Zeng et al. (2002), using results from 1 km × 1 km simulations of a cloud-resolving model (CRM) on a 512 km × 512 km domain in the tropical North Atlantic. Studying the dependence of SGS flux enhancements on averaging scale, Mahrt and Sun (1995), Vickers and Esbensen (1998), and Zeng et al. (2002) all found that the corrections become larger for coarser resolutions and proposed power-law expressions for the dependences. In all these studies, deterministic parameterizations of the subgrid-scale velocity flux enhancement were obtained by empirical fits to data. The physical significance of the scatter of these data around the parameterization curves was not addressed.

Another approach to accounting for SGS velocity variations is to explicitly model these through an assumed parametric probability distribution conditioned on resolved scales (e.g., Cakmur et al. 2004; Capps and Zender 2008; Ridley et al. 2013; Zhang et al. 2016). While the parameterizations considered in these studies are probabilistic, they are still deterministic. The probability distributions employed are used to compute statistical moments across the grid box, rather than to generate a sequence of random values.

Deterministic parameterizations of subgrid-scale processes, in which a unique configuration of the resolved variables is associated with a unique value of the parameterized tendency, are only theoretically justified in the presence of a large separation between resolved and unresolved scales. In the absence of such a scale separation, a distribution of parameterized tendencies will be associated with each configuration of the resolved state, and the mathematical form of the parameterization will be stochastic [see the recent review by Berner et al. (2017)]. The data scatter around the curves corresponding to deterministic parameterizations of SGS velocity flux enhancement demonstrates the existence of such stochastic fluctuations [particularly in the CRM-based study of Zeng et al. (2002), in which the deviations clearly cannot be attributed to measurement error]. As is detailed in the review of Berner et al. (2017), the importance of explicitly accounting for stochastic variations around a deterministic parameterization has been demonstrated in a number of studies on weather, seasonal, and climate time scales. In the specific context of air–sea fluxes, Williams (2012) demonstrated that including stochastic flux fluctuations has an effect not just on model variability, but on its mean state (through rectified deepening of the simulated mixed layer). Including stochastic parameterizations into climate models also improves the representation of processes sensitive to air–sea coupling, such as the El Niño–Southern Oscillation (Christensen et al. 2017; Yang et al. 2019), through improving the high-frequency atmospheric response to changes in sea surface temperature.

In this study, we revisit the question of SGS flux enhancement using a 9-day simulation of a convection-permitting (4-km resolution) atmospheric model on a large tropical domain (20°S–20°N, from East Africa to 180°W). By systematically coarse-graining the high-resolution simulation, we are able to analyze the relationship between “true” gridbox-averaged fluxes and the fluxes computed from the gridbox-mean vector wind (the “resolved flux”). We extend previous analyses not only by estimating the deterministic dependence of the true fluxes on resolved variables, but also by modeling the residuals around this empirical fit as a space–time random field. We emphasize the distinction between such a parameterization and the probabilistic but deterministic ones of Cakmur et al. (2004), Capps and Zender (2008), Ridley et al. (2013), and Zhang et al. (2016). The parameterization we develop samples from a random space–time field at each time step: it is explicitly stochastic.

Rather than explicitly develop a parameterization of , we instead consider the difference between the true and resolved fluxes as a random variable conditioned on the resolved flux and the precipitation rate. While this approach is more abstract, it has the benefit of being able to simultaneously account for the differences in resolved and true fluxes due to SGS velocity variations and the nonlinearity of the dependence of the flux on wind speed. Parameterizations constructed in terms of account only for the first of these two issues.

## 2. Model description

Ideally, subgrid-scale wind variability statistics would be measured from observational datasets. However, our analysis requires data of a sufficiently high spatial resolution over a large domain, for which a suitable observational dataset is not available. Instead, we use an existing high-resolution model simulation as our “truth,” produced as part of the U.K. Natural Environment Research Council (NERC) “Cascade” project (Pearson et al. 2010; Love et al. 2011; Holloway et al. 2012). The Cascade project produced convection-permitting, cloud system-resolving simulations with resolutions ranging from 1.5 to 12 km over several large tropical domains using the Met Office’s Unified Model (MetUM).

For this paper, we use the Cascade 4-km resolution tropical Indo-Pacific warm pool integration. This Cascade simulation has proven useful for assessing stochastic parameterization schemes in other coarse-graining studies (Christensen 2019, manuscript submitted to *Quart. J. Roy. Meteor. Soc.*). For full details of the simulation, see Holloway et al. (2012). In summary, the simulation was produced by using the limited-area MetUM version 7.1 (Davies et al. 2005), covering the domain 20°S–20°N, 42°–177°E. The model is semi-Lagrangian and nonhydrostatic. The model has 70 terrain-following hybrid vertical levels, with a variable vertical resolution ranging from tens of meters in the boundary layer to 250 m in the free troposphere, and with the model top at 40 km. The time step was 30 s. Initial conditions were specified from the ECMWF operational analysis. The 4-km simulation formed one of a hierarchy of simulations. First, a 12-km parameterized convection simulation was produced over a domain 1° larger in each direction, with lateral boundary conditions relaxed to the ECMWF operational analysis. The lateral boundary conditions in the 4-km simulation were specified from the 12-km simulation, through a nudged rim of eight model grid points.

The 4-km resolution simulation is “convection permitting.” The Gregory and Rowntree (1990) convection scheme is adapted such that at large convective available potential energy (CAPE) values the convection scheme is effectively turned off, allowing the model’s dynamical equations to represent strong convective events. The convection scheme is active only for weakly unstable situations. The chosen simulation uses Smagorinsky subgrid mixing in the horizontal and vertical dimensions. The simulation begins on 6 April 2009 and spans 10 days, chosen as a case study of an active Madden–Julian oscillation (MJO) event. The data are stored at full resolution in space and once an hour in time. We discard the first day of simulation, because Holloway et al. (2012) demonstrated a strong spinup of the simulation over this period.

Thorough validation of the Cascade simulation has been reported by Holloway et al. (2012, 2013, 2015). The simulation is shown to produce a realistic MJO, including realistic convective organization, MJO strength, and propagation speed (Holloway et al. 2013). This is likely because the model accurately captures fundamental convective processes, including a realistic vertical heating structure (Holloway et al. 2015), realistic generation of eddy available potential energy (Holloway et al. 2013), improved profiles of moist static energy and saturation moist static energy compared to simulations with parameterized convection (Holloway et al. 2012), and a precipitation distribution that is similar to that diagnosed from Tropical Rainfall Measuring Mission (TRMM) observations (Holloway et al. 2012). The model also has a realistic representation of vertical and zonal wind speeds compared with ECMWF operational analysis, although regions of large-scale ascent are less confined than in observations (Holloway et al. 2013).

Figure 1 presents maps of the mean and standard deviation of the wind speed at the base 4 km × 4 km resolution. Large-scale structure in the mean wind speed field across the domain is evident, with a particular contrast between high wind speeds in the equatorward flanks of the subtropical highs in the southern Indian, North Pacific, and South Pacific oceans, and relatively small wind speeds in the equatorial band and northern Indian Ocean. The wind speed standard deviation field displays more localized regions of relatively large values. Maps of the 50th and 95th percentiles of precipitation rate (Fig. 1) also show considerable spatial heterogeneity. In particular, there are large regions of the domain in which the median precipitation rate is 0 mm day^{−1}; a precipitation rate of zero is also the 95th percentile in the Arabian Sea. When interpreting these and subsequent figures, one must remember that the simulation is of quite short duration. We expect that sampling variations will contribute to spatiotemporal variations of statistics.

## 3. Results

In this study, we focus on the effects of spatial averaging on air–sea fluxes computed from bulk formulae [i.e., Eq. (1)]. As such, all fields we consider are assumed to be Reynolds averaged. This assumption is also consistent with the parameterized nature of the model used to produce the simulations that we analyze. For the rest of the study, we will no longer use an overbar to denote time averages.

Rather than focusing on expressions for specific fluxes (such as water vapor, sensible heat, gases, or aerosols), we consider a generic power-law form for the dependence of flux on wind speed. Furthermore, our focus is on subgrid-scale variations in winds, so we will not consider the explicit dependence of fluxes on other state variables (such as the dependence of the exchange coefficient on near-surface stability through the Obukhov length). We therefore take as a simplified nondimensional representation of air–sea flux:

where m s^{−1} is a speed scale. Scaling the wind speed dependence in this way facilitates comparisons of the nondimensional flux *F* for different values of the exponent *n*. Note that for , the flux function is linear in the wind speed and the difference between true and resolved fluxes results only from the difference between and .

The power-law dependence of fluxes on wind speed assumed here is a simplifying approximation. Neglecting the wind speed dependence of the exchange coefficients and the effect of surface currents, atmospheric boundary layer theory predicts values of for heat and water vapor fluxes and for momentum fluxes (e.g., Drennan 2006). When sea-state dependence of exchange coefficients is parameterized in terms of local wind speed, these functional dependencies are changed (and may not be polynomial). A range of empirically based values of *n* have been reported for gases (e.g., Fig. 2.10 of Garbe et al. 2014), with variations depending on factors such as how fluxes are influenced by bubble injection beneath breaking waves. Relatively high values of *n* are often used for aerosol fluxes, which are strongly affected by the production of spray in whitecaps [e.g., the value of 3.41 used for sea salt in Zhang et al. (2016)]. For illustrative purposes, we consider the values *n* = 1, 2, and 3 in this study.

True fluxes averaged over a gridbox domain of area (with *N* in degrees) are then defined as

As the focus of this study is air–sea fluxes, grid boxes at any coarsening scale containing land points are excluded from the analysis. Estimates of the probability density function (pdf) of computed from the raw 4-km resolution model output for *n* = 1, 2, and 3 are presented in Fig. 2 (left column). The flux distributions move to larger values as *n* increases.

The resolved flux [i.e., the flux that would be computed from the gridbox-mean vector wind ] is defined as

For , we know that (with equality holding only if and in the absence of SGS velocity variations). Because the difference between true and resolved fluxes varies over orders of magnitude, our analysis will focus on the log_{10} error process:

The pdfs of are all positively skewed (Fig. 2; this fact is somewhat obscured by the logarithmic scaling). The resolved fluxes are also positively skewed (not shown). Positive skewness results in the mean flux exceeding the most likely value, and provides occasional large magnitude perturbations which are physically realistic and can potentially improve ensemble spread in a forecast setting. It is important that the parameterized error process respects this skewness. In fact, we show in the next section that the distribution of is approximately Gaussian so the difference between true and resolved fluxes is lognormal and therefore positively skewed. The Gaussianity of the log_{10} error process is also of practical importance for generating realizations (particularly in the multivariate setting when the error is considered as a space–time random field). It is interesting to note that other stochastic parameterizations have proposed the use of positively skewed univariate distributions for the stochastic perturbations (Craig and Cohen 2006; Ollinaho et al. 2017).

### a. Whole domain analysis

We first study the log_{10} error process using wind speeds from across the entire domain. Estimates of the pdfs of for , , , , , and and , 2, and 3 are shown in Fig. 2 (center column). For all values of *n*, the distributions of are unimodal such that the most likely value increases with averaging scale *N*: larger averaging scales correspond to larger average differences between true and resolved fluxes. In addition, the values of the log_{10} error generally increase for larger *n*. For the smallest coarsening scales considered, the errors are generally orders of magnitude smaller than the true or resolved fluxes. As the coarsening scale increases, the range of typical error values becomes comparable to the range of typical flux values. In contrast, the distributions of become narrower as *N* increases. Larger averaging areas result in more averaging of SGS fluctuations and a reduction of the standard deviation of , denoted by std. Consistent with the absence of a spectral gap in SGS velocity variations, the most likely error becomes smaller, but the need for stochastic corrections becomes larger as *N* is reduced from coarser to finer resolution.

As a measure of the practical importance of accounting for the difference between and , we consider the probability of the relative error exceeding 10% as a function of the quantiles of resolved flux (Fig. 2, right column). Across all resolutions and flux exponents, the probability of the relative error exceeding this threshold decreases with increasing : relative errors are generally larger for smaller fluxes. The probability of exceeding the 10% relative error threshold also increases with increases of both *N* and *n*. For a resolution of typical of a contemporary general circulation model (GCM), the relative errors in the bottom quartile of fluxes exceed 10% at least 10% of the time for , 35% of the time for , and 60% of the time for .

#### 1) Distribution of conditioned on resolved fluxes

Developing an empirical parameterization of requires conditioning this quantity on resolved variables. We first study the dependence of the log_{10} error process on the resolved flux. Probability distributions of conditioned on for are presented in the left column of Fig. 3. The spreads of these conditional distributions around the conditional means represent variations in the difference between true and resolved fluxes that cannot be accounted for by the resolved flux alone. While the spreads of the conditional distributions are similar for all three values of *n* considered, there are evident differences in the deterministic dependence of on . For , the median of decreases with : the absolute errors are smaller for larger values of the resolved flux. There is little dependence of on resolved flux for , while for , increases with resolved flux (errors are larger for larger fluxes).

These general features of the conditional dependence of on are found for all coarsening scales considered (Fig. 3, central column). Consistent with the behavior of the relative error, the median of increases with coarsening scale: coarser grids result in larger differences between resolved and true fluxes for all values of . In contrast, the spread of the distribution of the log_{10} error process decreases with coarsening scale (Fig. 3, right column): more averaging results in a narrower distribution. While the interquartile range (iqr) of clearly depends on , this dependence is weaker than that of the median log_{10} error (for and ).

To account for the deterministic dependence of on , we construct a polynomial regression model:

From inspection, we determined that a reasonable fit is obtained for cubic regression, namely, . The results are not strongly sensitive to the value selected for *K*; qualitatively similar results were obtained for (not shown). The residual process is that part of the log_{10} error process that cannot be accounted for deterministically by the resolved flux and must be represented stochastically or by further conditioning on other state variables. As we will see in the next section, we can account for some of the variability of by conditioning on precipitation rate. Values of for and are presented in Table 1.

Inspection of the pdfs of conditioned on (Fig. 4, left column) demonstrates that the regression model Eq. (10) has accounted for most of the deterministic dependence of on the resolved flux. This fact is also true for the other coarsening scales considered (not shown). Quantile–quantile plots of against a normal distribution (Fig. 4, center column) demonstrate that while the distribution of the residual process is not exactly Gaussian, deviations from Gaussianity are generally modest. In general, becomes more non-Gaussian with increasing exponent *n*.

By construction, the iqr of conditioned on is the same as that of . Returning to Fig. 3, we can see that for each value of the exponent *n*, changes in *N* affect the overall value of more than the shape of the dependence on . Almost linear behavior in log–log plots of the unconditional against *N* (Fig. 4, right column, inset) implies that the iqr of can be well approximated by a power-law dependence on resolution:

Values of estimated from linear regression of on are presented in Table 2. We note that these parameters depend only weakly on *n* and that this dependence is not systematic. Rescaling according to Eq. (11),

results in the curves of the conditional interquartile range largely collapsing on single curves for each *n* (Fig. 4, right column). Agreement among the rescaled iqr value is generally poorest for smaller values of the resolved flux (possibly due to sampling variability since relatively few data fall in this range). Overall, these results indicate that the resolution dependence of the scale *N* of the explicitly stochastic part of can be well approximated by a power law.

The spatial patterns of the temporal mean and standard deviation of the residuals are shown in Fig. 5 (right column) for and . Spatial structure is evident in both fields, although spatial variations are smoother and less pronounced at coarsening scale (second and fourth rows). The stochastic variability of the field is weaker at coarsening scale , as discussed earlier. Because of the short (9-day) duration of the simulation, we are unable to determine to what extent these structures represent true spatial nonhomogeneity in the residual field and to what extent they result from sampling variability.

The results demonstrate that by using velocity information alone, the log_{10} error can be approximated by a Gaussian random variable with a mean that depends on the resolved flux and a variance that is independent of the resolved flux but that varies as a power law in “resolution” *N*.

#### 2) conditioning the residual process on the precipitation rate

In addition to intrinsic indeterminacy due to the lack of a scale separation in the velocity field, variability of can result from variations in other physically relevant quantities not accounted for in the regression model Eq. (10). Previous studies have shown a relationship between SGS flux enhancement and convective precipitation (e.g., Redelsperger et al. 2000; Williams 2001; Zhang et al. 2016), resulting from disorganized mesoscale surface flows associated with moist convection. The 4-km resolution of the model simulation we are considering is at the edge of being convection permitting. As such, modeled precipitation contains contributions from both resolved and parameterized convection. These resolved and parameterized precipitation fields are available separately, and the relative contribution of both to the total precipitation rate can be determined. Above a threshold precipitation rate of about 0.6 mm day^{−1}, all of the modeled precipitation is associated with resolved processes (not shown). Since the strongest relationship between and precipitation rate *P* is found above this threshold (Fig. 6, left column), in the following calculations we will not distinguish between precipitation derived from resolved or parameterized motions.

The pdf of conditioned on *P* for shows a relatively weak dependence for mm day^{−1} and a steady increase with *P* above this value (Fig. 6, left column). Such a transition indicates a systematic contribution to SGS flux enhancements of disorganized velocity fluctuations associated with deep convection. The transition from relatively weak to strong dependence moves to larger values of *P* and becomes less abrupt for larger coarsening scales. On larger coarsening scales, the sharpness of the transition is smoothed out because the averaging areas will contain regions of larger and smaller precipitation rates. The slope of the dependence of median on *P* for large *P* is about the same for all coarsening scales. The breadth of the conditional distributions systematically decreases with increasing coarsening scale. Again, more averaging results in smaller fluctuations around the mean.

In the same way that we obtained as the residual of a regression of on , we represent the deterministic dependence of on *P* through a regression model on :

The fourth-root transformation of *P* was determined empirically through inspection of the dependence of the median of conditional on *P* (not shown). In the calculation of these regression coefficients, values of mm day^{−1} were neglected (since they represent a point probability mass at this particular value). Plots of the regression model with for and are shown in Fig. 6. The residual is that part of the error process that depends neither on the resolved flux nor on the precipitation rate and that must be represented as explicitly stochastic in the absence of further conditioning. The sequential conditioning of first on and then on *P* is justified by the absence of a strong statistical relationship between resolved flux and precipitation rate (not shown). Values of the coefficients for and are presented in Table 1.

Quantile–quantile plots of against a normal distribution (Fig. 6, center column) show that except for large values of the coarsening scale and exponent the distribution of does not deviate substantially from Gaussian. The deviations that are present are somewhat larger than we found for , perhaps because of the simple form of the regression Eq. (13) does not capture all of the deterministic dependence of on *P*.

As was the case for , the dependence of the unconditional iqr of on coarsening scale *N* can be approximated as a power law:

(Fig. 6, right column, inset). Estimated values of and {obtained from regressing on } are given in Table 2. As was the case with and , the dependence of and on *n* is weak and not systematic. Interquartile ranges of rescaled by this power law,

and conditioned on *P*, collapse to a reasonable approximation on a single curve for each *n* (Fig. 6, right column). While does show systematic dependence on *P*, the variations around a value of 1 are sufficiently small that it is a reasonable first approximation to model this quantity as a constant. Furthermore, this figure clearly shows that the iqr of conditioned on *P* depends only weakly on the value of the exponent *n*. The fact that the values of are smaller than is a result of the reduction of the spread in relative to because of the further conditioning on *P*.

Maps of the time-mean and standard deviation of the residuals are shown in Fig. 5 for and . The statistics of show much less spatial structure than of the ones of , particularly for the standard deviation and at coarser graining-scale (). This fact indicates that much of the spatial structure of is inherited from the precipitation field. The most pronounced features of mean are the negative values in the Indian Ocean, east of Australia, and on the eastern boundary of the domain, and the positive values around the Maritime Continent and the northwestern coast of Australia. Overall, the use of the precipitation field significantly improves the spatial homogeneity of the residuals.

Using data from across the analysis domain, we conclude that the difference between the true and resolved fluxes can be modeled as a lognormal distributed variable, with a median that depends on the value of the resolved flux and the precipitation rate and an iqr that is to a first approximation independent of , *P*, and *n*, and that depends on the coarsening scale through a simple power law.

#### 3) Spatial and temporal correlation structure of and

So far, we have considered only pointwise (marginal) statistics of the error and the residuals and . Since the residuals at nearby times and spatial locations may not be independent, it is appropriate to treat and as space–time random processes. Basic descriptive characterizations of the temporal and spatial structure of these processes are provided by autocorrelation functions in space and time.

Plots of the temporal autocorrelation functions (acf) of for lags of up to 48 h, composited across all points in the model domain, show that on average the memory of the residual process increases with coarsening scale (Fig. 7, upper left). For example, the value of the acf falls below in about 3 h for and 7 h for . On top of the overall decay of correlations, the acf shows a clear diurnal periodicity.

Conditioning on the precipitation rate reduces both the autocorrelation decay time scale and the amplitude of the diurnal cycle in the spatial composite acf of the residuals (Fig. 7, upper right). These changes are consistent with having accounted deterministically for the contribution to SGS velocity variations from organized convective motion associated with precipitation.

Temporal autocorrelation functions at individual spatial locations display considerable variation around the composites shown in the upper panels of Fig. 7. The lower panels of this figure show the and acf composites for and , as well as the interdecile range across all spatial locations. While the spatial spread of the acf of is slightly smaller than that of , both acfs show substantial spatial variations (although the confidence intervals corresponding to a null hypothesis of zero correlation coefficient are broad because of the relatively few degrees of freedom, particularly if the serial dependence of the time series is accounted for). The acf decay length scales increase slightly with *n* (not shown).

Composites of the spatial correlation function of for , and (Fig. 8, upper row) were obtained by averaging the estimated spatial correlation functions centered at a range of different base locations across the domain. The spatial correlation functions are evidently anisotropic, with decay length scales in the zonal that are larger than those in the meridional. This anisotropy and the values of the correlation length scales tend to increase at coarser averaging scales. Similar behavior is seen for the spatial correlation function of (Fig. 8, second row), although the correlation length scales of are smaller than those of . As was the case for the temporal dependence structure, we find that removing the deterministic dependence on precipitation results in a residual field that is more local in space. While spatial correlation scales increase slightly with increases in *n* (not shown), results similar to those shown in Fig. 8 are found for and .

We now consider variations of the spatial correlation function across the domain. For each base point , the spatial autocorrelation function results in a different map. Since a complete characterization of the spatial correlation structures of and is therefore not practical, we adopt the following approach. For , the spatial correlation field across the entire domain is computed at each of a set of base points on a coarse grid. Around each base point, a contour is drawn corresponding to a squared correlation value of 0.5 for the spatial correlation field with that base point. Within such a contour around any base point, the squared spatial correlation values are larger than 0.5. The resulting maps (Fig. 8, third and fourth rows) give some evidence of variations of the spatial correlation functions of and across the domain. In particular, regions of relatively large correlation length scales for are found in the Arabian Sea and Bay of Bengal, as well as in a band extending from the Horn of Africa to west of Australia. Similar features are seen in the spatial correlation structure of , although the variations across the domain are less pronounced, and no atypical structure is seen in the Bay of Bengal. Broadly similar behavior is found for different coarsening scales *N* and flux exponents *n* (not shown).

From the perspective of developing stochastic parameterizations of SGS flux enhancements, in the following section we propose a statistical model that embeds the pointwise and space–time characteristics of presented above. This statistical framework provides a complementary quantification of features described above (spatiotemporal dynamics and marginal distributions) and also allows generation of realistic space–time samples of the SGS flux enhancement. A Gaussian process is used here to model the space–time residual processes and . Gaussian processes are tractable stochastic processes in a multidimensional context (space–time in our case) and the choice of Gaussian marginal distribution is supported by Figs. 4 and 6. Since Gaussian processes are characterized by their first and second moments only and the mean of the residuals does not need to be accounted for, we only consider the specification of the space–time covariance structure in the following.

#### 4) Fitting spatiotemporal covariance structures

To quantify the spatiotemporal dynamics and the spatial anisotropy observed in Figs. 7 and 8, as well as the dependence on the coarsening scales, parametric anisotropic spatiotemporal covariance structures have been fit locally for each of the two residual processes and for various coarsening scales *N*.

##### (i) Spatiotemporal covariance model

The ellipsoidal contour lines present in the observed spatial correlation (Fig. 8) suggest the use of an anisotropic correlation model with different dependence in the meridional and zonal directions determined respectively by parameters and . For simplicity, we assume that the semimajor and semiminor axes of the correlation align with the zonal and meridional directions. We also include temporal dependence scale in the correlation structure. The commonly used power exponential correlation is considered with a 3D anisotropic distance for the space–time coordinates, and is fit to the data:

with the space–time 3D distance

The parameters , , , *σ*, *γ*, and *α* are positive real numbers estimated by a least squares method described below, and *x*, *y*, and *t* respectively represent the latitude, longitude, and temporal coordinates. For more flexibility in the correlation decay, the distance exponent is estimated as a parameter of the covariance model. A nugget is added to the covariance to capture local variance that is not accounted for in the parametric exponential part of the model Eq. (16).

##### (ii) Estimation of the local covariance structure

A moving-window framework is used to estimate the spatial variations of the covariance structure (as in Haas 1990; Kuusela and Stein 2018). More specifically, the whole domain is subdivided into smaller regions of size 400 km × 400 km. Within each window, stationarity is assumed, and the proposed covariance model Eq. (16) is fit independently to the residuals and . To ensure continuity, the windows overlap by 40 km.

Figures 9–11, respectively, show maps of the estimated values of the parameters , , ; *γ*; and *δ*. Parameters are depicted for both processes and , and for the two coarsening scales and . Spatially heterogeneous structure is evident in the maps of the estimates of , , and , as expected given the large size of the domain and the limited temporal duration of the simulation. As observed in the empirical correlation structure (Fig. 8), the parameters estimated from exhibit more homogeneity across the domain, shorter spatial length scales, and less anisotropy (similar ranges of values for and ) than those of . Again, we see that the precipitation field explains much of the spatiotemporal structure of the error process . Zonal correlation elongation (larger values of than ) is evident for both coarsening scales considered. This anisotropy tends to be slightly stronger at coarser scales than finer ones. As indicated by the composite spatial and temporal correlation structures (Figs. 7, 8), the spatial and temporal scales , , and are longer for coarser averaging scales. The larger scales of space and time variations of the error process when *N* is large results from the averaging out of smaller scales.

Figure 10 shows estimates of the parameter *γ* that determines the smoothness of the field. This parameter also shows evidence of spatial heterogeneity. The parameter value is larger when the precipitation is not regressed out, which is another indication that the precipitation results in localized spatial structure in the error process , resulting in a less structured and less smooth residual process. In contrast with the other parameters considered, conditioning on precipitation and varying the coarsening scales have less influence on the intensity of this parameter.

In Fig. 11, the ratio of the nugget *δ* and the variance of the error shows that the nugget parameter tends to have slightly less importance in the overall variance when the precipitation is regressed out. In that case, the residual fields present less unexplained information that cannot be captured by the proposed parametric covariance with a single decay scale in each direction.

Some regions of the maps display atypical behaviors, such as the Arabian Sea and the southeastern part of the Indian Ocean, where the correlation structure is not influenced by the precipitation field. These behaviors are expected because precipitation was almost absent in those regions during the simulation time.

##### (iii) Simulating the error process

To assess the quality of the statistical models we have developed for , we generated samples of and from a Gaussian distribution with zero mean and a covariance specified by the estimated version of Eq. (16). The choice of a Gaussian distribution is justified by the quantile–quantile plots shown in Figs. 4 and 6. Samples of the error process were then constructed via Eqs. (10) and (13).

Figure 12 shows sample time series of the “true” error process and its simulated samples at an arbitrary location for both models Eqs. (10) and (13). While both models capture the range of variations of reasonably well, structure is evident in the log_{10} error process that is captured by Eq. (13) but not by Eq. (10). In particular, the large sustained increase in starting on 11 April is captured by the model including precipitation rate as a regressor, but not by the model based only on the resolved flux. The benefit of conditioning on *P* is evident from this result. Consistent with the results of section 3a(2), the spread of the ensemble of simulations around is smaller for the model Eq. (13) than for Eq. (10): including precipitation as a regressor improves the resolution (sharpness) of the ensemble forecast. The statistical consistency between the observed error process and its samples is further explored through rank histograms at a single location in Fig. 12. When a perfect match exists between the distributions of observations and samples, the rank histogram is expected to be uniform. We observe that the use of precipitation in the regression (lower panel) provides a better statistical calibration than does the regression based on the resolved fluxes only (upper panel). The fact that the rank histogram is not flat for either model reflects that the statistical model does not exactly fit the statistics of either residual process.

The simulated time series also capture the true temporal autocorrelation structure of (Fig. 13). The broader range of acf curves for constructed from realizations of than from realizations of is consistent with the latter being more constrained by resolved variables (which are the same among all realizations). We note that the correlation values for the shortest time lags tend to be underestimated by the proposed models.

Figure 14 depicts maps of the mean square error (MSE) between the “true” error process and the simulated samples. The overall magnitude of the MSE is smaller for the model Eq. (13) than for Eq. (10). Moreover, the former model shows a weaker spatial structure due to the use of the precipitation information. The total MSE is decomposed into its squared bias and centered MSE components to assess the respective contributions of the mean features and of the fluctuations of the fields (Taylor 2001). The squared bias contribution is significantly less than the difference in variability, indicating that both proposed models capture the global mean of the error process reasonably well. However, the bias term exhibits more spatial structure than does the centered MSE, indicating that the proposed models capture well the structure of the stochastic variability of the error process. Including the precipitation field as a predictor improves the ability of the statistical model to account for the mean and fluctuations over the domain, particularly accounting for much of the structure in the squared bias term. Including further predictors might be able to reduce the squared bias term further, particularly around the Arabian Sea and the southeast Indian Ocean.

### b. Local domain analysis

Because of the relatively short duration of the simulation we are considering, some of the apparent spatial nonstationarity in the temporal and spatial autocorrelation functions may result from sampling variability. For example, an animation of the surface wind field over the simulation period (not shown) shows the migration of a strong cyclone from the Arabian Sea to the Bay of Bengal; such a circulation feature is not observed to occur elsewhere in the domain in this 9-day period. Nevertheless, the potential for spatially nonstationary structure motivates repeating the analysis of the relationships between , , and *P* in different subregions of the model domain. Furthermore, previous empirical studies of SGS flux enhancement have considered either observations or model simulations in spatial domains much smaller than the one we study. We therefore reexamine our analysis in model subdomains.

To examine regional variations in and , regression Eqs. (10) and (13) were fit separately on three subdomains (the western Pacific, Arabian Sea, southern Indian Ocean) depicted on Fig. 1. Similarities are evident among the statistical properties of the residuals and in the subregions, in terms of marginal distributions (Fig. 15), spatial correlation (Fig. 16), and temporal structure (not shown). For the most part, we find that coarser averaging scales result in larger departures of the residuals from normality. Consistent with the global analysis, spatial correlations at coarser resolutions appear stronger than the ones at finer resolutions.

However, we also note differences in statistical features between these different regions. As previously observed, the Arabian Sea has atypical characteristics, especially in terms of spatial and temporal dynamics. The spatial correlation scales of and are longer than in the other two subdomains. The absence of precipitation in this area during the period of the model simulation (Fig. 1) is likely responsible for this variant behavior.

Given the short temporal amount of data, it is difficult to distinguish sampling variability from true spatial heterogeneity in the fields. However, the very low precipitation rates over large parts of the model domain (lower than long-term climatological values) do indicate that the limited temporal duration of the simulation is an important factor for the spatial structure.

## 4. Discussion and conclusions

In this study, we have considered the empirical parameterization of the subgrid-scale velocity enhancement of spatially-averaged sea surface fluxes in weather and climate models. Using output from a relatively high-resolution, convection-permitting model simulation, we have shown that the SGS flux enhancement is not a deterministic function of the resolved state. Considering a range of different coarsening scales and flux exponents, and regressing the differences between the true and resolved fluxes on (nonlinearly transformed) resolved flux and precipitation rates, we have obtained residual fields characterizing the essentially stochastic nature of the SGS flux enhancement. The final model that we propose takes the lognormal form

with

where is a Gaussian space–time field with a variance that scales as a power law of the coarsening scale *N*. The residual field has been shown to be correlated in space and time, such that increases in *N* result in increases of both the spatial and temporal correlation decay scales. Modeling the statistics of as a function of coarsening scale *N* is an important step in allowing this parameterization to be scale aware.

Space–time Gaussian process models have been fit through the estimation of parametric covariances. To account for potential spatial inhomogeneity, covariances were fit in a set of overlapping moving windows. This estimation provides insights into the space–time characteristics of the residual fields: we were able to better quantify the spatial and temporal correlation ranges across coarsening scales and across the domain, and to assess the spatial anisotropy of the fields. Furthermore, this framework provides a space–time sampling distribution that could be used in future implementations.

In this study we have treated a 4-km simulation as “truth,” since observational data do not exist at a high-enough resolution over such a large spatiotemporal domain. Because of the realism of the simulation (Holloway et al. 2012, 2013, 2015), the results of this study are a good first indication of the statistics of subgrid scale fluxes. Furthermore, the relatively large precipitation rates which have the strongest deterministic relationship with the error process (Fig. 6) are associated with resolved dynamics rather than parameterized convection. Nevertheless, details of the proposed model, such as the precise values of the regression coefficients, could change if a different model simulation were coarse grained. A further limitation of the study is the restricted spatial domain and length of the simulation: the statistics of SGS fluxes could vary depending on region of the globe and meteorological conditions. A follow up study is planned which will apply these techniques to a different dataset that covers a larger space–time domain to assess the generality of the parameterization.

Because the 4-km resolution of the model is still relatively coarse and the model equations are Reynolds averaged, this analysis does not account for those contributions to SGS velocity flux enhancement that are associated with the model’s existing gustiness parameterization (Walters et al. 2017). Since the main goal of this analysis is to demonstrate the importance of explicitly accounting for the stochasticity of the parameterization, the fact that not all SGS velocity variations are accounted for is not a critical limitation. We expect that if output from higher-resolution observations or model output were used, the magnitude of stochastic fluctuations around the deterministic parameterization would increase.

To construct an empirical parameterization of SGS flux enhancements, we have used the resolved flux and precipitation rate as deterministic predictors of the error process . It is possible that may depend on other modeled quantities and that by including these in the regression model we would further reduce the stochasticity of our parameterization. For example, the dependence of the exchange coefficient on sea surface temperature (through, e.g., changes in near-surface stability) has been neglected. Furthermore, the dependence of the error process on resolved variables may depend on the specific parameterization schemes used in the model. Further investigation of these questions is an interesting direction of future study.

Following standard practice (e.g., Williams 2001), we have neglected the dependence between variations in air density, wind speed, and air–sea concentration difference that can affect area-averaged fluxes [Eq. (1)]. Furthermore, our parameterization is based on the general resolved flux rather than specifically the surface heat flux (through the free convective scale) as in standard gustiness parameterizations (e.g., Beljaars 1995; Mahrt and Sun 1995; Williams 2001). While our approach has the advantage of not requiring an iterative calculation of fluxes, it is further removed from the basic boundary layer physics used in justifying expressions such as Eq. (5). Moreover, many choices regarding the structure of the statistical model (such as the fourth-root transformation of precipitation rate, and the number of terms *K* and *L* in the resolved flux and precipitation rate regressions) were determined through experimentation rather than systematic optimization. A more systematic and objective approach to optimizing the values of these quantities should be considered in future research. Similarly, the consideration of alternative formulations of the statistical model Eq. (18), in terms of both the predictor fields chosen and the model architecture, is an interesting direction of future study. The development of physically based parameterizations (such as that of Williams 2001) rather than empirically based ones is also a potentially important direction of research. Finally, repeating this analysis with longer time series on a larger spatial domain would allow a better determination of spatial and temporal heterogeneities in the statistics of SGS flux enhancements.

The goal of this study has been to demonstrate (via a systematic coarse-graining analysis) the fundamentally stochastic nature of the dependence of area-averaged fluxes on the resolved state and to characterize the structure of the stochastic space–time fields needed to parameterize this dependence. This analysis demonstrated the existence of spatial and temporal dependence in the stochastic parameterization and provided empirical evidence for the inclusion of such correlations in stochastic parameterization schemes (as opposed to treating this structure as a pragmatic solution to improve ensemble spread; see, e.g., Leutbecher et al. 2017). This analysis also highlighted the resolution dependency of such spatiotemporal correlations, which is not currently included in operational stochastic schemes. A future study will report on the result of implementing and testing such a stochastic sea surface flux parameterization in weather and climate models.

## Acknowledgments

Data from the Cascade project is available on request from the NERC Centre for Environmental Data Analysis (CEDA). This research started in a working group supported by the Statistical and Mathematical Sciences Institute (SAMSI). AHM acknowledges support from the Natural Sciences and Engineering Research Council of Canada (NSERC), and thanks SAMSI for hosting him in the autumn of 2017. The effort of Julie Bessac is based in part on work supported by the U.S. Department of Energy, Office of Science, under Contract DE-AC02-06CH11357. The research of HMC was supported by NERC Grant NE/P018238/1. We thank Aneesh Subramanian and two anonymous reviewers for their helpful comments. Data and analysis scripts are available from the authors on request.

## REFERENCES

*Atmosphere–Ocean Interactions*, Vol. 2, W. Perrie, Ed., WIT Press, 1–33.

*ECMWF Workshop on Ocean-Atmosphere Interactions*, Reading, United Kingdom, ECMWF, 7–24, https://www.ecmwf.int/node/9260.

*Ocean-Atmosphere Interactions of Gases and Particles*, P. Liss and M. Johnson, Eds., Springer, 55–112, https://doi.org/10.1007/978-3-642-25643-1_2.

*Geosci. Model Dev. Discuss.*, https://doi.org/10.5194/gmd-2017-291.

*Climate Dyn.*, https://doi.org/10.1007/s00382-019-04660-0.

## Footnotes

© 2019 American Meteorological Society. For information regarding reuse of this content and general copyright information, consult the AMS Copyright Policy (www.ametsoc.org/PUBSReuseLicenses).