1. Introduction
The use of machine learning in the atmospheric sciences has been considered for a long time (Gardner and Dorling 1998) but has only recently started being used in practice (Irrgang et al. 2021; Chantry et al. 2021a; Dueben et al. 2022). Causal analysis has been used to try to understand cause and effect in teleconnection pathways (Kretschmer et al. 2021), and for climate model evaluation (Nowack et al. 2020). Machine learning algorithms have been used to predict modes of climate variability (Ham et al. 2019; Martin et al. 2022) and to replicate and improve upon existing parameterization schemes in climate models (Matsuoka et al. 2020; Meyer et al. 2022b,a; Wu et al. 2022). In this study, we are concerned with the latter application, focusing in particular on the parameterization of nonorographic gravity waves.
Gravity waves are small-scale waves in the atmosphere that have large cumulative effects on global climate and its variability. Although some are resolved, most are beyond the resolution of current global climate models, and so their effects are approximated using physics-based parameterization schemes. In the Met Office climate model, nonorographic gravity waves are represented by a spectral subgrid parameterization scheme (Warner and McIntyre 1999; Scaife et al. 2000, 2002). This scheme represents wave generation by sources such as convection through prescribing a spatially uniform momentum conserving spectrum of waves at an ∼4-km launch level. It also represents conservative wave propagation in the vertical, and wave dissipation by critical-level filtering and saturation (Warner and McIntyre 2001). Inclusion of the scheme in climate models alleviates the so-called cold-pole problem whereby the stratospheric polar night jets were biased strong (Garcia et al. 2017). It also enables the model to internally generate a quasi-biennial oscillation (QBO) of the tropical stratospheric zonal winds (Scaife et al. 2000), a leading source of variability in the tropical stratosphere and an important source of skill within seasonal predictions of the troposphere (O’Reilly et al. 2019; Scaife et al. 2014).
While they generally perform well, and are essential for climate model simulations, all parameterization schemes involve approximations that may affect model accuracy. In the case of nonorographic gravity waves, the scheme is column based, with waves propagating vertically but not laterally as they do in nature (Alexander et al. 2010). Also, the gravity wave source term is idealized and, therefore, does not represent the variability in observed gravity wave sources. Even the more realistic schemes, which parameterize the gravity wave source term from convection using precipitation (Bushell et al. 2015), still contain some approximations. The aim of this study is to use a machine learning algorithm to emulate the behavior of the existing climate model parameterization scheme. In the future, such an algorithm could potentially be trained on observational or high-resolution model data, to include the effects of lateral propagation and realistic gravity wave source terms, and be used to improve on the existing parameterization scheme. In addition, this is one of the first examples of a machine learning algorithm being coupled to the Met Office climate model, and it provides a framework and method upon which to build further applications of machine learning into this model.
It is a prerequisite of any machine learning algorithm intended to replace an existing nonorographic gravity wave parameterization scheme that the algorithm must still enable the climate model to internally generate a QBO. The use of neural networks to emulate nonorographic gravity waves and thereby enable the simulation of a QBO in global atmospheric models has been achieved in recent studies (Chantry et al. 2021b; Espinosa et al. 2022). One difference, described in section 2, is that this study uses a dilated convolutional neural network, as compared with the fully connected neural network used in Chantry et al. (2021b). In addition, we demonstrate the use of a simple one-dimensional model framework to choose the network hyperparameters based on emergent properties of the simulated QBO, prior to coupling the neural network to a climate model. Importantly, we also demonstrate that the network, coupled to the Met Office climate model, is capable of simulating the variability in nonorographic gravity waves associated with El Niño–Southern Oscillation (ENSO) and with stratospheric polar vortex strength, with an accuracy comparable to that achieved by the existing parameterization scheme. This provides evidence for the applicability of neural networks to climate and seasonal forecasting.
Details of the machine learning algorithm, and its coupling to the one-dimensional and climate models, are given in section 2. Results are presented in section 3, and conclusions are drawn in section 4.
2. Data and methods
a. Machine learning algorithm
Generating a zonal nonorographic gravity wave tendency from input dynamical fields falls under the category of “supervised learning” (Russell and Norvig 2010). Common choices of machine learning algorithm for supervised learning are random forests and various types of neural network. We have tested simple linear regression, a random forest, a fully connected neural network, a dense convolutional neural network, and a dilated convolutional neural network. To determine which algorithm performs best, one month of daily zonal wind (U) data, and the equivalent nonorographic gravity wave tendencies produced by the current parameterization scheme, from an existing HadGEM3 (see section 2c for model details) simulation, were used as the input and target data, respectively. U is used as the single input variable for these tests because the zonal nonorographic gravity wave tendencies depend strongly on this field (see also Espinosa et al. 2022). A more detailed analysis of the required input data is performed with the tuned dilated convolutional neural network (see section 3a) but was not repeated for these alternative algorithms. The data at each latitude/longitude point were treated as independent, with the algorithm taking vertical columns of data, equal in size to the number of HadGEM3 model levels, as would be the case for the existing parameterization scheme. In comparing the output of these algorithms with the target nonorographic gravity wave data, a dilated convolutional neural network was found to outperform the other algorithms, for the tests performed here and with the input data used here, with the fully connected neural network performing second best. We repeated these tests for the best two algorithms, the fully connected neural network and dilated convolutional neural network, using 2 years of U data. We found, consistent with the result above, that the dilated convolutional neural network had the lowest mean error, around 12.5% lower than for a fully connected neural network in this case.
When compared with fully connected neural networks, convolutional neural networks (Goodfellow et al. 2016) more efficiently identify and extract the key features from the input data and have fewer parameters to train; in other words, they are able to more efficiently capture relationships and connections between data points. In addition, dilated convolutional neural networks skip data points, sampling for example one-in-two or one-in-three data points, and so are able to cover a larger range of the input column than a dense convolution for a given depth of network. These features may be why we find a dilated convolutional neural network to perform best. However, a detailed investigation is beyond the scope of the present study, and we simply state that a dilated convolutional neural network performs best in our case for the limited testing we performed. In what follows, only a dilated convolutional neural network is used, constructed using the Conv1d module provided within PyTorch (https://pytorch.org/docs/stable/generated/torch.nn.Conv1d.html).
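As a rough illustration of the coverage argument above, the following sketch computes the receptive field (the number of input levels that can influence a single output level) for a stack of stride-1 one-dimensional convolutions. The kernel size of 3 and the dilation schedule are assumptions chosen purely for illustration; they are not the settings of the network used in this study.

```python
def receptive_field(kernel_size, dilations):
    """Input levels seen by one output point of stacked stride-1 1D convolutions.

    Each layer widens the field by (kernel_size - 1) * dilation.
    """
    field = 1
    for dilation in dilations:
        field += (kernel_size - 1) * dilation
    return field


kernel_size = 3  # assumed for illustration
n_layers = 8     # network depth quoted in the text

# Dense convolutions: every layer samples adjacent points (dilation 1).
dense_field = receptive_field(kernel_size, [1] * n_layers)

# Hypothetical dilation schedule mixing one-in-two and one-in-three sampling.
dilated_field = receptive_field(kernel_size, [1, 2, 3, 1, 2, 3, 1, 2])

print(f"dense: {dense_field} levels, dilated: {dilated_field} levels")
```

Under these assumed settings the dilated stack spans roughly twice as many levels of the input column as the dense stack at the same depth and with the same number of weights per layer.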
To ascertain the required amount of training data, the neural network is trained on randomly chosen input samples, with the vast majority of the decrease in mean absolute error (MAE) found when using around 300 000–400 000 samples, and a far more gradual decrease in MAE when using more random samples (see the appendix). We find that 2 years of climate model data, containing significantly more than 400 000 random samples, are sufficient to train the neural network. To train and test the neural network, a 10-yr climate simulation (1988–97) is run to generate the required input and target datasets. For the training dataset, we chose 2.5 years of climate model data (July 1995–January 1998) so as to span a full period of the QBO (and capture variability due to ENSO and the Northern Hemisphere winter stratospheric polar vortex as detailed further in section 3b). For an independent testing dataset, the two years 1992 and 1993 were chosen since these years are sufficiently far from 1995 to 1998 to be independent, and they also span nearly a full QBO period and capture the variability in the Northern Hemisphere winter stratospheric polar vortex. This is comparable to previous studies (Chantry et al. 2021b; Espinosa et al. 2022), and we choose not to utilize the entire 10 years of climate model data for training/testing to save on computational expense. Before training the neural networks, all input and output variables are normalized separately (e.g., when using multiple input variables; see section 3a), to have a mean value of 0 and a standard deviation of 1. When coupling the neural networks to the 1D and the climate models (see sections 2b and 2c), the output gravity wave tendencies are denormalized before use in the physical models. As mentioned above, a convolutional neural network extracts the features from the input data, and with an increased number of layers the architecture adapts to high level features. 
We find that the MAE starts to converge when using approximately six layers or more and so, to limit computational cost, we use eight layers. Columns of data are padded with zeros in convolutional layers (equivalent to padding with average data values since normalized input data are used). Full details of the final dilated convolutional neural network used in this study are given in Table 1.
Table 1. Details of the dilated convolutional neural network.
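The normalization and denormalization steps described above can be sketched as follows. This is a minimal standard-library illustration, not the code used in this study; the function names and sample values are hypothetical. It also shows why zero padding in normalized space corresponds to padding with the mean value in physical units.

```python
from statistics import fmean, pstdev


def fit_scaler(values):
    """Return the mean and (population) standard deviation of one variable."""
    return fmean(values), pstdev(values)


def normalize(values, mean, std):
    """Scale values to zero mean and unit standard deviation."""
    return [(v - mean) / std for v in values]


def denormalize(values, mean, std):
    """Map normalized values back to physical units."""
    return [v * std + mean for v in values]


# Made-up tendency values standing in for one target variable.
tendencies = [-0.4, 0.1, 0.3, 1.2, -0.2]

mean, std = fit_scaler(tendencies)
scaled = normalize(tendencies, mean, std)
restored = denormalize(scaled, mean, std)

# A zero in normalized space maps back to the mean of the physical data.
padding_in_physical_units = denormalize([0.0], mean, std)[0]
```

When several input variables are used, each would be given its own mean and standard deviation, and the statistics fitted on the training data would be reused when the network is coupled to a physical model.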
b. Coupling to a 1D atmospheric model
The coupling is straightforward, with the neural network taking the output zonal wind from the 1D model to compute the nonorographic gravity wave tendency used by the 1D model at the next time step. When coupling this 1D model to an offline version of the Met Office nonorographic gravity wave parameterization scheme (see section 3a), climatological values computed using one month of Met Office climate model data are used for the additional required inputs of meridional wind, density, pressure, and buoyancy frequency. Zonal mean data are used for the purpose of these 1D tests.
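The coupling loop described above can be sketched schematically as follows. This is an illustration under stated assumptions rather than the 1D model used in this study: the forward Euler update, the function names, and the toy emulator (a simple linear damping standing in for the trained network) are all hypothetical.

```python
def run_coupled(u0, emulator, other_tendency, dt, n_steps):
    """Step a column of zonal wind forward, calling the emulator every step.

    emulator(u)       -> gravity wave tendency at each level (stand-in for the network)
    other_tendency(u) -> all remaining tendencies of the 1D model
    """
    u = list(u0)
    history = [list(u)]
    for _ in range(n_steps):
        gw_tendency = emulator(u)          # computed from the current wind profile
        dyn_tendency = other_tendency(u)
        # Assumed forward Euler update; the tendency feeds the next time step.
        u = [ui + dt * (g + d) for ui, g, d in zip(u, gw_tendency, dyn_tendency)]
        history.append(list(u))
    return history


def toy_emulator(u):
    """Placeholder for the neural network: weak linear damping of the wind."""
    return [-0.1 * ui for ui in u]


def no_other_tendency(u):
    """Placeholder for the remaining 1D model dynamics."""
    return [0.0] * len(u)


history = run_coupled([10.0, -5.0], toy_emulator, no_other_tendency, dt=1.0, n_steps=2)
```

Because the emulator sees its own influence on the wind at every step, errors can accumulate or feed back, which is why emergent properties of the coupled system are informative in a way that offline error metrics are not.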
c. Coupling to the Met Office climate model
Once the neural network is producing good results when coupled to the 1D model, the next step is to couple to the Met Office climate model. The version of the climate model used here is an atmosphere only HadGEM3-GA8.0 configuration (Walters et al. 2019). The atmosphere and land surface model horizontal resolution is 0.833° longitude × 0.556° latitude (N216). There are 85 vertical levels extending from the ground to 85 km, consisting of 50 tropospheric levels and 35 stratospheric levels, with approximately 700-m vertical resolution around the tropopause.
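For orientation, the stated grid spacing implies the following approximate number of vertical columns in a single global field. The sketch assumes a regular latitude-longitude grid and simply rounds the quotients; the variable names are illustrative.

```python
# Grid spacing and level count quoted in the text.
dlon_deg = 0.833
dlat_deg = 0.556
n_levels = 85

# Assumption: regular latitude-longitude grid covering the globe.
n_lon = round(360 / dlon_deg)
n_lat = round(180 / dlat_deg)
columns_per_field = n_lon * n_lat

# Each column is one sample of n_levels values per input variable.
print(n_lon, n_lat, columns_per_field)

# Number of global fields needed to exceed 400 000 columns.
fields_needed = -(-400_000 // columns_per_field)  # ceiling division
```

Since each column is treated as an independent sample, about three global fields already contain more than 400 000 columns.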
Coupling the neural network to the climate model is achieved as follows. The trained neural network, as written in PyTorch, is exported to TorchScript format. This can then be loaded and run with C++ code that uses functionality from the libtorch C++ library. It is ensured that the output from the resulting C++ executable matches that from running the PyTorch code in a Python script. Fortran bindings are then written to integrate this C++ executable into the climate model, and a switch is included such that nonorographic gravity wave tendencies can be included in model simulations using either the existing online parameterization scheme or the neural network. Unlike the parameterization scheme, the neural network produces nonzero tendencies at all model levels, but tendencies below the parameterization scheme’s 4 km launch level are not used, that is, they are not added to the zonal wind. We accept that this is a small inefficiency in the neural network, but impacts of output below the launch level on the neural network loss function (see section 3a), and thus the outputs of the neural network, are small. The neural network is only coupled in the zonal direction. Calculation of the meridional component of the nonorographic gravity wave tendencies would effectively require training an additional neural network (Espinosa et al. 2022) as the zonal and meridional components of the tendencies are independent of each other [Eq. (18) of Warner and McIntyre 2001]. Thus, when the online parameterization scheme is replaced by the neural network in the climate model, the meridional component of the nonorographic gravity wave tendency is set to zero in model simulations. The diagnostics considered in this work all depend on the zonal component of the tendencies and are not affected by the meridional component being set to zero.
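The treatment of tendencies below the launch level can be sketched as a simple mask applied before the wind update. This is a hypothetical illustration in Python, not the Fortran/C++ coupling code; the function name, the forward update, and the choice to include the launch level itself are assumptions.

```python
def apply_zonal_tendency(u, emulated_tendency, heights_km, dt, launch_km=4.0):
    """Add the emulated zonal tendency to the wind, at or above the launch level only.

    Levels below launch_km keep their wind unchanged, i.e. the network output
    there is discarded rather than added to the zonal wind.
    """
    return [
        ui + dt * ti if z >= launch_km else ui
        for ui, ti, z in zip(u, emulated_tendency, heights_km)
    ]


heights_km = [1.0, 3.9, 4.0, 20.0]   # illustrative level heights
u = [0.0, 0.0, 0.0, 0.0]
emulated = [1.0, 1.0, 1.0, 1.0]      # nonzero at all levels, as for the network

u_new = apply_zonal_tendency(u, emulated, heights_km, dt=2.0)

# The meridional tendency is simply zero when the network replaces the scheme.
v_tendency = [0.0] * len(heights_km)
```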
3. Results
a. Developing the neural network
The number of input variables used by the network has a large impact on its computational efficiency. This is because, when adding input variables, we scale the number of convolutional filters within the network linearly with the number of input variables, to ensure that there is ample representational capacity within the network and that the experiment is not restricted by compressing the higher number of input features into the same number of feature maps. The nonorographic parameterization scheme requires six input variables: zonal wind U, meridional wind V, pressure P, density RHO, potential temperature THETA, and buoyancy frequency NBV. As mentioned above, U is correlated most strongly with the tendencies, and provides much of the information required by the network. Since the target here is zonal tendencies, there is no dependence on V (Warner and McIntyre 2001). There is some redundancy in the remaining four input variables, and sensitivity tests were performed to see which input variables are required to minimize the MAE of the neural network. These tests used the 2.5-yr training dataset and 2-yr testing dataset described in section 2a. It was found that
- the MAE is similar whether U or dU/dz (denoted DU) is used,
- adding RHO as a second input variable slightly reduces the MAE,
- adding NBV as a second input variable greatly reduces the MAE, particularly in the upper atmosphere, and
- adding a third input variable offers no significant further reduction in the MAE.
These results are summarized in Fig. 1, which shows the vertically integrated MAE, and the MAE as a function of height. The importance of buoyancy frequency, in addition to zonal wind, might have been expected since buoyancy frequency is explicitly included in the dispersion relation for the parameterized gravity waves, and so strongly determines how the waves propagate and therefore saturate (Warner and McIntyre 2001). This appears to be particularly important in the upper atmosphere. Finding number 4 above is particularly interesting because it shows that only two of the six inputs to the parameterization scheme are required. This is also largely expected, since the neural network is supplied with a vertical column of data, and so has some information regarding model level number, and it suggests the network is behaving physically. On the other hand, it also suggests a potential efficiency to be gained by using a machine learning algorithm, or even within the parameterization scheme itself.
Fig. 1. MAE between target data and neural network output gravity wave tendencies (m s−1 day−1), (a) averaged over all model levels and (b) as a function of model level height, for neural networks using different input data. Input data are zonal wind U, vertical gradient in zonal wind DU, buoyancy frequency NBV, and air density RHO. Larger errors are found at higher altitudes, where the absolute magnitude of the gravity wave tendency is larger.
Citation: Artificial Intelligence for the Earth Systems 2, 4; 10.1175/AIES-D-22-0081.1
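The two error measures discussed above, the MAE as a function of model level and its average over all levels, can be computed as in the following sketch. The function names and the tiny example arrays are hypothetical, and a simple unweighted mean over levels is assumed.

```python
def mae_by_level(target, predicted):
    """MAE at each model level, averaged over all sample columns.

    target, predicted: lists of columns, each column a list over model levels.
    """
    n_samples = len(target)
    n_levels = len(target[0])
    return [
        sum(abs(t[k] - p[k]) for t, p in zip(target, predicted)) / n_samples
        for k in range(n_levels)
    ]


def mean_over_levels(profile):
    """Unweighted average of a per-level profile."""
    return sum(profile) / len(profile)


# Two made-up columns with three levels each (lowest level first).
target = [[0.0, 1.0, 2.0], [0.0, 1.0, 2.0]]
predicted = [[0.0, 2.0, 4.0], [0.0, 0.0, 0.0]]

profile = mae_by_level(target, predicted)
overall = mean_over_levels(profile)
```

In this toy example the error grows with height simply because the target values do, mirroring the behavior noted for the real tendencies.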
Therefore, U and NBV are chosen as the input variables for our neural network, which is then coupled to a simple one-dimensional dynamical model simulating zonal mean dynamics within a vertical column on the equator, as described in section 2b. The coupled system is found to simulate a QBO of the stratospheric zonal winds (Fig. 2). Of course, the QBO generated by this simple 1D system will not be structurally as close to the observed QBO as would a QBO generated by a climate model. The important result here is that the QBO generated by the 1D model coupled to the neural network is similar in period, amplitude, and structure to that generated by the same 1D model coupled to an offline version of the climate model’s existing nonorographic gravity wave parameterization scheme. To achieve this similarity, the neural network needed training on a sufficient amount of input data. As detailed in section 2a we use 2.5 years of training data and 2 years of testing data, each covering nearly an entire period of the QBO and, in particular, the transition periods between easterly and westerly phases of the QBO. Use of the emergent properties of the coupled 1D model system was crucial in determining what constitutes a sufficient amount of input data. For example, when initially trained on only one month of input data, the neural network, coupled to this 1D model, generated a QBO with amplitude around 5 times that observed—clearly unphysical and unfit to be coupled to a climate model.
Fig. 2. Daily mean equatorial zonal mean (left) U (m s−1) and (right) nonorographic gravity wave tendencies (NOGW; m s−1 day−1), as simulated by a one-dimensional mechanistic model coupled to (a),(b) an offline version of the Met Office climate model nonorographic gravity wave parameterization scheme and (c),(d) a neural network, as described in section 2b.
Note that the simulated QBO amplitude is a little stronger above 50 km when using the neural network than when using the offline parameterization scheme. However, this height is close to the top boundary of the model, and above the region where the QBO exists in a full climate model. Also, the asymmetry in phases between easterly and westerly periods of the QBO is different for the neural network and offline parameterization scheme systems, but neither is entirely accurate in this regard, and neither is obviously closer to the observations than the other (except, perhaps, the longer westerly phase in the lowermost stratosphere generated when using the offline parameterization scheme). These differences are, therefore, not considered to highlight deficiencies that would be important within the climate model. In addition, the gravity wave tendencies simulated using both the neural network and offline parameterization scheme are significantly greater during transition periods between easterly and westerly phases of the QBO than they are at other times (Fig. 2). This is an emergent physical property captured by the neural network that could not be checked without this coupling to a dynamical model.
An additional property that cannot be considered without coupling to a dynamical model is the effect of the choice of neural network loss function on the generated QBO. The standard loss function is computed from the zonal wind tendency (m s−2), vertically integrated across all levels with no vertical weighting. Other reasonable choices are to divide by pressure, which would emphasize the large mean absolute errors at high altitudes, or to multiply by pressure, which effectively minimizes momentum tendency errors. It is found that dividing by pressure increases the MAE at all heights and, further, a QBO is no longer simulated by the coupled system. Multiplying by pressure performs somewhat better, reducing the MAE at lower altitudes, but increasing it aloft and causing the QBO to become less coherent at high altitudes. The standard loss function is, therefore, found to be the best choice, and remains the one used.
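The effect of the three weighting choices can be illustrated with a toy column in which the absolute error is identical at a low level and a high level. This sketch is an assumption about how such weights might be applied (a per-level multiplicative weight on the absolute error); the function name, pressures, and error values are illustrative only.

```python
def weighted_abs_errors(target, predicted, pressure, mode="none"):
    """Per-level weighted absolute errors for one column.

    mode: "none" (standard), "divide" (by pressure), or "multiply" (by pressure).
    """
    if mode == "none":
        weights = [1.0] * len(pressure)
    elif mode == "divide":
        weights = [1.0 / p for p in pressure]
    elif mode == "multiply":
        weights = list(pressure)
    else:
        raise ValueError(f"unknown mode: {mode}")
    return [w * abs(t - y) for w, t, y in zip(weights, target, predicted)]


pressure = [1000.0, 1.0]   # hPa: near-surface level and upper-level (illustrative)
target = [0.0, 0.0]
predicted = [1.0, 1.0]     # same absolute error at both levels

# Fraction of the total loss contributed by the upper level under each choice.
upper_share = {}
for mode in ("none", "divide", "multiply"):
    terms = weighted_abs_errors(target, predicted, pressure, mode)
    upper_share[mode] = terms[1] / sum(terms)
```

With no weighting the two levels contribute equally; dividing by pressure makes the loss almost entirely an upper-level quantity, and multiplying by pressure makes it almost entirely a lower-level one.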
While it would have been far too computationally expensive to tune all the neural network hyperparameters with a 3D climate model, use of a 1D coupled system has enabled extensive testing. Following these 1D system tests we were able to converge, with minimal computational expense, on a neural network that performs well in terms of producing the QBO. We now proceed with coupling this network to the climate model.
b. Climate model simulations
Two 20-yr simulations are run using the Met Office climate model detailed in section 2c, one using the existing nonorographic parameterization scheme, referred to as “param scheme,” and one using the neural network outlined above instead of the parameterization scheme, referred to as the “neural network” simulation. Details of the coupling of the neural network to the climate model for this latter simulation are given in section 2c. The aim of the neural network is to replicate the behavior of the nonorographic parameterization scheme, permitting the simulation of a QBO as well as including the impact of gravity wave tendencies on climatology and variability throughout the stratosphere in model simulations. We now look in detail at the similarities and differences between the param scheme and neural network simulations.
It is, of course, a prerequisite that using a machine learning algorithm in place of an existing parameterization scheme does not worsen existing model biases. The direct impact of zonal nonorographic gravity wave tendencies is on the zonal wind U, and we consider Northern Hemisphere winter [December–February (DJF)] climatologies as an example (Fig. 3). The stratospheric polar night jet in the model is slightly strong relative to reanalysis data but, reassuringly, is statistically indistinguishable between the param scheme and neural network simulations. Features of the climatological nonorographic gravity wave tendencies are also well captured by the neural network. For example, tendencies are positive at the base of eastward jets, as the gravity waves propagate upward and break due to critical line interactions where the zonal wind matches their phase speed, depositing their momentum and accelerating the winds. Tendencies are then negative higher up, as the remainder of the gravity wave spectrum saturates, leading to a deceleration or weakening of the winter jet (Scaife et al. 2002).
Fig. 3. DJF climatologies for U in (a) ERA5 reanalysis data (Hersbach et al. 2020; Bell et al. 2021), and for (left) U and (right) NOGW for the 20-yr (b),(c) param scheme and (d),(e) neural network simulations.
It was further checked that the neural network did not adversely affect mean sea level pressure, as relatively small accelerations can have a large influence near the surface. As mentioned in section 2c, tendencies below 4 km (the parameterization scheme launch level) are not used, and the mean sea level pressure climatologies are indeed found to be statistically indistinguishable between the two simulations.
The QBO in the neural network simulation is remarkably similar to that in the param scheme simulation, in terms of the QBO amplitude, variability, vertical structure and vertical extent (Fig. 4). The feature of the westerly phase lasting longer than the easterly phase in the lowermost stratosphere is also captured in both simulations. In general, there is far better agreement between the nature of the QBOs generated by the parameterization scheme and the neural network simulation than was the case in the 1D model simulations. It is possible that this is because the QBO is more confined by other dynamics (i.e., meridional circulations and realistic tropical upwelling) in the more complex 3D climate model. The period of the QBO is around 30 months in the param scheme simulation and around 27 months in the neural network simulation and in reanalysis data. Note, however, that there is substantial variability in the length of individual QBO periods in these 20-yr simulations, such that average periods of 30 months and 27 months are not significantly different. One subtle, and expected, difference between the simulations is that, when using a neural network, the gravity wave tendencies are nonzero below the altitude of the launch level used by the parameterization scheme. These nonzero tendencies below the launch level in the neural network simulation are not used (section 2c), and therefore, they do not feed back onto the zonal wind dynamics and they remain small.
Fig. 4. Monthly mean zonal mean (left) U and (right) NOGW, area averaged 5°S–5°N, for (a) ERA5 reanalysis and (b),(c) param scheme and (d),(e) neural network simulations.
Zonal mean cross sections of the zonal wind and gravity wave tendencies regressed onto the QBO index (defined as the monthly mean zonal mean U (30 hPa), area averaged 5°S–5°N) demonstrate further that the vertical structure of the QBO is well captured by the neural network simulation (Fig. 5). In the same way as was seen in the winter mean climatologies (Fig. 3), gravity wave tendencies are positive at the base of the eastward jet, and then negative higher up in the tropical stratosphere. The structure and variability of the QBO itself were tested in the idealized 1D model experiments prior to coupling the neural network to the climate model, and so it is reasonable to hope that the network would perform well here. The extratropics, however, were not represented in this 1D model. Seasonal forecast skill due to the QBO comes primarily via the Holton–Tan mechanism (Holton and Tan 1980; Gray et al. 2018; Andrews et al. 2019), a teleconnection to the extratropics and, specifically, to the stratospheric polar vortex. This could only have been learned by the neural network from behavior during the single QBO period contained within the 2.5 years of model data used to train the network (section 2a). It is therefore of interest whether the neural network is able to capture the general nature of this teleconnection when averaged over all 20 years of the model simulations. For the westerly phase of the QBO shown in Fig. 5, the Holton–Tan mechanism should lead to an anomalously strong stratospheric polar vortex in the Northern Hemisphere. Teleconnections are generally difficult to model, and they rely on many aspects being correctly simulated, for example, small details in the model climatology, wave driving, and simulated physical processes. While the model does not fully capture the strength of this teleconnection, showing vortex anomalies approximately half the magnitude of those in the reanalysis, the neural network performs just as well as the original nonorographic gravity wave parameterization scheme [the gravity wave tendencies in Fig. 5 are produced entirely by either the parameterization scheme (Fig. 5c) or the neural network (Fig. 5e)].
Fig. 5. (a) Zonal mean U in ERA5 reanalysis data, and (left) U and (right) NOGW for the (b),(c) param scheme and (d),(e) neural network simulations, all regressed on the QBO index, with the effects of ENSO regressed out (Gray et al. 2018). Here the QBO index is monthly mean zonal mean U (30 hPa), area averaged over 5°S–5°N, and the ENSO index is the detrended monthly mean 2-m air temperature area averaged over 5°S–5°N, 190°–240°E. The indices are calculated for the reanalysis and for both model simulations, although since the model is atmosphere only the ENSO index is virtually identical for all three cases. Regressions are performed across the 240 months of monthly mean data. Contours are on a logarithmic scale, taking absolute values between 0.5 and 32 in (a), (b), and (d) and taking absolute values between 0.005 and 0.32 in (c) and (e).
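One common way to regress a field on one index while removing the effect of another is a multiple linear regression on both indices, keeping only the coefficient of interest. The sketch below assumes that approach for a single grid point; it is not necessarily the exact procedure used for the figures, and the function name and synthetic data are hypothetical.

```python
def two_predictor_slopes(y, x1, x2):
    """Least-squares slopes (b1, b2) for y = a + b1 * x1 + b2 * x2.

    Solves the 2x2 normal equations on mean-centered data, so b1 is the
    dependence on x1 with the linear effect of x2 regressed out.
    """
    n = len(y)
    my, m1, m2 = sum(y) / n, sum(x1) / n, sum(x2) / n
    yc = [v - my for v in y]
    x1c = [v - m1 for v in x1]
    x2c = [v - m2 for v in x2]
    s11 = sum(a * a for a in x1c)
    s22 = sum(b * b for b in x2c)
    s12 = sum(a * b for a, b in zip(x1c, x2c))
    s1y = sum(a * c for a, c in zip(x1c, yc))
    s2y = sum(b * c for b, c in zip(x2c, yc))
    det = s11 * s22 - s12 * s12
    b1 = (s1y * s22 - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return b1, b2


# Synthetic monthly indices and a field value that depends on both.
qbo_index = [1.0, -1.0, 2.0, -2.0, 0.0, 3.0]
enso_index = [0.5, 1.0, -1.0, 0.0, 2.0, -0.5]
field = [1.0 + 2.0 * q + 3.0 * e for q, e in zip(qbo_index, enso_index)]

qbo_slope, enso_slope = two_predictor_slopes(field, qbo_index, enso_index)
```

Applying the same calculation at every latitude and height, with 240 monthly values per series, would yield a cross section of regression coefficients.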
While the impacts on the QBO of including neural networks in climate models have been studied previously (Chantry et al. 2021b; Espinosa et al. 2022), the gravity wave variability associated with other modes of variability has been less extensively investigated. Along with a full QBO period, the 2.5 years of model data used to train the neural network also contain two La Niña winters (1995/96 and 1996/97) and one El Niño winter (1997/98), as well as one Northern Hemisphere winter with a strong stratospheric polar vortex and one containing a sudden stratospheric warming (note that this describes behavior in the model training data and not in the reanalysis). In addition, the testing dataset (1992 and 1993) contains nearly a full QBO period, one Northern Hemisphere winter with a strong stratospheric polar vortex, and one containing a sudden stratospheric warming. Thus, in terms of ENSO and polar vortex variability, as with the Holton–Tan teleconnection above, the neural network is trained on very few events, and we now consider how it performs on average over the 20-yr simulations.
The impact of ENSO on the annual mean polar stratosphere, as diagnosed from ERA5 data, is an anomalously weak or strong vortex in the Northern or Southern Hemisphere, respectively (Fig. 6a). As with the QBO teleconnection above, the model does not fully capture the strength of these ENSO teleconnections, but the neural network simulation performs just as well as the parameterization scheme (Fig. 6). The structure of the vortex anomalies is well captured by the neural network, as is the structure of the associated gravity wave tendencies, with negative tendencies at the base of the Northern Hemisphere vortex anomaly, and positive aloft. The Northern Hemisphere impacts, namely El Niño and La Niña respectively leading to a weak and strong winter stratospheric polar vortex, are well known (Brönnimann et al. 2004; Bell et al. 2009; Ineson and Scaife 2009; Domeisen et al. 2015). However, these teleconnections are generally considered to be due to planetary wave activity, and here we highlight the potential role of gravity waves. The impact of gravity waves on the stratospheric meridional circulation has been quantified across different climate models (Butchart et al. 2011) and, indeed, it has been shown in general that where gravity waves lead to a change in the climatological state, subsequent planetary wave forcing can be modified as a consequence (Cohen et al. 2013, 2014). Regardless of the initial driver, gravity waves clearly play a role (Figs. 6c,e) and this is well captured by the neural network.
Fig. 6. (a) Zonal mean U in ERA5 reanalysis data, and (left) U and (right) NOGW for the (b),(c) param scheme and (d),(e) neural network simulations, all regressed on the ENSO index, with the effects of the QBO regressed out. Indices are defined as in Fig. 5. Regressions are performed across the 240 months of monthly mean data.
The central date for a sudden stratospheric warming (SSW) is defined by Charlton and Polvani (2007) as the day on which the zonal mean zonal wind U (60°N, 10 hPa) becomes negative (easterly). Using daily data for December–February (90 days × 20 years = 1800 days), we similarly define weak polar vortex days, or “SSW days,” as all those days for which U (60°N, 10 hPa) < 0. Strong polar vortex (SPV) days, or “SPV days,” are defined as days for which U (60°N, 10 hPa) > 53 m s−1, where 53 m s−1 is chosen so as to give similar sized composites for SSW days and SPV days (91 SSW and 96 SPV days for the param scheme simulation, 111 SSW and 124 SPV days for the neural network simulation). Composites of daily mean gravity wave tendency are then formed over these SSW and SPV days for both simulations (Fig. 7). Again, the neural network is found to perform well, simulating gravity wave tendencies comparable to those from the parameterization scheme for both SSW and SPV days. At 60°N, on SSW/SPV days there is negative/positive forcing from the tropopause to around 32 km (10 hPa) and positive/negative forcing above, as expected. As with ENSO teleconnections, planetary waves are traditionally regarded as driving anomalies in the polar vortex strength, but here we highlight that gravity waves also play a role.
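The day selection and compositing described above can be sketched as follows. The thresholds are those given in the text; the function name and the short made-up series (six days, two levels) are hypothetical.

```python
def composite(daily_u, daily_field, condition):
    """Average daily_field over the days on which condition(u) holds.

    daily_u:     one value per day of zonal mean U at 60N, 10 hPa (m/s)
    daily_field: one list per day (e.g. gravity wave tendency by level)
    Returns (mean profile, number of days selected).
    """
    selected = [f for u, f in zip(daily_u, daily_field) if condition(u)]
    if not selected:
        return [], 0
    n_days = len(selected)
    n_levels = len(selected[0])
    mean = [sum(day[k] for day in selected) / n_days for k in range(n_levels)]
    return mean, n_days


# Made-up daily data.
daily_u = [-5.0, 10.0, 60.0, 55.0, -1.0, 30.0]
daily_nogw = [[-1.0, 2.0], [0.0, 0.0], [1.0, -2.0],
              [3.0, -4.0], [-3.0, 4.0], [0.5, 0.5]]

ssw_mean, n_ssw = composite(daily_u, daily_nogw, lambda u: u < 0.0)    # SSW days
spv_mean, n_spv = composite(daily_u, daily_nogw, lambda u: u > 53.0)   # SPV days
```

Note that this selects every day meeting the threshold, rather than only the central date of each event, which is why the composites contain many days per winter.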
Composites of NOGW for (left) SSW days and (right) SPV days for the (a),(b) param scheme and (c),(d) neural network simulations.
4. Conclusions
This study uses machine learning techniques to emulate the behavior of the nonorographic gravity wave parameterization scheme in the Met Office climate model. In our case, a dilated convolutional neural network is found to be the most accurate machine learning algorithm for this task. While the existing parameterization scheme requires six input variables, using just zonal wind and buoyancy frequency as inputs to the neural network is sufficient. Additional input variables add to the computational cost of running the neural network without significantly reducing its mean absolute error any further. The fact that just two input variables are required to emulate the behavior of the existing parameterization scheme hints at potential efficiency savings that could be made within this scheme.
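For readers unfamiliar with dilated convolutions, the following pure-Python sketch shows the basic operation on a single vertical profile. It is not the network used in this study: the kernel size, dilation rates, and function names are hypothetical, and a real layer would sum such convolutions over its input channels (here, zonal wind and buoyancy frequency) and add a bias and nonlinearity.

```python
def dilated_conv1d(profile, kernel, dilation=1):
    """Zero-padded ('same') 1D dilated convolution over model levels.

    Implemented as a cross-correlation, the convention used by most
    deep learning libraries. The kernel length must be odd so that the
    output stays aligned with the input levels.
    """
    size = len(kernel)
    if size % 2 == 0:
        raise ValueError("kernel length must be odd")
    half = (size // 2) * dilation
    padded = [0.0] * half + list(profile) + [0.0] * half
    return [
        sum(kernel[j] * padded[i + j * dilation] for j in range(size))
        for i in range(len(profile))
    ]


def receptive_field(kernel_size, dilations):
    """Number of input levels seen by one output of stacked layers."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


# An impulse at one level is spread to levels two apart when dilation=2.
impulse = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
print(dilated_conv1d(impulse, [1.0, 2.0, 3.0], dilation=2))

# Three stacked layers with kernel size 3 and dilations 1, 2, 4.
print(receptive_field(3, [1, 2, 4]))
```

The point of the dilation is that the receptive field grows quickly with depth (15 levels in the stacked example) without adding parameters, which lets a small network relate the tendency at one level to winds many levels away.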
When trained using just 2.5 years of climate model data, and coupled to a one-dimensional mechanistic model centered on the equator, the neural network reproduces a quasi-biennial oscillation (QBO) of the stratospheric zonal mean zonal wind, with a period, amplitude, and vertical structure comparable to that generated using the existing parameterization scheme. The ability of the neural network to generate a QBO is essential if it is to replace a nonorographic gravity wave scheme in a climate model. The one-dimensional coupled model framework allows us to assess this ability, along with the impact of neural network hyperparameters, amount of training data and choice of loss function on emergent properties such as the period and structure of the simulated QBO, in a simple framework and with minimal computational expense. It allows the physicality and stability of the neural network to be assessed ahead of coupling to a climate model. For example, one month of training data was found insufficient to simulate a QBO with the correct amplitude, and coupling the neural network to a climate model at that stage would almost certainly have resulted in instability.
The neural network has been successfully coupled to the Met Office climate model, using libtorch, C++, and Fortran bindings. Two 20-yr simulations, one using the existing nonorographic parameterization scheme and one using the neural network in place of this parameterization scheme, have been compared. The neural network was found to perform in a way that is comparable to the existing parameterization scheme in terms of simulating a QBO, the teleconnection of this QBO to the stratospheric polar vortex, and gravity wave variability due to ENSO teleconnections and stratospheric polar vortex strength. To our knowledge, it has not previously been shown that a neural network is capable of correctly simulating the gravity wave variability associated with ENSO and polar vortex strength. Such variability is essential for accurate seasonal forecasts of surface climate. In addition, the network was trained on data containing just one ENSO cycle and one weak/strong polar vortex event, and yet was able to correctly simulate the average impact of ENSO and the polar vortex strength on gravity wave tendencies across the 20-yr model simulations. Since each ENSO/polar vortex event is different, this is effectively an out-of-sample test, and the neural network performs well.
Note that while the parameterization scheme conserves momentum, the neural network does not. Enforcing conservation, whether as a soft constraint (i.e., nudging toward conservative values) or as a hard constraint (imposing momentum conservation), was felt to be an unnecessary complication, since the output and performance of the neural network are found to be comparable to those of the parameterization scheme even without it. This lack of conservation does not appear to have negatively impacted the climate model simulation, at least in terms of the diagnostics considered here in present-day simulations. It should also be remembered that, as with the parameterization scheme, any change to the underlying climate model resolution would necessitate retraining/retuning the neural network. Also, only zonal gravity wave tendencies have been computed by the neural network. The climate model simulation running with the neural network had zero meridional nonorographic gravity wave forcing, although this is not found to affect any of the diagnostics considered in this paper. Future versions of the neural network could consider automatic recalibration for different model resolutions (i.e., transfer learning; Guan et al. 2022). An additional neural network could also be included, to compute the meridional tendencies.
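To make the distinction between soft and hard momentum constraints concrete, the sketch below shows one possible form of each. Neither was used in this study. It assumes, purely for illustration, that conservation means the mass-weighted column integral of the predicted tendency should vanish; the penalty weight, the uniform correction, and all names are hypothetical.

```python
def column_momentum(tendency, density, thickness):
    """Mass-weighted column integral of the tendency (assumed diagnostic)."""
    return sum(g * rho * dz for g, rho, dz in zip(tendency, density, thickness))


def soft_constrained_loss(predicted, target, density, thickness, weight=0.1):
    """Mean absolute error plus a penalty on the column momentum residual.

    Soft constraint: the network is nudged toward conservation during
    training but is not guaranteed to satisfy it.
    """
    mae = sum(abs(p - t) for p, t in zip(predicted, target)) / len(predicted)
    residual = column_momentum(predicted, density, thickness)
    return mae + weight * abs(residual)


def hard_constrain(predicted, density, thickness):
    """Remove a uniform acceleration so the column integral is exactly zero.

    Hard constraint: applied to the network output after prediction.
    A uniform correction is the simplest choice, not necessarily the
    most physically sensible one.
    """
    column_mass = sum(rho * dz for rho, dz in zip(density, thickness))
    offset = column_momentum(predicted, density, thickness) / column_mass
    return [p - offset for p in predicted]


# Synthetic three-level column.
prediction = [1.0, -0.5, 2.0]
rho = [1.0, 0.5, 0.25]
dz = [1000.0, 1000.0, 1000.0]
corrected = hard_constrain(prediction, rho, dz)
print(column_momentum(prediction, rho, dz), column_momentum(corrected, rho, dz))
```

In this example the uncorrected column integral is nonzero, while the corrected profile integrates to zero to within rounding error.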
In general, based on the diagnostics we have considered, the neural network simulates gravity wave tendencies with comparable accuracy to the parameterization scheme. Currently, the neural network runs more slowly than the parameterization scheme. This would need to be improved upon before the neural network could run operationally, in addition to testing the impact (if any) of the network on other model diagnostics, for example precipitation, clouds, and radiation. Nevertheless, use of a neural network opens the possibility of improving the representation of gravity wave forcing in climate models in ways that would be very challenging for a parameterization scheme. For example, the parameterization scheme operates only in the vertical, on a one-dimensional column of model data. However, it is well known that the horizontal (or lateral) propagation of gravity waves is also important (Alexander et al. 2010; Plougonven et al. 2020; Ribstein et al. 2022). By using observational data, or data from a high-resolution model that resolves gravity waves, to train a neural network, the influence of the horizontal propagation of gravity waves would be implicitly included (Amemiya and Sato 2016), as would the influence of any gravity wave sources not currently included in parameterization schemes.
Acknowledgments.
This research was made possible by Schmidt Futures, a philanthropic initiative founded by Eric and Wendy Schmidt, as part of the Virtual Earth System Research Institute (VESRI).
Data availability statement.
ERA5 data are available from the Copernicus Climate Data Store (https://cds.climate.copernicus.eu/). Data and code for producing all figures in this paper, along with the 1D model and neural network code, can be found online (https://zenodo.org/record/7737883).
APPENDIX
The dependence of the dilated convolutional neural network normalized MAE on the number of randomly chosen input samples used to train the network is shown for an early version of the neural network using just zonal wind U as input data (Fig. A1) and the final version of the neural network as coupled to the Met Office climate model using both U and NBV as input data (Fig. A2). In both cases, the vast majority of the decrease in MAE is achieved by the time around 300 000–400 000 input samples are used, with a far more gradual decrease in MAE as more random samples are added.
(a) Dependence of normalized MAE on number of input samples, for a neural net trained on U. (b) The area marked by the orange-outlined box in (a).
As in Fig. A1, but for a neural net trained on U and NBV.
REFERENCES
Alexander, M. J., and Coauthors, 2010: Recent developments in gravity-wave effects in climate models and the global distribution of gravity-wave momentum flux from observations and models. Quart. J. Roy. Meteor. Soc., 136, 1103–1124, https://doi.org/10.1002/qj.637.
Amemiya, A., and K. Sato, 2016: A new gravity wave parameterization including three-dimensional propagation. J. Meteor. Soc. Japan, 94, 237–256, https://doi.org/10.2151/jmsj.2016-013.
Andrews, M. B., J. R. Knight, A. A. Scaife, Y. Lu, T. Wu, L. J. Gray, and V. Schenzinger, 2019: Observed and simulated teleconnections between the stratospheric quasi-biennial oscillation and Northern Hemisphere winter atmospheric circulation. J. Geophys. Res. Atmos., 124, 1219–1232, https://doi.org/10.1029/2018JD029368.
Bell, B., and Coauthors, 2021: The ERA5 global reanalysis: Preliminary extension to 1950. Quart. J. Roy. Meteor. Soc., 147, 4186–4227, https://doi.org/10.1002/qj.4174.
Bell, C. J., L. J. Gray, A. J. Charlton-Perez, M. M. Joshi, and A. A. Scaife, 2009: Stratospheric communication of El Niño teleconnections to European winter. J. Climate, 22, 4083–4096, https://doi.org/10.1175/2009JCLI2717.1.
Brönnimann, S., J. Luterbacher, J. Staehelin, T. M. Svendby, G. Hansen, and T. Svenøe, 2004: Extreme climate of the global troposphere and stratosphere in 1940–42 related to El Niño. Nature, 431, 971–974, https://doi.org/10.1038/nature02982.
Bushell, A. C., N. Butchart, S. H. Derbyshire, D. R. Jackson, G. J. Shutts, S. B. Vosper, and S. Webster, 2015: Parameterized gravity wave momentum fluxes from sources related to convection and large-scale precipitation processes in a global atmosphere model. J. Atmos. Sci., 72, 4349–4371, https://doi.org/10.1175/JAS-D-15-0022.1.
Butchart, N., and Coauthors, 2011: Multimodel climate and variability of the stratosphere. J. Geophys. Res., 116, D05102, https://doi.org/10.1029/2010JD014995.
Chantry, M., H. Christensen, P. Dueben, and T. Palmer, 2021a: Opportunities and challenges for machine learning in weather and climate modelling: Hard, medium and soft AI. Philos. Trans. Roy. Soc., A379, 20200083, https://doi.org/10.1098/rsta.2020.0083.
Chantry, M., S. Hatfield, P. Dueben, I. Polichtchouk, and T. Palmer, 2021b: Machine learning emulation of gravity wave drag in numerical weather forecasting. J. Adv. Model. Earth Syst., 13, e2021MS002477, https://doi.org/10.1029/2021MS002477.
Charlton, A. J., and L. M. Polvani, 2007: A new look at stratospheric sudden warmings. Part I: Climatology and modeling benchmarks. J. Climate, 20, 449–469, https://doi.org/10.1175/JCLI3996.1.
Cohen, N. Y., E. P. Gerber, and O. Bühler, 2013: Compensation between resolved and unresolved wave driving in the stratosphere: Implications for downward control. J. Atmos. Sci., 70, 3780–3798, https://doi.org/10.1175/JAS-D-12-0346.1.
Cohen, N. Y., E. P. Gerber, and O. Bühler, 2014: What drives the Brewer–Dobson circulation? J. Atmos. Sci., 71, 3837–3855, https://doi.org/10.1175/JAS-D-14-0021.1.
Domeisen, D. I. V., A. H. Butler, K. Fröhlich, M. Bittner, W. A. Müller, and J. Baehr, 2015: Seasonal predictability over Europe arising from El Niño and stratospheric variability in the MPI-ESM seasonal prediction system. J. Climate, 28, 256–271, https://doi.org/10.1175/JCLI-D-14-00207.1.
Dueben, P. D., M. G. Schultz, M. Chantry, D. J. Gagne, D. M. Hall, and A. McGovern, 2022: Challenges and benchmark datasets for machine learning in the atmospheric sciences: Definition, status, and outlook. Artif. Intell. Earth Syst., 1, e210002, https://doi.org/10.1175/AIES-D-21-0002.1.
Espinosa, Z. I., A. Sheshadri, G. R. Cain, E. P. Gerber, and K. J. DallaSanta, 2022: Machine learning gravity wave parameterization generalizes to capture the QBO and response to increased CO2. Geophys. Res. Lett., 49, e2022GL098174, https://doi.org/10.1029/2022GL098174.
Garcia, R. R., A. K. Smith, D. E. Kinnison, A. de la Cámara, and D. J. Murphy, 2017: Modification of the gravity wave parameterization in the Whole Atmosphere Community Climate Model: Motivation and results. J. Atmos. Sci., 74, 275–291, https://doi.org/10.1175/JAS-D-16-0104.1.
Gardner, M. W., and S. R. Dorling, 1998: Artificial neural networks (the multilayer perceptron)—A review of applications in the atmospheric sciences. Atmos. Environ., 32, 2627–2636, https://doi.org/10.1016/S1352-2310(97)00447-0.
Goodfellow, I., Y. Bengio, and A. Courville, 2016: Deep Learning. MIT Press, 800 pp.
Gray, L. J., J. A. Anstey, Y. Kawatani, H. Lu, S. Osprey, and V. Schenzinger, 2018: Surface impacts of the quasi biennial oscillation. Atmos. Chem. Phys., 18, 8227–8247, https://doi.org/10.5194/acp-18-8227-2018.
Guan, Y., A. Chattopadhyay, A. Subel, and P. Hassanzadeh, 2022: Stable a posteriori LES of 2D turbulence using convolutional neural networks: Backscattering analysis and generalization to higher Re via transfer learning. J. Comput. Phys., 458, 111090, https://doi.org/10.1016/j.jcp.2022.111090.
Ham, Y.-G., J.-H. Kim, and J.-J. Luo, 2019: Deep learning for multi-year ENSO forecasts. Nature, 573, 568–572, https://doi.org/10.1038/s41586-019-1559-7.
Hersbach, H., and Coauthors, 2020: The ERA5 global reanalysis. Quart. J. Roy. Meteor. Soc., 146, 1999–2049, https://doi.org/10.1002/qj.3803.
Holton, J. R., and H.-C. Tan, 1980: The influence of the equatorial quasi-biennial oscillation on the global circulation at 50 mb. J. Atmos. Sci., 37, 2200–2208, https://doi.org/10.1175/1520-0469(1980)037<2200:TIOTEQ>2.0.CO;2.
Ineson, S., and A. A. Scaife, 2009: The role of the stratosphere in the European climate response to El Niño. Nat. Geosci., 2, 32–36, https://doi.org/10.1038/ngeo381.
Irrgang, C., N. Boers, M. Sonnewald, E. A. Barnes, C. Kadow, J. Staneva, and J. Saynisch-Wagner, 2021: Towards neural Earth system modelling by integrating artificial intelligence in Earth system science. Nat. Mach. Intell., 3, 667–674, https://doi.org/10.1038/s42256-021-00374-3.
Kretschmer, M., S. V. Adams, A. Arribas, R. Prudden, N. Robinson, E. Saggioro, and T. G. Shepherd, 2021: Quantifying causal pathways of teleconnections. Bull. Amer. Meteor. Soc., 102, E2247–E2263, https://doi.org/10.1175/BAMS-D-20-0117.1.
Lindzen, R. S., and J. R. Holton, 1968: A theory of the quasi-biennial oscillation. J. Atmos. Sci., 25, 1095–1107, https://doi.org/10.1175/1520-0469(1968)025<1095:ATOTQB>2.0.CO;2.
Martin, Z. K., E. A. Barnes, and E. Maloney, 2022: Using simple, explainable neural networks to predict the Madden-Julian oscillation. J. Adv. Model. Earth Syst., 14, e2021MS002774, https://doi.org/10.1029/2021MS002774.
Matsuoka, D., S. Watanabe, K. Sato, S. Kawazoe, W. Yu, and S. Easterbrook, 2020: Application of deep learning to estimate atmospheric gravity wave parameters in reanalysis data sets. Geophys. Res. Lett., 47, e2020GL089436, https://doi.org/10.1029/2020GL089436.
Meyer, D., S. Grimmond, P. Dueben, R. Hogan, and M. van Reeuwijk, 2022a: Machine learning emulation of urban land surface processes. J. Adv. Model. Earth Syst., 14, e2021MS002744, https://doi.org/10.1029/2021MS002744.
Meyer, D., R. J. Hogan, P. D. Dueben, and S. L. Mason, 2022b: Machine learning emulation of 3D cloud radiative effects. J. Adv. Model. Earth Syst., 14, e2021MS002550, https://doi.org/10.1029/2021MS002550.
Nowack, P., J. Runge, V. Eyring, and J. D. Haigh, 2020: Causal networks for climate model evaluation and constrained projections. Nat. Commun., 11, 1415, https://doi.org/10.1038/s41467-020-15195-y.
O’Reilly, C. H., A. Weisheimer, T. Woollings, L. J. Gray, and D. MacLeod, 2019: The importance of stratospheric initial conditions for winter North Atlantic oscillation predictability and implications for the signal-to-noise paradox. Quart. J. Roy. Meteor. Soc., 145, 131–146, https://doi.org/10.1002/qj.3413.
Plougonven, R., A. de la Cámara, A. Hertzog, and F. Lott, 2020: How does knowledge of atmospheric gravity waves guide their parameterizations? Quart. J. Roy. Meteor. Soc., 146, 1529–1543, https://doi.org/10.1002/qj.3732.
Plumb, R. A., 1977: The interaction of two internal waves with the mean flow: Implications for the theory of the quasi-biennial oscillation. J. Atmos. Sci., 34, 1847–1858, https://doi.org/10.1175/1520-0469(1977)034<1847:TIOTIW>2.0.CO;2.
Ribstein, B., C. Millet, F. Lott, and A. de la Cámara, 2022: Can we improve the realism of gravity wave parameterizations by imposing sources at all altitudes in the atmosphere? J. Adv. Model. Earth Syst., 14, e2021MS002563, https://doi.org/10.1029/2021MS002563.
Russell, S. J., and P. Norvig, 2010: Artificial Intelligence: A Modern Approach. 3rd ed. Prentice Hall, 1152 pp.
Scaife, A. A., N. Butchart, C. D. Warner, D. Stainforth, W. Norton, and J. Austin, 2000: Realistic quasi-biennial oscillations in a simulation of the global climate. Geophys. Res. Lett., 27, 3481–3484, https://doi.org/10.1029/2000GL011625.
Scaife, A. A., N. Butchart, C. D. Warner, and R. Swinbank, 2002: Impact of a spectral gravity wave parameterization on the stratosphere in the Met Office Unified Model. J. Atmos. Sci., 59, 1473–1489, https://doi.org/10.1175/1520-0469(2002)059<1473:IOASGW>2.0.CO;2.
Scaife, A. A., and Coauthors, 2014: Predictability of the quasi-biennial oscillation and its northern winter teleconnection on seasonal to decadal timescales. Geophys. Res. Lett., 41, 1752–1758, https://doi.org/10.1002/2013GL059160.
Walters, D., and Coauthors, 2019: The Met Office Unified Model Global Atmosphere 7.0/7.1 and JULES Global Land 7.0 configurations. Geosci. Model Dev., 12, 1909–1963, https://doi.org/10.5194/gmd-12-1909-2019.
Warner, C. D., and M. E. McIntyre, 1999: Toward an ultra-simple spectral gravity wave parameterization for general circulation models. Earth Planets Space, 51, 475–484, https://doi.org/10.1186/BF03353209.
Warner, C. D., and M. E. McIntyre, 2001: An ultrasimple spectral parameterization for nonorographic gravity waves. J. Atmos. Sci., 58, 1837–1857, https://doi.org/10.1175/1520-0469(2001)058<1837:AUSPFN>2.0.CO;2.
Wu, Y., Z. Sheng, and X. J. Zuo, 2022: Application of deep learning to estimate stratospheric gravity wave potential energy. Earth Planet. Phys., 6, 70–82, https://doi.org/10.26464/epp2022002.