1. Introduction
Many atmospheric processes depend on the movement of individual air parcels at scales ranging from tens to hundreds of meters. General circulation models (GCMs) typically operate at horizontal resolutions on the order of 100 km (IPCC 2013). Reanalysis and short-term numerical weather prediction (NWP) applications that require skill in reproducing storm dynamics typically use higher resolutions, spanning from about 10 km for NWP (Johnson et al. 2019) to about 50 km for subseasonal and seasonal forecasting (e.g., Molod et al. 2020). Resolving convective transport, gravity wave motion, cloud and aerosol microphysics, and turbulent mixing often requires meter-scale resolution; these processes thus remain heavily parameterized, even in NWP applications (Bauer et al. 2015). Given the typical horizontal resolution of atmospheric models (∼10–100 km), it is likely that multiple ascending air parcels can be found within each grid cell, each one driven by its own vertical velocity W. This leads to a subgrid distribution of vertical wind velocities, characterized by a standard deviation σW.
The parameterization of σW plays a crucial role in accurately representing clouds and their interaction with aerosol emissions. Aerosol activation into cloud droplets and ice crystals results from the generation of supersaturation in ascending parcels. Gridscale cloud formation rates are obtained by integrating over the subgrid spectrum of vertical wind velocities, determined by σW (Pruppacher and Klett 1997). Variability in σW accounts for about 70% of the total variability in ice crystal and droplet formation rates (Sullivan et al. 2016). Uncertainty in σW thus translates directly into uncertainty in cloud representation, with profound implications for climate predictions (IPCC 2013; Seinfeld et al. 2016).
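As an illustration of this integration, a grid-scale rate can be obtained by weighting a parcel-level rate with a Gaussian W spectrum of width σW, keeping only ascending parcels. This is a minimal sketch; the square-root dependence of the parcel-level rate used below is purely hypothetical.

```python
import numpy as np

def gridscale_rate(point_rate, sigma_w, w_bar=0.0, n=256):
    """Average a parcel-level rate over a Gaussian subgrid spectrum of
    vertical velocity, W ~ N(w_bar, sigma_w), keeping updrafts only."""
    # Quadrature nodes spanning +/- 5 standard deviations
    w = np.linspace(w_bar - 5.0 * sigma_w, w_bar + 5.0 * sigma_w, n)
    pdf = np.exp(-0.5 * ((w - w_bar) / sigma_w) ** 2) / (sigma_w * np.sqrt(2.0 * np.pi))
    rate = np.where(w > 0.0, point_rate(w), 0.0)  # activation only in updrafts
    return np.trapz(rate * pdf, w)

# Hypothetical parcel-level activation rate, proportional to sqrt(W)
rate = gridscale_rate(lambda w: np.sqrt(np.abs(w)), sigma_w=0.5)
```

Because only the positive half of the spectrum contributes, the grid-scale rate depends strongly on σW even when the mean vertical velocity is near zero.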
Atmospheric models traditionally rely on episodic campaign data (Peng et al. 2005; Shi and Liu 2016) and empirical approximations (Morrison et al. 2005; Ghan et al. 1997; Joos et al. 2008; Dean et al. 2007) to estimate σW. More reliable estimates can be obtained using modern turbulence/convection schemes (e.g., Bogenschutz et al. 2013; Lopez-Gomez et al. 2020), as, for example, those adopted in the most recent version of the Community Earth System Model (Danabasoglu et al. 2020). These higher-order schemes, however, depend on numerous parameters that can be challenging to constrain and significantly increase computational cost in climate simulations (Guo et al. 2014). Moreover, these schemes are primarily designed to represent warm, shallow, and stratocumulus clouds, which are strongly influenced by boundary layer turbulence. Consequently, they may tend to underestimate σW in high-level clouds that are impacted by orography and convection (Barahona et al. 2017; Patnaude et al. 2021).
Spatial variability in W at typical GCM resolutions can also be estimated by downsampling high-resolution simulations, such as large-eddy simulations (LES), which explicitly resolve vertical air motion at the scale of a few meters (Lenschow et al. 2012). Due to their computational expense, LES simulations are, however, limited to small domains. Another approach is to use global cloud-resolving models (GCRMs), which explicitly resolve kilometer-scale atmospheric motion and its interaction with cloud microphysics (Satoh et al. 2019). GCRMs provide global coverage at a higher spatial resolution than typical GCMs, although still coarser than LES. GCRMs work by either embedding high-resolution two-dimensional cloud-resolving models within a coarser GCM grid (Terai et al. 2020), or by direct downscaling of the model physics using nonhydrostatic dynamics (Fudeyasu et al. 2008; Putman and Suarez 2011). GCRMs offer significant advantages over traditional, low-resolution GCMs, since they are able to explicitly resolve convection and better link transport, cloud and aerosol microphysical processes, and atmospheric air motion. However, conducting GCRM simulations requires substantial technical resources (Satoh et al. 2019). As a result, most GCRM simulations span relatively short periods, typically a few weeks, limiting their ability to represent climate seasonality (Judt et al. 2021; Satoh et al. 2019). Alternatively, slightly coarser global storm-resolving model (GSRM) simulations can span longer periods, up to a few years (Putman and Suarez 2011).
While downscaling high-resolution simulations offers insights into the climatological behavior of σW, it poses challenges in generating state-dependent parameterizations for standard GCMs. However, these challenges can be overcome by employing artificial neural networks (ANNs) (Rasp et al. 2018; Gettelman et al. 2021; Mooers et al. 2020; Beucler et al. 2021). ANNs have the capability of synthesizing large volumes of data into a compressed representation, while retaining the most significant relationships within the dataset (Goodfellow et al. 2016; LeCun et al. 2015; Schmidhuber 2015). For instance, Rasp et al. (2018) trained an ANN using GCRM output to parameterize subgrid-scale variability of moisture in a GCM, resulting in improved representation of tropical precipitation. The ANN, however, lacked generalization skill for temperatures outside of the training data manifold. Recently proposed architectures show potential in improving the stability of subgrid ANN parameterizations (Lopez-Gomez et al. 2022; Iglesias-Suarez et al. 2023). ANNs are data-driven algorithms, meaning that predictions are made via the identification of mapping functions to transform inputs as opposed to numerical simulations or theoretical models. Successful deep learning applications often require data at the petabyte (1024 terabytes) and exabyte (1024 petabytes) scales (Chi et al. 2016; Laney 2001).
Despite their strength in representing small-scale processes, GCRMs exhibit biases in the representation of turbulence, shallow convection, and cloud microphysics (Roh et al. 2021). To address this, Beucler et al. (2021) demonstrated that enforcing energy and mass conservation during training improves the stability and accuracy of simulations incorporating subgrid-scale ANN parameterizations. However, global constraints such as top of the atmosphere radiative balance may compel ANN parameterizations to compensate for errors in other parts of the GCM as opposed to refining predictions of interest. Additionally, ANN models aiming to replace entire GCM components cannot be directly evaluated against observations, but only as part of the complete GCM simulation. To address these challenges, adopting a process-level approach is desirable, enabling the evaluation of individual parameterizations against experimental observations. Merging observational data with GCRM output during the training of surrogate ANN models may also reduce biases resulting from deficiencies in theoretical models.
Generative algorithms offer a promising approach to incorporate subgrid physics inherent in observations into ANN models while mitigating the impact of experimental errors. These algorithms aim to train ANNs by aligning with the data distribution, rather than simply learning the data representation (Zeng et al. 2021). Global statistics tend to be more resilient to nonsystematic experimental errors compared to individual values, making them valuable for guiding the training process of the ANN. Among the various generative models, the introduction of generative adversarial networks (GANs) has significantly improved accuracy and efficiency (Goodfellow et al. 2016). GANs employ a novel framework where two ANNs, a generator and a discriminator, engage in a “competition” during probabilistic training. The generator produces examples that the discriminator either accepts or rejects based on its own learned representation of the target data distribution (LeCun et al. 2015). GANs can be formulated as semi- or fully supervised algorithms (Mirza and Osindero 2014); more recent advances have improved their stability and convergence (Radford et al. 2015; Arjovsky et al. 2017; Zhu et al. 2017; Berthelot et al. 2017; Creswell et al. 2018; Pan et al. 2020). GANs have been widely successful in various domains including computer graphics and natural language processing (Creswell et al. 2018), and found applications in developing physical models (Willard et al. 2020), and in weather and climate prediction (Leinonen et al. 2019; Bihlo 2021; Besombes et al. 2021).
In this work, we propose a novel generative approach to develop an ANN representation of the subgrid distribution of vertical wind velocity. Our method involves combining W retrievals from various sources, global storm-resolving simulations, and reanalysis products. The key aspect of our approach is the integration of observational constraints directly within the ANN parameterization. Ground-based remote sensing stations worldwide have collected extensive high-frequency radar and lidar measurements, enabling the retrieval of W (Kalesse and Kollias 2013; Giangrande et al. 2016; Newsom et al. 2019). Although these measurements span different time periods and have limited spatial coverage, they collectively provide nearly 100 years of W retrievals at sampling intervals ranging from 2 s to 5 min. We leverage this wealth of observational data to enforce constraints on the ANN model.
2. Components and data
Our parameterization approach uses output from high-resolution simulations, reanalysis products, and observational datasets. These are detailed in this section.
a. The NASA GEOS model and MERRA-2
The NASA Goddard Earth Observing System (GEOS) consists of a set of components that numerically represent different aspects of the Earth system (atmosphere, ocean, land, sea ice, and chemistry), coupled following the Earth System Modeling Framework (https://gmao.gsfc.nasa.gov/GEOS_systems/). In the AGCM mode, atmospheric transport of water vapor, condensate, and other tracers, and associated land–atmosphere exchanges, are computed explicitly, whereas sea ice fraction and sea surface temperature are prescribed as time-dependent boundary conditions (Reynolds et al. 2002; Rienecker et al. 2008). Transport of aerosols and gaseous tracers such as CO are simulated using the Goddard Chemistry Aerosol and Radiation model (GOCART; Chin et al. 2002; Colarco et al. 2010). Cloud microphysics is described using a two-moment scheme where the mixing ratio and number concentration of cloud droplets and ice crystals are prognostic variables for stratiform clouds (i.e., cirrus, stratocumulus) and convective clouds (Barahona et al. 2014; Tan and Barahona 2022). GEOS has been shown to reproduce the global distribution of clouds, radiation, and precipitation in agreement with satellite retrievals and in situ observations (Barahona et al. 2014), and it is used operationally in subseasonal and seasonal forecast prediction (Molod et al. 2020).
The second version of the Modern-Era Retrospective Analysis for Research and Applications (MERRA-2) was the first multidecadal reanalysis where aerosol and meteorological observations are jointly assimilated (Gelaro et al. 2017; Randles et al. 2017). GEOS forms the core model of MERRA-2, which is constrained by ingesting a wealth of data from satellite, ground-based, and aircraft observations using the NASA Data Assimilation (Rienecker et al. 2008) and Aerosol Assimilation Systems (Randles et al. 2017). Because it is highly constrained by observations, MERRA-2 can be collocated in time and space with field retrievals (Gelaro et al. 2017). Section 3b details how this feature allows us to build an ANN model that directly incorporates observational data.
b. GEOS global storm-resolving simulations
Over the last decade, the NASA Global Modeling and Assimilation Office has performed a series of global nonhydrostatic integrations of the NASA GEOS model (Putman et al. 2015) as part of the Dynamics of the Atmospheric General Circulation Modeled on Nonhydrostatic Domains (DYAMOND) project (Stevens et al. 2019). In “nature” mode, these simulations use a free-running configuration of GEOS, constrained only by climatological sea surface temperatures (most recently, a fully coupled atmosphere–ocean, 5-km resolution GSRM simulation has been achieved). Because of this, they represent variability arising solely from the physical relationships within the model, albeit at massive computational expense. The longest of these runs spanned two years at a horizontal resolution of 7 km and is referred to as the GEOS-5 Nature Run (G5NR; Gelaro et al. 2015).
Although G5NR has a slightly lower resolution than current GSRMs, it stands as the only multiyear kilometer-scale simulation achieved thus far. However, G5NR does not resolve boundary layer processes and generally underestimates σW. This is to be expected as a significant portion of the variability in W originates at scales smaller than 7 km. Nevertheless, G5NR captures the impact of large-scale features that trigger small-scale turbulence such as the enhancement of σw over mountain ranges, convective systems, and along jet streams (Barahona et al. 2017). These large-scale effects are challenging to discern solely from ground-based data due to their limited spatial coverage. In this context, G5NR provides a valuable reference derived from physical principles that would enable the ANN parameterization to extrapolate beyond local environments, complementing the observational data.
c. Vertical wind velocity retrievals
We use reported and new W retrievals from ground-based Doppler radar (DR) and Doppler lidar (DLi) instruments at different locations around the world. Both DR and DLi operate in a similar fashion, where backscattered pulses of electromagnetic energy are analyzed in time and space to retrieve W. In DR, the observed mean Doppler velocity is decomposed into the air velocity and the reflectivity-weighted hydrometeor velocity. DR depends on the presence of hydrometeors; hence, W can only be retrieved when clouds are present. Because DLi is sensitive to both clouds and aerosols, W can be retrieved in clear-sky conditions, although it is usually confined to the planetary boundary layer where micrometer-size particles are abundant (Newsom et al. 2019). To take advantage of the strengths of each technique, we have selected 11 diverse sites corresponding to different meteorological conditions, seasonality, and orographic features. Put together, they correspond to more than 100 years of continuous W retrievals.
Table 1 describes the data available at each of the selected sites; their locations are depicted in Fig. 1. Most datasets were obtained from the Atmospheric Radiation Measurement archive (http://www.archive.arm.gov/). The sites at Leipzig, Germany (LEI), and Limassol, Cyprus (LIM), were collected with the Leipzig Aerosol and Cloud Remote Observations System (LACROS; Bühl et al. 2013) within the Cloudnet network (Illingworth et al. 2007) and are reported for the first time in this work. We have selected datasets that span at least a year, to provide a sufficiently large dataset to train the ANN. In general, all the lidar datasets correspond to retrievals done within the planetary boundary layer (Newsom et al. 2019; Berg et al. 2017), whereas the radar retrievals predominantly focus on ice clouds. The only exception is the MAO dataset (Giangrande et al. 2016), which is radar-based and focuses on convective clouds. When σW is not reported (which is particularly the case for Doppler radar datasets), we use the average horizontal wind velocity at each site to calculate σW from the retrieved W (Illingworth et al. 2007).
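In outline, this conversion can be implemented by choosing an averaging window whose duration corresponds to the time taken for one grid cell to advect past the instrument (a frozen-turbulence assumption). The sketch below is illustrative only; the grid_scale value and function name are our own.

```python
import numpy as np

def sigma_w_from_series(w, dt, u_horiz, grid_scale=50e3):
    """Estimate sigma_W from a W time series with sampling interval dt (s),
    assuming the horizontal wind u_horiz (m/s) advects turbulence past the
    instrument: a window of duration grid_scale / u_horiz samples one cell."""
    window = max(2, int(round(grid_scale / u_horiz / dt)))
    n = len(w) // window
    # Standard deviation of W within each non-overlapping window
    return np.std(w[: n * window].reshape(n, window), axis=1, ddof=1)
```

For example, with a 2-s sampling interval and a 10 m s−1 horizontal wind, a 50-km cell corresponds to a window of 2500 samples (about 83 min of data per σW estimate).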
All datasets were interpolated to match the MERRA-2 vertical grid and filtered for outliers, defined as data outside 2.5 standard deviations for each site. To balance the training set, data were augmented by repeating the LIM, LEI, MAN, and MAO sites four times, introducing a 1% random perturbation each time. These sites correspond to cirrus and convective conditions that are underrepresented in the collected dataset. It is challenging to unambiguously split the data to ensure a clear separation between the training and test sets while incorporating observations from different sites with distinct conditions. We adopted a sequential splitting approach, where the observational data were divided into training and testing sets based on time periods. Specifically, we allocated the first 80% of the time periods for training and the last 15% for testing, with a 5% gap to prevent overlap.
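The preprocessing steps above (outlier filtering, augmentation of underrepresented sites, and the sequential split) can be sketched as follows. Function names and the use of Gaussian perturbations are our own illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def filter_outliers(x, n_std=2.5):
    """Drop samples outside n_std standard deviations for a given site."""
    z = np.abs(x - x.mean()) / x.std()
    return x[z <= n_std]

def augment(x, copies=4, noise=0.01):
    """Repeat underrepresented data, adding a 1% random perturbation each time."""
    reps = [x * (1.0 + noise * rng.standard_normal(x.shape)) for _ in range(copies)]
    return np.concatenate([x] + reps)

def sequential_split(x, train=0.80, gap=0.05):
    """Time-ordered split: first 80% train, 5% gap discarded, last 15% test."""
    n = len(x)
    i_tr, i_gap = int(train * n), int((train + gap) * n)
    return x[:i_tr], x[i_gap:]
```

The gap between training and testing periods reduces temporal autocorrelation leaking across the split.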
3. Parameterization approach
The goal of this work is to develop an ANN parameterization that uses low-resolution state input from a GCM to estimate σW. This is accomplished using a two-stage approach. In the first step, we build an ANN, termed “Wnet-prior,” as a surrogate model of the high-resolution G5NR output (section 3a). In the second step, Wnet-prior is incorporated as the first layers of a second ANN, termed “Wnet” (section 3b), which is trained using an adversarial approach. Wnet constitutes the final parameterization, constrained both by the field data and by the physical relations implicit in G5NR. This methodology is depicted in Fig. 2 and detailed below.
a. Generation of an ANN representation of σW from global storm-resolving simulations: Wnet-prior
Figure 2, top left, depicts the development of Wnet-prior. On the “input” side (blue), the G5NR data are downsampled by averaging over a nominal resolution, that is, about 0.5°, to represent the variables resolved in a low-resolution GCM. On the “output” side (green), σW (m s−1) is calculated directly from W using about 64 values for each 0.5° cell; these constitute the target values used for training and validation. Prior experience in upper-tropospheric clouds (Barahona et al. 2017), as well as theoretical considerations, suggested that σW depends on orography, turbulence, convection, winds, and thermodynamics. Based on this, and by trial and error, we selected a set of inputs, XG5NR, at the coarse resolution to train the ANN. These include the Richardson number (Ri, dimensionless), total scalar diffusivity for momentum (Km, in m2 s−1), the three-dimensional wind velocity (U, V, and W, in m s−1), the water vapor, liquid, and ice mass mixing ratios (Qυ, Ql, and Qi in kg kg−1), air density (ρa, in kg m−3), and air temperature (T, in K). These variables are found in typical GCM output.
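A minimal sketch of this coarse-graining, assuming a 2D W field and an 8 × 8 averaging block (about 64 high-resolution values per coarse cell, as above):

```python
import numpy as np

def coarse_grain(field, block=8):
    """Average a high-resolution 2D field over block x block cells,
    mimicking downsampling of kilometer-scale output to ~0.5 degrees."""
    ny, nx = field.shape
    f = field[: ny - ny % block, : nx - nx % block]
    f = f.reshape(ny // block, block, nx // block, block)
    return f.mean(axis=(1, 3))

def sigma_w_target(w_highres, block=8):
    """Target sigma_W: std of the high-resolution W values in each cell."""
    ny, nx = w_highres.shape
    f = w_highres[: ny - ny % block, : nx - nx % block]
    f = f.reshape(ny // block, block, nx // block, block)
    return f.std(axis=(1, 3))
```

The averaged fields play the role of the coarse-resolution inputs XG5NR, while sigma_w_target supplies the corresponding training targets.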
Wnet-prior was designed to work on individual grid cells, that is, to predict a single scalar, σW, from a one-dimensional input vector. Ideally, using three-dimensional (3D) input fields may inform the ANN of large-scale spatial correlations affecting σW, such as orography, convection, and teleconnections (Kärcher and Podglajen 2019). However, it might also make the parameterization resolution dependent, limiting its application in different models. Another caveat is that the field observations used to constrain the ANN (section 3b) would have to have global 3D coverage, which is almost never the case. Even using two-dimensional features (i.e., atmospheric columns) as input (e.g., Rasp et al. 2018) results in “gaps” in the prediction during the refinement step (section 3b) for vertical levels where most of the observational datasets are not available, such as between 4 and 6 km of altitude (not shown). However, when comparing the 2D and 1D models, it was observed that the impact of orography on σW can be effectively approximated by incorporating a specific set of surface variables into the input of the ANN at all levels. Through trial and error (details not shown), we determined that the optimal set of surface variables for this purpose consists of Km, Qυ, ρa, and Ri. As a result, the final input set for the ANN was a 14-dimensional vector, including these variables. The training and optimization procedure for Wnet-prior is detailed in appendix A1.
Data used for training, validation, and testing were randomly selected from the G5NR data without replacement: training data were selected from the years 2005–06 of the simulation and testing from the year 2007. Training data are used to optimize the mapping between input features and targets. At every epoch (training iteration), it is common to evaluate the current state of the ANN on a validation dataset for which the targets are known. Training is stopped based on criteria defined using the validation loss. The trained/validated model is then applied to the test set, consisting of 20 half-hourly output files of randomly selected output from G5NR (∼3 × 10^8 samples), not used during training.
b. Refinement by constrained adversarial training: Wnet
Neural networks are typically trained to identify a nonlinear mapping f between a feature vector X and the corresponding target vector Y, yielding the estimation Ŷ = f(X).
We adapt the cGAN architecture for parameterization development by using the input state to condition the observed and generated distributions, so that they become Pobs(Y|X) and Pgen(Ŷ|X), respectively.
GANs rely on introducing sufficient variability to the ANN, often achieved through the use of random noise. Mirza and Osindero (2014) specifically focused on generating examples that conform to a target distribution by incorporating random noise as input. In contrast, our objective is to develop an ANN regression that accurately captures the target statistics, taking into account the random experimental error associated with the target values. While the input state remains deterministic, the target values exhibit stochastic behavior. Therefore, it is more appropriate to introduce random variability in Y, allowing the target values to distribute around the corresponding states. To address this, we introduce a random perturbation, typically between 1% and 5%, to Y during training. By doing so, the GAN algorithm effectively compels the generator to train on the more probable target values given the input state.
Generally, the loss function of the discriminator is a measure of the statistical distance between the estimated and the target distributions. It was originally based on the Kullback–Leibler divergence (Goodfellow et al. 2014), although a number of other functions have been proposed (Pan et al. 2020). The loss function of the generator is typically written so that it maximizes the loss of the discriminator, hence setting up the adversarial training as a minimax game. Although highly effective, this procedure suffers from the caveat that there is no clear convergence criterion (Berthelot et al. 2017).
Figure 2 (bottom) shows the cGAN architecture. The generator combines the hidden layers from Wnet-prior (nontrainable, preserving the physics learned from G5NR) with two new, trainable layers, one to transform and the other to output. The output of the generator feeds the discriminator, which outputs a single scalar indicating the probability that the generated σw is within Pobs(Y|X). After the constrained adversarial training (CAT) optimization, the trained generator constitutes the final parameterization of σw. In the final architecture, the generator is an ANN with six dense layers of 128 nodes each, and a single-node dense layer as the output. The discriminator consists of an encoder (three dense layers with 128, 64, and 32 nodes, respectively) to compress the data into the latent space (a single 8-node dense layer), a decoder (three dense layers with 32, 64, and 128 nodes, respectively) to sample from the latent space and reconstruct the data, and a single-node dense layer at the output. Although slightly different from the traditional application of an encoder–decoder architecture (Billault-Roux et al. 2023), this design aligns with the concept of the discriminator acting as a pseudoautoencoder, even though it ultimately outputs a probability score. Binary cross entropy is used as the statistical distance in Eqs. (3) and (4).
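The layer dimensions described above can be sketched as untrained forward passes (random weights, shapes only). Concatenating the state and σw at the discriminator input is our assumption about how the conditioning is implemented.

```python
import numpy as np

rng = np.random.default_rng(1)
relu = lambda x: np.maximum(x, 0.0)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

def mlp(x, sizes):
    """Forward pass through dense/ReLU layers with random (untrained) weights."""
    for n_out in sizes:
        w = rng.standard_normal((x.shape[-1], n_out)) * 0.1
        x = relu(x @ w)
    return x

# Generator: six dense layers of 128 nodes (the first ones would be the
# frozen Wnet-prior layers), then a single-node linear output -> sigma_w
def generator(state):                      # state: (batch, 14)
    return mlp(state, [128] * 6) @ rng.standard_normal((128, 1)) * 0.1

# Discriminator: encoder (128-64-32) -> 8-node latent -> decoder (32-64-128)
# -> single-node probability that sigma_w belongs to Pobs(Y|X)
def discriminator(state, sigma_w):
    x = np.concatenate([state, sigma_w], axis=-1)   # condition on the state
    x = mlp(x, [128, 64, 32, 8, 32, 64, 128])
    return sigmoid(x @ rng.standard_normal((128, 1)))
```

In the actual CAT training, only the last two generator layers and the discriminator weights would be updated; the sketch here only verifies that the stated layer sizes compose into a valid network.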
c. Alternative models
We developed three alternative models to investigate the role of different aspects of the CAT algorithm on the ANN performance. The first model, termed “Obs-only,” utilized supervised learning and was trained directly on the observational data without incorporating any information from G5NR. This model focused solely on learning from the observed data.
In the “Transfer” model, we extended the pretrained Wnet-prior model by adding a trainable layer. The final layer of the ANN was then trained using supervised learning with the observational data, resembling a typical transfer learning approach (Daw et al. 2017).
The “EMD” model followed a similar architecture as Wnet, but instead of using binary cross entropy as the loss function (as in Wnet), it employed the Earth mover’s distance (EMD) in Eqs. (3) and (4) (Arjovsky et al. 2017). Neither gradient clipping nor spectral normalization (Miyato et al. 2018) was applied to the EMD function, on the rationale that the conditional state helps to stabilize the training process, which was indeed observed during experimentation.
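For equal-size one-dimensional samples, the EMD (Wasserstein-1 distance) reduces to the mean absolute difference between sorted values, which can be sketched as:

```python
import numpy as np

def emd_1d(x, y):
    """Earth mover's (Wasserstein-1) distance between two equal-size 1D
    samples: the mean absolute difference of the sorted values."""
    assert len(x) == len(y)
    return np.mean(np.abs(np.sort(x) - np.sort(y)))
```

Unlike binary cross entropy, this distance remains finite and informative even when the two distributions have little overlap, which is the usual motivation for Wasserstein-type GAN losses (Arjovsky et al. 2017).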
The Obs-only and Transfer models represent alternative parameterizations that do not aim to precisely reproduce the observed distribution; they focus on incorporating the observed data. On the other hand, the EMD model, despite being PDF-based, acts as an integral constraint on the generated distribution, Pgen(Ŷ|X).
Table 2. ANN models developed in this work.
4. Results
To test the effectiveness of the training procedure outlined above, we assessed the accuracy of Wnet-prior at reproducing G5NR output and the skill of Wnet at reproducing the observed statistics at each site.
Wnet-prior captures the global spatial distribution of σw predicted by G5NR. Figure 4 compares σw obtained from G5NR and the Wnet-prior model calculated on the test set. Different processes govern the distribution of σw at different atmospheric levels. Near the surface (900 hPa, Fig. 4), wind shear near the continental shores enhances variability in W. Similarly, shallow convection enhances σw, evident in the tropics and the storm tracks of the midlatitudes of the Southern Hemisphere. G5NR depicts such dependencies as a result of the resolved synoptic-scale motion (Stull 1988). These are well reproduced by Wnet-prior. However, the main factors driving σw in the planetary boundary layer (PBL), that is, buoyancy and turbulence, cannot be resolved by the G5NR simulation as it would require much higher spatial resolution. As a result, σw is about an order of magnitude lower than typically observed values within the PBL.
At 500 hPa (Fig. 4), orographic features and deep convection are the main drivers of variability (Barahona et al. 2017; Dean et al. 2007). This is evident in the tropical oceans and over the Tibetan plateau, the Andes, and the west coast of North America where σw peaks in G5NR. Wnet-prior reproduces such patterns. It also accurately represents the minima in σw in the eastern equatorial Pacific cold tongue, and off the coasts of North and South America, associated with atmospheric stability and low sea surface temperature (Liu et al. 2019).
At 250 hPa, Wnet-prior slightly underestimates σw around mountain regions, that is, the Andes, Tibetan Plateau, and Himalayas, predicting weaker maxima than G5NR. This is possibly a result of the lack of spatial information in the input to Wnet-prior, which leads the ANN to underpredict the propagation of orographically induced gravity waves originating at the surface (McFarlane 1987; Dean et al. 2007; Barahona et al. 2017). This is suggested by tests (not shown) using a two-dimensional model (i.e., where each sample corresponds to an atmospheric column), which tended to better represent the peak σw in the upper troposphere. Such a model was deemed impractical to formulate as a parameterization since it is dependent on the vertical resolution of MERRA-2.
For the test set, Wnet-prior reproduces the G5NR predictions with a mean bias of −0.004 ± 0.05 m s−1. The slight underestimation in σw by the ANN results from the tendency of Wnet-prior to underestimate the vertical propagation of gravity waves compared to G5NR. The ANN has to learn the dynamics of wave propagation from the state vector at each level instead of from the whole atmospheric column. Multilayer perceptrons (MLPs) also have limited skill at elucidating spatial patterns in three-dimensional data. On the other hand, using a simple architecture eases its implementation in GCMs. By favoring the flexibility of the parameterization, we thus have subjected the ANN to a more challenging learning problem. Figure 4, however, shows that Wnet-prior is an accurate surrogate model of G5NR, providing a solid physical basis to the parameterization of σw.
Figure 5 compares the predictions of the Wnet ANN against observations for the test set (i.e., data not used during training), for all the sites of Table 1. For the different sites, Wnet reproduces the observed σw data with a normalized mean bias, Nmb, around ±15% (i.e., the mean bias divided by the mean of the observations).
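The normalized mean bias can be computed as, for example:

```python
import numpy as np

def normalized_mean_bias(pred, obs):
    """Nmb: mean bias of the predictions divided by the mean of the
    observations; a value of 0.15 corresponds to a +15% bias."""
    pred, obs = np.asarray(pred, dtype=float), np.asarray(obs, dtype=float)
    return np.mean(pred - obs) / np.mean(obs)
```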
Except for Wnet-prior, all the models shown in Fig. 5 represent reasonable parameterizations of σw. Wnet-prior tends to significantly underestimate the observed σw. This results from the limited ability of G5NR to explore the range of vertical wind velocities observed in nature. Even at the 7-km spatial resolution, the model misses a significant fraction of the observed variability, which is then inherited by Wnet-prior. Since there are different ways to develop the ANN parameterization, the other models of Table 2 aim to explore different aspects in which the CAT algorithm contributes to build a robust parameterization. “Obs-only” represents a direct approach, training on the observed data with no underlying physical constraints, whereas the “Transfer” model does not rely on adversarial training to ingest observations. Figure 5 shows that although reasonably reproducing the observed data, these two models tend to overpredict variability in σw, particularly for the radar sites [SGP (cirrus), MAN, LIM, and LEI]. The “EMD” model aims to test the impact of the function used to define the statistical distance between the observed and generated distributions.
Further evidence that Wnet represents well the observed statistics of σw is presented in Fig. 6, which shows the probability distribution function (PDF) of σw for the different models of Table 2. The agreement between Wnet and the observations is evident from the near-complete overlap between the black and red curves of Fig. 6. Quantitatively, out of the models of Table 2, Wnet has the lowest Kolmogorov–Smirnov statistic (the largest absolute difference between two cumulative distributions) calculated against observations (Wilks 2011). The positive effect of ingesting observed data within the parameterization is evidenced by comparing against the PDF predicted by Wnet-prior, which was trained on simulated data only. Without refinement by observational data, the PDF is much narrower and centered at about an order of magnitude lower σw than the measured values. The CAT algorithm thus significantly improves the accuracy of the predicted PDF by bringing it closer to the measured distribution.
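The two-sample Kolmogorov–Smirnov statistic used here can be sketched as the maximum absolute difference between the two empirical cumulative distributions:

```python
import numpy as np

def ks_statistic(x, y):
    """Two-sample Kolmogorov-Smirnov statistic: the largest absolute
    difference between the two empirical cumulative distributions."""
    grid = np.sort(np.concatenate([x, y]))
    cdf_x = np.searchsorted(np.sort(x), grid, side="right") / len(x)
    cdf_y = np.searchsorted(np.sort(y), grid, side="right") / len(y)
    return np.max(np.abs(cdf_x - cdf_y))
```

The statistic ranges from 0 (identical distributions) to 1 (fully disjoint distributions), so a lower value against the observations indicates a closer match to the observed PDF.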
There is a slight discrepancy between the Wnet and observed PDFs for σw > 1, explained by a bias in the onset of convection predicted by MERRA-2, which propagates to the σw prediction. This is further investigated by comparing the predicted PDF for individual sites against observations (Fig. A1). Across all sites, Wnet consistently outperforms all other models. There is some influence of target variable imbalance, as Wnet appears to perform better at sites with longer σW records. However, a more pronounced pattern is that the sites with the largest discrepancies between the predicted PDF and the observations are primarily located in the tropics (specifically MAO, PGH, and TWP). This suggests that the errors are introduced by biases in the prediction of convection by MERRA-2 in these regions, which may impact the accuracy of σw predictions in areas with complex atmospheric dynamics such as convective systems.
Application of the Obs-only, Transfer, and EMD models results in similar distributions (Fig. 6). They are narrower than the observed PDF and characterized by two modes, centered around σw ∼ 0.5 and σw ∼ 1.5 m s−1. A peak at σw ∼ 1.5 m s−1 is also present in the data, although it is more subtle; in fact, the observed PDF would be well approximated by a lognormal distribution, an observation made for the first time in this work. The apparent second mode at higher σw results from the MAO dataset. Because W at MAO is retrieved from the core of convective systems, it tends to show high σw values. The Obs-only, Transfer, and EMD models are strongly influenced by these high values since observations at moderate σw are lacking, leading to the predicted bimodality. Remarkably, only Wnet effectively captures the transition between the high σw induced by convective systems and the more moderate σw values observed in other regions.
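Since the observed PDF is well approximated by a lognormal distribution, its parameters follow directly from the moments of log σw. The sketch below illustrates the fit on synthetic data with hypothetical parameter values; the true values come from the retrievals and are not reproduced here:

```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic stand-in for a sigma_w sample (m/s).
sigma_w = rng.lognormal(mean=-0.7, sigma=0.6, size=20000)

# A lognormal fit only requires the mean and standard deviation of log(sigma_w).
log_s = np.log(sigma_w)
mu, sd = log_s.mean(), log_s.std()

# Fitted density on a grid, plus its median and mode.
x = np.linspace(0.05, 3.0, 200)
pdf = np.exp(-(np.log(x) - mu) ** 2 / (2.0 * sd**2)) / (x * sd * np.sqrt(2.0 * np.pi))
median = np.exp(mu)
mode = np.exp(mu - sd**2)  # always below the median for sd > 0
# Trapezoidal check that the density integrates to ~1 over the grid.
area = np.sum(0.5 * (pdf[1:] + pdf[:-1]) * np.diff(x))
```

A distribution of this form has a single mode with a heavy right tail, which is why a genuine second mode in the fitted models signals the influence of the MAO data rather than the bulk statistics.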
Figure 7 shows the global spatial distribution of σw predicted by the Wnet (left panels) and Obs-only (right panels) models for the test set, as in Fig. 4. Since no simulated data were used to train the Obs-only model, the comparison in Fig. 7 is a qualitative assessment of the effect of ingesting G5NR data into Wnet. In general, Wnet predicts σw about an order of magnitude higher than Wnet-prior; that is, Wnet accounts for the variability in W missing in G5NR and, by extension, in Wnet-prior. However, Wnet also inherits the spatial structure in σw imposed by the physical constraints of the atmospheric model. This is evident over the mountain ranges of Asia and North and South America, where Wnet (Fig. 7) and Wnet-prior (Fig. 4) display high σw at 500 and 250 hPa. Deep convection also leads to high σw in the tropical regions around the intertropical convergence zone (between 30°S and 30°N) and in the storm tracks of the Southern Hemisphere. This fine structure is largely missing in the predictions of the Obs-only model, which essentially lacks any significant effects of localized convection in the tropics at 500 hPa and largely misses the effect of orography on σw at 250 hPa. This strongly suggests that the spatial structure depicted by Wnet in Fig. 7 is introduced by the model physics inherited through the incorporation of Wnet-prior into the ANN (Fig. 2).
5. Discussion
A premise of this work is that robust parameterizations can be developed by targeting the observed PDF rather than by merely minimizing the difference between predictions and observations. Reproducing the observed PDF, as opposed to discrete value matching, makes training more resilient to experimental error and buffers the parameterization against skewness introduced by extreme, low-probability outlier events. Such events could be detrimental in atmospheric simulations, as their effects propagate to other parts of the system by modifying processes like cloud formation. Except for Wnet-prior, all the models tested represent plausible alternative parameterizations of σw (Fig. 5). Figure 6, however, shows that only Wnet reproduces the observed distribution of σw.
It is worth noting that even though EMD is adversarially trained, it approximates the PDF only marginally better than the Transfer and Obs-only models. Our implementation defines the EMD discretely over each minibatch, which may limit the variability to which the GAN is exposed. Additionally, the absence of spectral normalization in the EMD loss implementation might penalize the ANN too heavily when it explores a wider PDF. These factors likely contribute to the suboptimal performance of EMD compared to Wnet. Our tests, however, underscore the critical role of appropriate loss functions in the success of the GAN-based approach (Pan et al. 2020). The selection of a suitable functional form for the loss is facilitated by the generalized formulation of the GAN equations introduced in section 3b, which enables a more informed exploration of loss functions to improve the ANN's ability to capture and reproduce the observed PDF.
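For reference, the discrete, per-minibatch definition of the EMD mentioned above reduces, in one dimension, to the mean absolute difference between sorted samples. A minimal sketch (equal batch sizes are assumed; the batch values are synthetic):

```python
import numpy as np

def minibatch_emd(generated, observed):
    """One-dimensional earth mover's (Wasserstein-1) distance between two
    equally sized minibatches, via the sorted-sample formula."""
    g = np.sort(np.asarray(generated, dtype=float))
    o = np.sort(np.asarray(observed, dtype=float))
    assert g.size == o.size, "this discrete form assumes equal batch sizes"
    return float(np.mean(np.abs(g - o)))

rng = np.random.default_rng(2)
obs_batch = rng.lognormal(-0.7, 0.6, size=256)
gen_batch = obs_batch - 0.2  # a generator biased low by 0.2 m/s
print(minibatch_emd(gen_batch, obs_batch))  # ~0.2
print(minibatch_emd(obs_batch, obs_batch))  # 0.0
```

Because the distance is evaluated only over the samples in each minibatch, the gradient signal reflects a small slice of the full distribution, consistent with the limitation noted above.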
The proposed method does not solely depend on the loss function to generalize the behavior of the parameterization. Instead, it leverages physical principles encoded by Wnet to guide extrapolation beyond the domain of the observations. Wnet-prior, which is pretrained on G5NR data, remains frozen during the refinement step and serves as a valuable feature extractor, capturing essential patterns and physics-based information from the simulation. By integrating this knowledge into Wnet, we can guide the parameterization to follow the learned physics from G5NR while refining the predictions based on observational data. This approach allows us to effectively combine the strengths of both G5NR and observational data, leading to an enhanced parameterization that can generalize well beyond the observed domain.
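The freezing mechanism can be illustrated with toy linear stand-ins for the two networks (the real models are MLPs; all names and shapes here are illustrative only): the prior's weights are excluded from the update, so only the head learns from observations.

```python
import numpy as np

rng = np.random.default_rng(3)

# Frozen stand-in for Wnet-prior: a fixed map from the 14 state variables.
W_prior = rng.normal(size=(14, 1))
W_prior_before = W_prior.copy()

def prior(x):
    """Feature extractor pretrained on simulated data; never updated here."""
    return x @ W_prior

# Trainable head: sees the raw state concatenated with the prior's prediction.
w_head = 0.1 * rng.normal(size=(15, 1))
w_head_init = w_head.copy()

def wnet(x):
    features = np.concatenate([x, prior(x)], axis=1)  # 14 inputs + prior output
    return features @ w_head

# One gradient-descent step on a squared-error loss touches only w_head,
# so the physics encoded in W_prior is preserved.
x = rng.normal(size=(32, 14))
y = rng.normal(size=(32, 1))
features = np.concatenate([x, prior(x)], axis=1)
grad = features.T @ (wnet(x) - y) / x.shape[0]
w_head -= 0.01 * grad
```

Conceptually this corresponds to marking the pretrained submodel as non-trainable before assembling the larger network, as done for Wnet-prior inside Wnet.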
This is evident in Fig. 7, where the spatial distribution of σw predicted by Wnet shares many features with that shown in Fig. 4. The Obs-only model, on the other hand, tends to exhibit less spatial variability and is prone to predicting high values of σw. For example, at 500 hPa the Obs-only approach predicts a wide band of high σw covering most of the region between 60°S and 60°N. This may be a consequence of limited data, since only the MAO dataset has measurements at 500 hPa in the tropics. Wnet also predicts high σw in that region; however, it shows features associated with the presence of strong convection and a land–ocean contrast. As both features are evident in Wnet-prior (Fig. 4), it is likely that this structure is associated with underlying physical constraints inherited by Wnet.
The impact of orography on σw is much more evident in Wnet than in the Obs-only prediction. Encoding such an effect solely from ground-based data is challenging for an ANN, as orography remains fixed at each site. The partial display of orographic features in the Obs-only model may result from concatenating the surface state to each level. It is plausible that a deeper, more intricate architecture (e.g., based on convolutional layers) could learn the relationships provided by Wnet-prior directly from observational data. However, a sophisticated model may have limited applicability as a parameterization for GCMs due to its computational expense.
By design, σw predicted by Wnet is in good agreement with ground-based observations, as shown in Fig. 5. It is also within the range of in situ values reported from aircraft measurements (West et al. 2014). Beyond reproducing field campaign data, a state-dependent parameterization can be applied globally to predict the distribution of σw, as shown in Fig. 7. There are, however, few reports on the spatial distribution of σw, particularly near the surface: almost all previous work is based on field campaign data and in situ analyses. Nevertheless, the predicted σw (Fig. 7, 900 hPa) shows expected features, with higher values over land than over ocean (Peng et al. 2005; Morales and Nenes 2010) and the well-known effect of wind-driven turbulence on σw (Bogenschutz et al. 2013), as, for example, in the storm tracks of the Southern Hemisphere around 40°S, evident as well in the Obs-only model. At higher levels, the distribution of σw predicted by Wnet agrees with published theoretical studies on the effect of gravity waves and orography on wind variability (Dean et al. 2007; Joos et al. 2008; Barahona et al. 2017). Qualitatively, the global distribution of σw at 250 hPa shown in Fig. 7 closely resembles operational air turbulence products (Williams and Storer 2022; Sharman et al. 2006), raising the possibility that a real-time prediction of σw could complement the estimation of air turbulence indices.
6. Conclusions
This work presents a novel approach to estimate the spatial standard deviation in vertical wind velocity at scales typical of GCM simulations. The new parameterization results from the combination of global storm-resolving simulations, long-term observational data, and climate reanalysis products; in this way, it is constrained by both the physical model driving the GSRM and the observations. This is achieved using a two-step technique in which an ANN trained on the GSRM output is incorporated within a second, larger ANN trained on the observational data. The new parameterization uses the meteorological state (winds, temperature, and water concentration) and coarse metrics of turbulence (Richardson number and scalar momentum diffusivity) to predict σw at each grid cell. The model introduced here is suitable for use online within a GCM or offline, driven by the output of real-time numerical weather forecasts.
Inclusion of observational data was critical to the performance of the new parameterization. Measurements from 11 stations around the world were used to develop the ANN, including new radar-derived data from two European sites (LEI and LIM). The ANN reproduces these measurements and generalizes well outside the data manifold. Previous work has focused on upper-tropospheric statistics relevant to cirrus formation. Here, we extended the parameterization of σW to the surface, making it relevant not only to cloud formation but also to diagnosing mixing within the PBL (Santanello et al. 2007), and even useful as a diagnostic tool for air travel safety (Williams and Storer 2022).
In developing the parameterization, emphasis was placed on reproducing the observed PDF of σW. This was achieved by using a conditional GAN algorithm to train the ANN against observations. Compared to direct training against the observational data, the constrained adversarial training algorithm results in a robust estimation of σW that reproduces the observed statistics. At the same time, the ANN parameterization of σW inherits spatial structure from the global storm-resolving simulation that it likely could not learn from observational data alone.
The ANN parameterization was designed for host models operating at spatial resolutions coarser than approximately 25 km. As the resolution of the host model increases, the contribution of the parameterized σW to the total W variability is expected to diminish. However, the evident underestimation of σW by G5NR highlights the continued importance of a parameterization in most GSRMs. Scaling arguments (Barahona et al. 2017) could be employed to adapt the predictions of Wnet so that the parameterization remains effective across different spatial resolutions and can be seamlessly integrated into models operating at varying scales.
It would also be interesting to further elucidate the relative contributions of the observational data and the prior model to the final ANN. Beyond σW, the general approach presented here is suitable for studying and parameterizing other variables, such as cloud liquid and ice water, supercooled cloud fraction, and water vapor. Successful implementation of the parameterization would also rely on efficient Fortran libraries, and projects are already underway to address this need (Curcic 2019; Ott et al. 2020). Future work will look to advance these topics. Using the tools of deep learning, this work for the first time leverages vertical air velocity data from different sources, namely GSRMs, observations, and data assimilation, to generate a robust representation of subgrid-scale variability suitable for real-time and online atmospheric predictions.
Acknowledgments.
This work was supported by the NASA MEASURES Program WBS: 281945.02.31.04.39. K.H. Breen was supported by the NASA Postdoctoral Program Fellowship. The authors thank Moritz Hoffman for his input. The authors also thank Patrick Seifert and his team for the cloud radar data at Leipzig and Limassol. Resources supporting this work were provided by the NASA High-End Computing (HEC) Program through the NASA Center for Climate Simulation (NCCS) at Goddard Space Flight Center.
Data availability statement.
The GEOS-5 source code is available under the NASA Open Source Agreement at http://opensource.gsfc.nasa.gov/projects/GEOS-5/. All data generated in this work will be made publicly available through the NASA technical reports server (https://ntrs.nasa.gov) and PubSpace (https://www.nasa.gov/open/researchaccess/pubspace). The MERRA-2 Reanalysis and GEOS-5 nature run datasets are publicly available from https://disc.gsfc.nasa.gov/. Field campaign datasets were downloaded from the Atmospheric Radiation Measurement Archive at https://www.arm.gov/data/. Keras and Tensorflow libraries were obtained from https://keras.io/. Maps were created using the NCAR Command Language (version 6.6.2) software (2019). UCAR/NCAR/CISL/TDD: https://doi.org/10.5065/D6WD3XH5. All codes developed in this work are available upon request.
APPENDIX
Training and Optimization
Wnet-prior was implemented as a stack of fully connected layers in an MLP architecture (Goodfellow et al. 2016). Due to the massive size of the G5NR output (about 17 000 half-hourly files), we developed a custom subsampling technique: a set of files (about 3, encompassing approximately 7.5 × 10⁷ samples) was randomly selected without replacement from the G5NR output and processed for a few epochs (about 5), after which the entire training set was replaced. This adaptive approach, acting as a regularization method, allowed Wnet-prior to generalize the behavior of σW effectively and ensured robustness across varying environmental conditions. It also led to “jumps” in the loss each time a new set of files was loaded, although these generally receded within one epoch. Other approaches, such as generating ANN ensembles (Zhang and Ma 2012), may have benefits during training, but they come with significant computational costs, making them less practical for operational use. Although our method worked well for this specific problem, it may require further refinement to serve as a general regularization method. Nevertheless, it provides an efficient and robust way to parameterize σW using the vast G5NR dataset.
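The rotating-subset strategy can be sketched as follows; file names and counts are illustrative, and the placeholder comment stands in for the actual training call:

```python
import random

def file_subsets(files, subset_size, seed=0):
    """Yield successive subsets of the training files, drawn without
    replacement until the pool is exhausted, then reshuffled."""
    rng = random.Random(seed)
    pool = []
    while True:
        if len(pool) < subset_size:  # pool exhausted: reshuffle the full list
            pool = list(files)
            rng.shuffle(pool)
        subset, pool = pool[:subset_size], pool[subset_size:]
        yield subset

files = [f"g5nr_{i:05d}.nc4" for i in range(12)]  # stand-in for ~17000 files
subsets = file_subsets(files, subset_size=3)
for _ in range(4):  # four swaps cover every file exactly once
    subset = next(subsets)
    # samples = load(subset); model.fit(samples, epochs=5)  # placeholder
```

Sampling without replacement within each cycle guarantees that every file is visited before any file is revisited, which is what makes the loss "jumps" transient rather than systematic.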
Wnet-prior was trained using the Keras library with Tensorflow backend (Chollet et al. 2015). Optimization used the Adam algorithm (Kingma and Ba 2014). Hyperparameter optimization was carried out using the Keras tuner (Chollet et al. 2015); the search space is summarized in Table A1. Each configuration was run three times for 50 epochs with the same set of G5NR files, and the final configuration was selected based on the lowest mean validation loss across the three runs per trial.
Table A1. Parameters used during hyperparameter tuning for Wnet-prior. Optimal hyperparameters are shown in bold.
The final architecture selected for Wnet-prior used an MLP composed of five fully connected layers of 128 nodes each, with a single-node output layer (Fig. 2). Activation of the hidden layers used the leaky rectified linear unit activation function (Leaky ReLU; Maas et al. 2013). The input to the ANN was standardized using fixed global means and standard deviations from G5NR, calculated over 100 randomly selected half-hourly output files. Using a batch size of 2048 samples, convergence (in terms of
Refinement step
To train the cGAN, input from the MERRA-2 reanalysis was collocated in time and space with each of the datasets of Table 1 and used to drive the generator. MERRA-2 is highly constrained by conventional data assimilation techniques and represents the best approximation to the actual environmental state for each measurement. Optimization used the Adam algorithm (Kingma and Ba 2014) with binary cross entropy as the loss function. The leaky ReLU activation function was used for the hidden layers (Agarap 2018; Maas et al. 2013), and the output layer of the discriminator used sigmoidal activation. Dropout with a rate of 0.3 (discriminator only) was applied before the last hidden layer to avoid overfitting (Srivastava et al. 2014). The discriminator inputs a 15-dimensional vector (the 14 input variables plus σw) and can in principle be updated several times for each update of the generator; in this work, however, it is updated once per iteration. Allowing the discriminator to train multiple times per generator update generally degraded the model's performance; the exact reason remains unclear and requires further investigation.
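A minimal sketch of the discriminator described above (leaky-ReLU hidden layers, dropout at rate 0.3 before the last hidden layer, and a sigmoid output over the 15-dimensional input) is shown below; the layer sizes and initialization are illustrative, not those of the tuned model:

```python
import numpy as np

rng = np.random.default_rng(4)

def leaky_relu(x, alpha=0.2):
    return np.where(x > 0.0, x, alpha * x)

def discriminator(v, weights, drop_rate=0.3, training=True):
    """Forward pass returning P(sample comes from the observations)."""
    *hidden, (W_out, b_out) = weights
    h = v
    for i, (W, b) in enumerate(hidden):
        if training and i == len(hidden) - 1:  # dropout before last hidden layer
            mask = rng.random(h.shape) >= drop_rate
            h = h * mask / (1.0 - drop_rate)   # inverted-dropout scaling
        h = leaky_relu(h @ W + b)
    return 1.0 / (1.0 + np.exp(-(h @ W_out + b_out)))  # sigmoid output

def bce(p, label):
    """Binary cross entropy against a constant label (1 = real, 0 = generated)."""
    eps = 1e-7
    p = np.clip(p, eps, 1.0 - eps)
    return float(-np.mean(label * np.log(p) + (1.0 - label) * np.log(1.0 - p)))

sizes = [15, 64, 64, 1]  # 14 state variables + sigma_w -> hidden layers -> 1
weights = [(0.1 * rng.normal(size=(m, n)), np.zeros(n))
           for m, n in zip(sizes[:-1], sizes[1:])]
batch = rng.normal(size=(32, 15))
p = discriminator(batch, weights)
loss = bce(p, 1.0)  # discriminator loss on a batch labeled "real"
```

Setting `training=False` disables dropout, which is how the discriminator would be evaluated outside the adversarial update.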
To optimize the GAN, experiments were conducted exploring different settings for the distance function [Eqs. (3) and (4)], the prior model, the discriminator layers, the batch size, the learning rate, and the data augmentation factor (i.e., the number of times certain sites were repeated to balance the training set). Each model was trained for a fixed number of epochs (approximately 500), and the weights at the epoch with the lowest mean square error against observations were selected. Due to the nonsmooth nature of the cGAN loss, which depends on the interaction between the generator and discriminator, a formal parameter search was challenging; the selection of the best model was therefore guided by expert knowledge, considering not only the error against observations but also the overall behavior of the global distribution of σw. The resulting PDF for the selected model at each site is presented in Fig. A1.
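The checkpoint-selection rule above (keep the weights from the epoch with the lowest validation error) amounts to an argmin over the per-epoch record; the values below are made up purely for illustration:

```python
import numpy as np

def select_best_epoch(val_mse, states):
    """Return the epoch index with the lowest validation MSE and its weights."""
    best = int(np.argmin(val_mse))
    return best, states[best]

# Hypothetical per-epoch MSE against observations and saved weight snapshots.
mse = [0.90, 0.42, 0.55, 0.38, 0.61]
states = [f"weights_epoch_{i}" for i in range(len(mse))]
epoch, best_weights = select_best_epoch(mse, states)
# epoch == 3; best_weights == "weights_epoch_3"
```

Because the adversarial loss itself is nonsmooth, selecting on a held-out error rather than the final-epoch weights avoids committing to a transient state of the generator-discriminator interaction.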
REFERENCES
Agarap, A. F., 2018: Deep learning using rectified linear units (ReLU). arXiv, 1803.08375v2, https://doi.org/10.48550/arXiv.1803.08375.
Arjovsky, M., S. Chintala, and L. Bottou, 2017: Wasserstein generative adversarial networks. ICML'17: Proc. Int. Conf. on Machine Learning, Sydney, NSW, Australia, PMLR, 214–223, https://dl.acm.org/doi/abs/10.5555/3305381.3305404.
Barahona, D., A. Molod, J. Bacmeister, A. Nenes, A. Gettelman, H. Morrison, V. Phillips, and A. Eichmann, 2014: Development of two-moment cloud microphysics for liquid and ice within the NASA Goddard Earth Observing System Model (GEOS-5). Geosci. Model Dev., 7, 1733–1766, https://doi.org/10.5194/gmd-7-1733-2014.
Barahona, D., A. Molod, and H. Kalesse, 2017: Direct estimation of the global distribution of vertical velocity within cirrus clouds. Sci. Rep., 7, 6840, https://doi.org/10.1038/s41598-017-07038-6.
Bauer, P., A. Thorpe, and G. Brunet, 2015: The quiet revolution of numerical weather prediction. Nature, 525, 47–55, https://doi.org/10.1038/nature14956.
Berg, L. K., R. K. Newsom, and D. D. Turner, 2017: Year-long vertical velocity statistics derived from Doppler lidar data for the continental convective boundary layer. J. Appl. Meteor. Climatol., 56, 2441–2454, https://doi.org/10.1175/JAMC-D-16-0359.1.
Berthelot, D., T. Schumm, and L. Metz, 2017: BEGAN: Boundary Equilibrium Generative Adversarial Networks. arXiv, 1703.10717v4, https://doi.org/10.48550/arXiv.1703.10717.
Besombes, C., O. Pannekoucke, C. Lapeyre, B. Sanderson, and O. Thual, 2021: Producing realistic climate data with generative adversarial networks. Nonlinear Processes Geophys., 28, 347–370, https://doi.org/10.5194/npg-28-347-2021.
Beucler, T., M. Pritchard, S. Rasp, J. Ott, P. Baldi, and P. Gentine, 2021: Enforcing analytic constraints in neural networks emulating physical systems. Phys. Rev. Lett., 126, 098302, https://doi.org/10.1103/PhysRevLett.126.098302.
Bihlo, A., 2021: A generative adversarial network approach to (ensemble) weather prediction. Neural Networks, 139, 1–16, https://doi.org/10.1016/j.neunet.2021.02.003.
Billault-Roux, A.-C., G. Ghiggi, L. Jaffeux, A. Martini, N. Viltard, and A. Berne, 2023: Dual-frequency spectral radar retrieval of snowfall microphysics: A physics-driven deep-learning approach. Atmos. Meas. Tech., 16, 911–940, https://doi.org/10.5194/amt-16-911-2023.
Bogenschutz, P. A., A. Gettelman, H. Morrison, V. E. Larson, C. Craig, and D. P. Schanen, 2013: Higher-order turbulence closure and its impact on climate simulations in the Community Atmosphere Model. J. Climate, 26, 9655–9676, https://doi.org/10.1175/JCLI-D-13-00075.1.
Bühl, J., and Coauthors, 2013: LACROS: The Leipzig Aerosol and Cloud Remote Observations System. Proc. SPIE, 8890, 889002, https://doi.org/10.1117/12.2030911.
Chi, M., A. Plaza, J. A. Benediktsson, Z. Sun, J. Shen, and Y. Zhu, 2016: Big data for remote sensing: Challenges and opportunities. Proc. IEEE, 104, 2207–2219, https://doi.org/10.1109/JPROC.2016.2598228.
Chin, M., and Coauthors, 2002: Tropospheric aerosol optical thickness from the GOCART model and comparisons with satellite and sun photometer measurements. J. Atmos. Sci., 59, 461–483, https://doi.org/10.1175/1520-0469(2002)059<0461:TAOTFT>2.0.CO;2.
Chollet, F., and Coauthors, 2015: Keras. GitHub, https://github.com/fchollet/keras.
Colarco, P., A. da Silva, M. Chin, and T. Diehl, 2010: Online simulations of global aerosol distributions in the NASA GEOS-4 model and comparisons to satellite and ground-based aerosol optical depth. J. Geophys. Res., 115, D14207, https://doi.org/10.1029/2009JD012820.
Creswell, A., T. White, V. Dumoulin, K. Arulkumaran, B. Sengupta, and A. A. Bharath, 2018: Generative adversarial networks: An overview. IEEE Signal Process. Mag., 35, 53–65, https://doi.org/10.1109/MSP.2017.2765202.
Curcic, M., 2019: A parallel Fortran framework for neural networks and deep learning. ACM SIGPLAN Fortran Forum, New York, NY, Association for Computing Machinery, 4–21, https://dl.acm.org/doi/abs/10.1145/3323057.3323059.
Danabasoglu, G., and Coauthors, 2020: The Community Earth System Model version 2 (CESM2). J. Adv. Model. Earth Syst., 12, e2019MS001916, https://doi.org/10.1029/2019MS001916.
Daw, A., A. Karpatne, W. Watkins, J. Read, and V. Kumar, 2017: Physics-Guided Neural Networks (PGNN): An application in lake temperature modeling. arXiv, 1710.11431v3, https://doi.org/10.48550/arXiv.1710.11431.
Dean, S. M., J. Flowerdew, B. N. Lawrence, and S. D. Eckermann, 2007: Parameterisation of orographic cloud dynamics in a GCM. Climate Dyn., 28, 581–597, https://doi.org/10.1007/s00382-006-0202-0.
Fudeyasu, H., Y. Wang, M. Satoh, T. Nasuno, H. Miura, and W. Yanase, 2008: Global cloud-system-resolving model NICAM successfully simulated the lifecycles of two real tropical cyclones. Geophys. Res. Lett., 35, L22808, https://doi.org/10.1029/2008GL036003.
Gelaro, R., and Coauthors, 2015: Evaluation of the 7-km GEOS-5 Nature Run. NASA Tech. Rep. NASA/TM-2014-104606, Vol. 36, 305 pp., https://ntrs.nasa.gov/api/citations/20150011486/downloads/20150011486.pdf.
Gelaro, R., and Coauthors, 2017: The Modern-Era Retrospective Analysis for Research and Applications, version 2 (MERRA-2). J. Climate, 30, 5419–5454, https://doi.org/10.1175/JCLI-D-16-0758.1.
Gettelman, A., D. J. Gagne, C.-C. Chen, M. W. Christensen, Z. J. Lebo, H. Morrison, and G. Gantos, 2021: Machine learning the warm rain process. J. Adv. Model. Earth Syst., 13, e2020MS002268, https://doi.org/10.1029/2020MS002268.
Ghan, S. J., L. R. Leung, R. C. Easter, and H. Abdul-Razzak, 1997: Prediction of cloud droplet number in a general circulation model. J. Geophys. Res., 102, 21 777–21 794, https://doi.org/10.1029/97JD01810.
Giangrande, S. E., and Coauthors, 2016: Convective cloud vertical velocity and mass-flux characteristics from radar wind profiler observations during GoAmazon2014/5. J. Geophys. Res. Atmos., 121, 12 891–12 913, https://doi.org/10.1002/2016JD025303.
Goodfellow, I. J., J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, 2014: Generative adversarial nets. NIPS’14: Proc. 27th Int. Conf. on Neural Information Processing Systems, Montreal, QC, Canada, Association for Computing Machinery, 2672–2680, https://dl.acm.org/doi/10.5555/2969033.2969125.
Goodfellow, I. J., Y. Bengio, and A. Courville, 2016: Deep Learning. MIT Press, 800 pp.
Guo, Z., and Coauthors, 2014: A sensitivity analysis of cloud properties to CLUBB parameters in the Single-column Community Atmosphere Model (SCAM5). J. Adv. Model. Earth Syst., 6, 829–858, https://doi.org/10.1002/2014MS000315.
Iglesias-Suarez, F., P. Gentine, B. Solino-Fernandez, T. Beucler, M. Pritchard, J. Runge, and V. Eyring, 2023: Causally-informed deep learning to improve climate models and projections. arXiv, 2304.12952v3, https://doi.org/10.48550/arXiv.2304.12952.
Illingworth, A. J., and Coauthors, 2007: Cloudnet: Continuous evaluation of cloud profiles in seven operational models using ground-based observations. Bull. Amer. Meteor. Soc., 88, 883–898, https://doi.org/10.1175/BAMS-88-6-883.
IPCC, 2013: Climate Change 2013: The Physical Science Basis. Cambridge University Press, 1535 pp., https://doi.org/10.1017/CBO9781107415324.
Johnson, S. J., and Coauthors, 2019: SEAS5: The new ECMWF seasonal forecast system. Geosci. Model Dev., 12, 1087–1117, https://doi.org/10.5194/gmd-12-1087-2019.
Joos, H., P. Spichtinger, U. Lohmann, J.-F. Gayet, and A. Minikin, 2008: Orographic cirrus in the global climate model ECHAM5. J. Geophys. Res., 113, D18205, https://doi.org/10.1029/2007JD009605.
Judt, F., and Coauthors, 2021: Tropical cyclones in global storm-resolving models. J. Meteor. Soc. Japan, 99, 579–602, https://doi.org/10.2151/jmsj.2021-029.
Kalesse, H., and P. Kollias, 2013: Climatology of high cloud dynamics using profiling ARM Doppler radar observations. J. Climate, 26, 6340–6359, https://doi.org/10.1175/JCLI-D-12-00695.1.
Kärcher, B., and A. Podglajen, 2019: A stochastic representation of temperature fluctuations induced by mesoscale gravity waves. J. Geophys. Res. Atmos., 124, 11 506–11 529, https://doi.org/10.1029/2019JD030680.
Kingma, D. P., and J. Ba, 2014: Adam: A method for stochastic optimization. arXiv, 1412.6980v9, https://doi.org/10.48550/arXiv.1412.6980.
Laney, D., 2001: 3D data management: Controlling data volume, velocity and variety. META Group Research Note 6, 1 pp.
LeCun, Y., Y. Bengio, and G. Hinton, 2015: Deep learning. Nature, 521, 436–444, https://doi.org/10.1038/nature14539.
Leinonen, J., A. Guillaume, and T. Yuan, 2019: Reconstruction of cloud vertical structure with a generative adversarial network. Geophys. Res. Lett., 46, 7035–7044, https://doi.org/10.1029/2019GL082532.
Lenschow, D. H., M. Lothon, S. D. Mayor, P. P. Sullivan, and G. Canut, 2012: A comparison of higher-order vertical velocity moments in the convective boundary layer from lidar with in situ measurements and large-eddy simulation. Bound.-Layer Meteor., 143, 107–123, https://doi.org/10.1007/s10546-011-9615-3.
Liu, J., J. Tian, Z. Liu, T. D. Herbert, A. V. Fedorov, and M. Lyle, 2019: Eastern equatorial Pacific cold tongue evolution since the late Miocene linked to extratropical climate. Sci. Adv., 5, eaau6060, https://doi.org/10.1126/sciadv.aau6060.
Lopez-Gomez, I., Y. Cohen, J. He, A. Jaruga, and T. Schneider, 2020: A generalized mixing length closure for eddy-diffusivity mass-flux schemes of turbulence and convection. J. Adv. Model. Earth Syst., 12, e2020MS002161, https://doi.org/10.1029/2020MS002161.
Lopez-Gomez, I., C. Christopoulos, H. L. Langeland Ervik, O. R. Dunbar, Y. Cohen, and T. Schneider, 2022: Training physics-based machine-learning parameterizations with gradient-free ensemble Kalman methods. J. Adv. Model. Earth Syst., 14, e2022MS003105, https://doi.org/10.1029/2022MS003105.
Maas, A. L., A. Y. Hannun, and A. Y. Ng, 2013: Rectifier nonlinearities improve neural network acoustic models. Proc. 30th Int. Conf. on Machine Learning, Atlanta, GA, JMLR, 6 pp., https://ai.stanford.edu/~amaas/papers/relu_hybrid_icml2013_final.pdf.
McFarlane, N. A., 1987: The effect of orographically excited gravity wave drag on the general circulation of the lower stratosphere and troposphere. J. Atmos. Sci., 44, 1775–1800, https://doi.org/10.1175/1520-0469(1987)044%3C1775:TEOOEG%3E2.0.CO;2.
Mirza, M., and S. Osindero, 2014: Conditional generative adversarial nets. arXiv, 1411.1784v1, https://doi.org/10.48550/ARXIV.1411.1784.
Miyato, T., T. Kataoka, M. Koyama, and Y. Yoshida, 2018: Spectral normalization for generative adversarial networks. arXiv, 1802.05957v1, https://doi.org/10.48550/arXiv.1802.05957.
Molod, A., and Coauthors, 2020: GEOS-S2S Version 2: The GMAO high-resolution coupled model and assimilation system for seasonal prediction. J. Geophys. Res. Atmos., 125, e2019JD031767, https://doi.org/10.1029/2019JD031767.
Mooers, G., M. Pritchard, T. Beucler, J. Ott, G. Yacalis, P. Baldi, and P. Gentine, 2020: Assessing the potential of deep learning for emulating cloud superparameterization in climate models with real-geography boundary conditions. arXiv, 2010.12996v3, https://doi.org/10.48550/arXiv.2010.12996.
Morales, R., and A. Nenes, 2010: Characteristic updrafts for computing distribution-averaged cloud droplet number, and stratocumulus cloud properties. J. Geophys. Res., 115, D18220, https://doi.org/10.1029/2009JD013233.
Morrison, H., J. A. Curry, and V. I. Khvorostyanov, 2005: A new double-moment microphysics parameterization for application in cloud and climate models. Part I: Description. J. Atmos. Sci., 62, 1665–1677, https://doi.org/10.1175/JAS3446.1.
Newsom, R. K., C. Sivaraman, T. R. Shippert, and L. D. Riihimaki, 2019: Doppler lidar vertical velocity statistics value-added product. Rep. DOE/SC-ARM-TR-149, 22 pp., https://www.arm.gov/publications/tech_reports/doe-sc-arm-tr-149.pdf?id=1000.
Ott, J., M. Pritchard, N. Best, E. Linstead, M. Curcic, and P. Baldi, 2020: A Fortran-Keras deep learning bridge for scientific computing. Sci. Program., 2020, 8888811, https://doi.org/10.1155/2020/8888811.
Pan, Z., W. Yu, B. Wang, H. Xie, V. S. Sheng, J. Lei, and S. Kwong, 2020: Loss functions of Generative Adversarial Networks (GANs): Opportunities and challenges. IEEE Trans. Emerging Top. Comput. Intell., 4, 500–522, https://doi.org/10.1109/TETCI.2020.2991774.
Patnaude, R., M. Diao, X. Liu, and S. Chu, 2021: Effects of thermodynamics, dynamics and aerosols on cirrus clouds based on in situ observations and NCAR CAM6. Atmos. Chem. Phys., 21, 1835–1859, https://doi.org/10.5194/acp-21-1835-2021.
Peng, Y., U. Lohmann, and R. Leaitch, 2005: Importance of vertical velocity variations in the cloud droplet nucleation process of marine stratus clouds. J. Geophys. Res., 110, D21213, https://doi.org/10.1029/2004JD004922.
Pruppacher, H. R., and J. D. Klett, 1997: Microphysics of Clouds and Precipitation. 2nd ed. Kluwer Academic, 954 pp.
Putman, W. M., and M. Suarez, 2011: Cloud-system resolving simulations with the NASA Goddard Earth Observing System global atmospheric model (GEOS-5). Geophys. Res. Lett., 38, L16809, https://doi.org/10.1029/2011GL048438.
Putman, W. M., M. Suarez, and A. Trayanov, 2015: 1.5-km global cloud-resolving simulations with GEOS-5. NASA, https://gmao.gsfc.nasa.gov/research/science_snapshots/1.5km_cloud_simulation.php.
Radford, A., L. Metz, and S. Chintala, 2015: Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv, 1511.06434v2, https://doi.org/10.48550/arXiv.1511.06434.
Randles, C. A., and Coauthors, 2017: The MERRA-2 aerosol reanalysis, 1980 onward. Part I: System description and data assimilation evaluation. J. Climate, 30, 6823–6850, https://doi.org/10.1175/JCLI-D-16-0609.1.
Rasp, S., M. S. Pritchard, and P. Gentine, 2018: Deep learning to represent subgrid processes in climate models. Proc. Natl. Acad. Sci. USA, 115, 9684–9689, https://doi.org/10.1073/pnas.1810286115.
Reynolds, R. W., N. A. Rayner, T. M. Smith, D. C. Stokes, and W. Wang, 2002: An improved in situ and satellite SST analysis for climate. J. Climate, 15, 1609–1625, https://doi.org/10.1175/1520-0442(2002)015<1609:AIISAS>2.0.CO;2.
Rienecker, M. M., and Coauthors, 2008: The GEOS-5 Data Assimilation System—Documentation of Versions 5.0.1, 5.1.0, and 5.2.0. Tech. Memo. NASA/TM-2008-104606, Vol. 27, 97 pp., http://gmao.gsfc.nasa.gov/pubs/docs/Rienecker369.pdf.
Roh, W., M. Satoh, and C. Hohenegger, 2021: Intercomparison of cloud properties in DYAMOND simulations over the Atlantic Ocean. J. Meteor. Soc. Japan, 99, 1439–1451, https://doi.org/10.2151/jmsj.2021-070.
Röttenbacher, J., 2021: Further development of an algorithm to determine cirrus cloud dynamics. M.S. thesis, Institute for Meteorology, Leipzig University, 159 pp.
Satoh, M., B. Stevens, F. Judt, M. Khairoutdinov, S.-J. Lin, W. M. Putman, and P. Düben, 2019: Global cloud-resolving models. Curr. Climate Change Rep., 5, 172–184, https://doi.org/10.1007/s40641-019-00131-0.
Schmidhuber, J., 2015: Deep learning in neural networks: An overview. Neural Networks, 61, 85–117, https://doi.org/10.1016/j.neunet.2014.09.003.
Seinfeld, J. H., and Coauthors, 2016: Improving our fundamental understanding of the role of aerosol cloud interactions in the climate system. Proc. Natl. Acad. Sci. USA, 113, 5781–5790, https://doi.org/10.1073/pnas.1514043113.
Sharman, R., C. Tebaldi, G. Wiener, and J. Wolff, 2006: An integrated approach to mid- and upper-level turbulence forecasting. Wea. Forecasting, 21, 268–287, https://doi.org/10.1175/WAF924.1.
Shi, X., and X. Liu, 2016: Effect of cloud-scale vertical velocity on the contribution of homogeneous nucleation to cirrus formation and radiative forcing. Geophys. Res. Lett., 43, 6588–6595, https://doi.org/10.1002/2016GL069531.
Srivastava, N., G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, 2014: Dropout: A simple way to prevent neural networks from overfitting. J. Mach. Learn. Res., 15, 1929–1958, https://dl.acm.org/doi/abs/10.5555/2627435.2670313.
Stevens, B., and Coauthors, 2019: DYAMOND: The Dynamics of the Atmospheric General Circulation Modeled on Non-hydrostatic Domains. Prog. Earth Planet. Sci., 6, 61, https://doi.org/10.1186/s40645-019-0304-z.
Stull, R. B., 1988: An Introduction to Boundary Layer Meteorology. Kluwer Academic, 666 pp.
Sullivan, S. C., D. Lee, L. Oreopoulos, and A. Nenes, 2016: Role of updraft velocity in temporal variability of global cloud hydrometeor number. Proc. Natl. Acad. Sci. USA, 113, 5791–5796, https://doi.org/10.1073/pnas.1514039113.
Tan, I., and D. Barahona, 2022: The impacts of immersion ice nucleation parameterizations on Arctic mixed-phase stratiform cloud properties and the Arctic radiation budget in GEOS-5. J. Climate, 35, 4049–4070, https://doi.org/10.1175/JCLI-D-21-0368.1.
Terai, C. R., M. S. Pritchard, P. Blossey, and C. Bretherton, 2020: The impact of resolving subkilometer processes on aerosol-cloud interactions of low-level clouds in global model simulations. J. Adv. Model. Earth Syst., 12, e2020MS002274, https://doi.org/10.1029/2020MS002274.
West, R. E. L., P. Stier, A. Jones, C. E. Johnson, G. W. Mann, N. Bellouin, D. Partridge, and Z. Kipling, 2014: The importance of vertical velocity variability for estimates of the indirect aerosol effects. Atmos. Chem. Phys., 14, 6369–6393, https://doi.org/10.5194/acp-14-6369-2014.
Wilks, D. S., 2011: Statistical Methods in the Atmospheric Sciences. 3rd ed. Elsevier, 676 pp.
Willard, J., X. Jia, S. Xu, M. Steinbach, and V. Kumar, 2020: Integrating physics-based modeling with machine learning: A survey. arXiv, 2003.04919v4, https://doi.org/10.48550/arXiv.2003.04919.
Williams, P. D., and L. N. Storer, 2022: Can a climate model successfully diagnose clear-air turbulence and its response to climate change? Quart. J. Roy. Meteor. Soc., 148, 1424–1438, https://doi.org/10.1002/qj.4270.
Zeng, Y., J.-L. Wu, and H. Xiao, 2021: Enforcing imprecise constraints on generative adversarial networks for emulating physical systems. Commun. Comput. Phys., 30, 635–665, https://doi.org/10.4208/cicp.OA-2020-0106.
Zhang, C., and Y. Ma, 2012: Ensemble Machine Learning: Methods and Applications. Springer, 332 pp.
Zhu, J.-Y., T. Park, P. Isola, and A. A. Efros, 2017: Unpaired image-to-image translation using cycle-consistent adversarial networks. Proc. IEEE Int. Conf. on Computer Vision, Venice, Italy, Institute of Electrical and Electronics Engineers, 2242–2251, https://doi.org/10.1109/ICCV.2017.244.