Parameterizing Raindrop Formation Using Machine Learning

Azusa Takeishi, Laboratoire d’Aérologie, UPS/CNRS, Toulouse, France

and
Chien Wang, Laboratoire d’Aérologie, UPS/CNRS, Toulouse, France

Open access

Abstract

Raindrop formation processes in warm clouds mainly consist of condensation and collision–coalescence of small cloud droplets. Once raindrops form, they can continue growing through collection of cloud droplets and self-collection. In this study, we develop novel emulators to represent raindrop formation as a function of various physical or background environmental conditions by using a sophisticated aerosol–cloud model containing 300 droplet size bins and machine learning methods. The emulators are then implemented in two microphysics schemes in the Weather Research and Forecasting Model and tested in two idealized cases. The simulations of shallow convection with the emulators show a clear enhancement of raindrop formation compared to the original simulations, regardless of the scheme in which they were embedded. On the other hand, the simulations of deep convection show a more complex response to the implementation of the emulators, in terms of the changes in the amount of rainfall, due to the larger number of microphysical processes involved in the cloud system (i.e., ice-phase processes). Our results suggest the potential of emulators to replace the conventional parameterizations, which may allow us to improve the representation of physical processes at an affordable computational expense.

Significance Statement

Formation of raindrops marks a critical stage in cloud evolution. Accurate representations of raindrop formation processes require detailed calculations of cloud droplet growth processes. These calculations are often not affordable in weather and climate models as they are computationally expensive due to their complex dependence on cloud droplet size distributions and dynamical conditions. As a result, simplified parameterizations are more frequently used. In our study we trained machine learning models to learn raindrop formation rates from detailed calculations of cloud droplet evolutions in 1000 parcel-model simulations. The implementation of the developed models or the emulators in a weather forecasting model shows a change in the total rainfall and cloud characteristics, indicating the potential improvement of cloud representations in models if these emulators replace the conventional parameterizations.

© 2024 American Meteorological Society. This published article is licensed under the terms of the default AMS reuse license. For information regarding reuse of this content and general copyright information, consult the AMS Copyright Policy (www.ametsoc.org/PUBSReuseLicenses).

Takeishi’s current affiliation: SPEC Inc., Boulder, Colorado

Corresponding author: Azusa Takeishi, azusa.takeishi@aero.obs-mip.fr

1. Introduction

Among a number of microphysical processes in clouds, the formation of raindrops is one of the most fundamental processes that determine the overall characteristics of clouds and their impacts on weather and climate. While the formation of millimeter-size raindrops can take place via a few different microphysical processes, their very first formation in warm clouds is predominantly via the growth of micrometer-size cloud droplets. In particular, two processes are essential for the growth of cloud droplets: condensation and collision–coalescence. The former is particularly important for relatively small droplets, whereas the latter dominates for larger droplets (Rogers and Yau 1989). For instance, cloud droplets would first need to grow by condensation up to about 20–30 μm, above which collision–coalescence dominates (Rogers and Yau 1989; Xue et al. 2008). In bulk or modal microphysical models where cloud droplets and raindrops are clearly separated and their size distributions are represented by probability distribution functions, the droplet-to-raindrop conversion is often referred to as autoconversion. Under the same assumption on the separation between cloud droplets and raindrops in bulk models, accretion and self-collection refer to the collection of cloud droplets by raindrops and that of raindrops by raindrops, respectively. These processes are parameterized in most numerical models except for those using bin microphysical models or those without a separation of cloud droplets and raindrops (e.g., Kogan and Belochitski 2012; Igel et al. 2022).

Hsieh et al. (2009) compared eight different parameterizations for the autoconversion process and evaluated them against values from physically based calculations using observational data from two field campaigns. As expected, they found a large difference in the predicted autoconversion rates among the parameterizations, as well as a consistency that varied with the field campaign and the targeted variable. More recently, Lee and Baik (2017) summarized four major types of parameterizations for the autoconversion process in models. Among the four, those obtained by solving the kinetic collection equation (KCE) or approximating its calculations would theoretically provide the most “accurate” autoconversion rates, although there may be exceptions (e.g., Seifert and Rasp 2020) and such computationally expensive calculations are often not affordable in weather forecasting models, let alone in climate models. Lee and Baik (2017) introduced a new KCE-based parameterization for the process, which provided promising results that closely resemble those of a direct KCE solver.
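For reference, the KCE (also known as the stochastic collection equation) for the drop number distribution n(x, t) over drop mass x can be written in one common form as

    ∂n(x, t)/∂t = (1/2) ∫_0^x K(x − x′, x′) n(x − x′, t) n(x′, t) dx′
                  − n(x, t) ∫_0^∞ K(x, x′) n(x′, t) dx′,

where K(x, x′) is the collection kernel; the first term represents the gain of drops of mass x through coalescence of smaller drops, and the second term represents the loss of drops of mass x through collection by other drops. Flux methods such as that of Bott (1998), used later in this study, solve a discretized form of this equation.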

While the pursuit of a more accurate autoconversion parameterization continues in modeling studies, the role of turbulence in enhancing the autoconversion process has long been discussed. As cloud droplets mostly grow in turbulent environments, any potential turbulence impacts need to be taken into account in addition to the KCE. For instance, turbulence can potentially impact droplets’ condensational growth by creating heterogeneous supersaturation conditions that may allow some droplets to grow faster than others, resulting in a wider droplet spectrum that better matches observations (Rogers and Yau 1989). Xue et al. (2008) summarized that turbulence can enhance collision–coalescence by 1) changing relative velocities of droplets, 2) clustering them, 3) altering their settling rates, and 4) enhancing the collision efficiency. They compared four turbulent collision kernels that include some of these turbulence effects, as opposed to a geometric or gravitational collision kernel, and found a strong enhancement of droplet growth by turbulence and, as a result, a faster formation of drizzle drops. Chen et al. (2016) used direct numerical simulations (DNSs), which can accurately track the growth of cloud droplets over a centimeter-scale domain, based on which they introduced a new collision kernel that includes the turbulence effects. Likewise, Chen et al. (2018a) introduced turbulent collision efficiencies based on their DNS results. They later found that the narrowing of drop size distributions by condensation sets up a condition for an effective enhancement of collision–coalescence by turbulence (Chen et al. 2018b).

Over the past few years, the applicability of machine learning (ML) techniques to atmospheric science research has been progressively explored. Some studies used simulations of cloud-resolving models (CRMs) as a “truth” and trained ML models with results from CRMs in order to emulate the subgrid-scale processes in climate models, where high-resolution simulations are mostly not affordable (e.g., Krasnopolsky et al. 2013; Gentine et al. 2018; Rasp et al. 2018). The resultant emulators in these studies successfully reproduced some fine-scale features at a lower computational cost than running CRMs. These studies have highlighted the potential for climate simulations that are computationally affordable yet as accurate as CRMs, concurrently with the ongoing developments and efforts toward long-term global CRMs over the past few decades. Likewise, some other studies focused on improving the predictions of a part of the climate system via ML, such as typhoon tracks (Rüttgers et al. 2019), SST in the tropics (Zheng et al. 2020), and apparent sources of heat and moisture in the tropics (Brenowitz and Bretherton 2018). O’Gorman and Dwyer (2018) used ML to emulate a parameterization of moist convection and showed the applicability of their methods and the utility of their emulator in a climate model, as it was able to reproduce the climate simulation with the moist-convection parameterization fairly well. ML methods may be especially useful for the prediction of severe weather events, as summarized by McGovern et al. (2017). For example, Mostajabi et al. (2019) and Kamangir et al. (2020) applied ML methods to predict the occurrence of lightning within local areas from either observed or simulated meteorological fields. Similarly, Lagerquist et al. (2017) explored the predictability of strong winds from convective storms based on observational datasets, comparing multiple ML methods or emulator structures. Herman and Schumacher (2018) also predicted extreme precipitation over the United States from ensemble model data via ML, which indeed outperformed two ensemble forecasts.

In recent years, applications of ML methods have been extended to calculations of cloud microphysical processes. Processes related to raindrop formation have been of particular interest due to their importance in determining the overall cloud development and the total amount of rainfall. Seifert and Rasp (2020), for instance, used the Monte Carlo superdroplet method (Shima et al. 2009) to solve the KCE, and the results were used to train an ML model for each of four warm-rain microphysical processes individually. They showed that the ML-based predictions had a reasonable accuracy, as compared to existing parameterizations for the corresponding processes, although combining their ML-based components to solve the ordinary differential equations that describe the hydrometeors’ time evolution did not outperform certain existing parameterizations. In Chiu et al. (2021), in situ droplet size distribution (DSD) data from a field campaign were used to initialize KCE calculations for obtaining autoconversion and accretion rates, which were later machine learned. The resultant emulators predicted both rates quite well, and subsequently a simplified parameterization for each process was developed, with a finding of an unexpected dependency of autoconversion rates on drizzle number concentrations Nr. A recent study by Rodríguez Genó and Alfonso (2022) utilized ML for predicting total moment tendencies during collision–coalescence processes, using as inputs six parameters that characterize DSDs composed of two lognormal size distributions. Each ML model trained for each order of moment attained a good prediction skill and was in turn used for deriving DSDs. The predictions by the emulators outperformed those by an existing bulk parameterization, especially when the temporal evolutions of DSDs were compared. Gettelman et al. (2021) trained ML models with raindrop production rates from bin-microphysical calculations and then implemented them in a climate model. They showed the high accuracy of the predictions by the emulators in comparison to the original bin-model calculations; zonal-mean climate variables in the ML simulations were shown to follow those in the bin-model simulations quite well, although the bin-model results did not always match the observations for some variables at certain latitude ranges. One of the largest advantages of using such a method is the high accuracy attained at a much lower computational cost. In their simulations, the cost of running an ML simulation was equivalent to that of running a default climate simulation.

This study continues the exploration of ML methods to ameliorate cloud microphysical parameterizations in models, particularly in regional-scale models. We use rigorous 300-bin microphysical calculations to train ML models, which are later implemented in a weather forecasting model. Unlike most of the studies above, our bin-model simulations are not initialized with cloud droplets that follow a commonly adopted gamma distribution. Instead, these simulations start from a subsaturated condition in which two types of aerosols are later activated in a rising parcel to form cloud droplets with a realistic size distribution. Furthermore, any warm clouds with positive updrafts are the target of this work, including both convective and stratiform clouds, regardless of their developmental stages, locations, or spatial scales, as long as the conditions fall within the training range. This wide range of targets is one of the strengths and unique aspects of this study and is also the reason why the emulators require not only droplet variables but also dynamical variables such as updrafts, supersaturations, and eddy dissipation rates. Such a mesoscale application of ML methods enables us to extend our understanding of the applicability of ML to future weather forecasting. Therefore, the primary objective of this work is to utilize ML methods to establish emulators that can in turn be used for the microphysical process calculations of warm clouds under a wide range of background conditions in a weather forecasting model.

The rest of the paper is organized as follows: section 2 presents the methods employed in this study that include the settings of the bin-model simulations (section 2a), the ML procedures (section 2b), and the mesoscale simulations combining the two (section 2c). Section 3 shows and discusses the results from each step in section 2, separated into sections 3a–c, followed by the conclusions in section 4.

2. Methods

In this section, we describe the settings of the parcel model simulations (section 2a), the ML procedures that use the simulation results (section 2b), and the simulation settings for a weather forecasting model in which the emulators are implemented (section 2c).

a. Parcel-model simulations

The parcel model named Pyrcel (Rothenberg and Wang 2016) calculates the time evolution of aerosol size distributions in a rising air parcel where the bin microphysical calculations are made. The parcel rises at a constant vertical velocity set by a user, which eventually creates a supersaturated condition for aerosols to grow in size through condensation. We have added the collision–coalescence process to the original Pyrcel model, following the KCE framework described by Bott (1998). The calculation of a terminal fall velocity is based on Beard (1976). To account for the enhancement of collisions by turbulence effects (e.g., Xue et al. 2008), the hydrodynamic kernel used in Bott (1998) [see their Eq. (26)] was replaced by the “turbulent collision kernel” in Chen et al. (2016), and the collision efficiency by Long (1974) used in Bott (1998) was replaced by the “turbulent collision efficiency” in Chen et al. (2018a) for the size range of 10 ≤ r ≤ 25 μm (solid lines in Fig. 1d). To have a smooth transition into the original kernel outside of this range, the results from Chen et al. (2016, 2018a) and those from the original kernel were linearly interpolated within a buffer zone (5 ≤ rcollector ≤ 30 μm and rcollected/rcollector ≥ 0.1; dotted lines in Fig. 1d). Droplet and raindrop breakup is not included in Pyrcel.
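As an illustration of the buffer-zone treatment described above, the following minimal Python sketch (not the modified Pyrcel code) blends a gravitational kernel with a turbulent kernel linearly in collector radius only; the exact ramp shape, the treatment of the radius-ratio dimension, and the helper functions k_grav and k_turb are assumptions for illustration.

    def blended_kernel(r_collector, r_collected, k_grav, k_turb):
        # Radii in micrometers; k_grav and k_turb return the gravitational
        # (Hall/Long) and turbulent (Chen et al. 2016, 2018a) kernel values.
        kg = k_grav(r_collector, r_collected)
        kt = k_turb(r_collector, r_collected)
        if 10.0 <= r_collector <= 25.0:
            return kt                        # turbulent kernel fully applies
        if 5.0 <= r_collector < 10.0:        # lower buffer: ramp from Hall to turbulent
            w = (r_collector - 5.0) / 5.0
            return (1.0 - w) * kg + w * kt
        if 25.0 < r_collector <= 30.0:       # upper buffer: ramp back to Hall
            w = (30.0 - r_collector) / 5.0
            return (1.0 - w) * kg + w * kt
        return kg                            # outside the buffer: original kernel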

Fig. 1.

Collection kernel (10−9 m3 s−1) of Hall (1980) on (a) linear axes and (b) log axes. (c) As in (b), but with the collection kernel by Chen et al. (2016, 2018a). (d) The difference between (b) and (c). The solid and dashed lines in (d) indicate the size ranges where the collection kernel is explicitly defined by Chen et al. (2016, 2018a) and is linearly interpolated to smoothly transition into the Hall’s kernel, respectively. (e) Collection efficiencies of Hall (1980, see their Table 1) are mapped; colored dots indicate the “turbulent collection efficiencies” in Chen et al. (2018a). Temperature, pressure, relative humidity, and eddy dissipation rate here were set to 283.15 K, 900 hPa, 99%, and 300 cm2 s−3, respectively. The plot in (a) is comparable to Fig. 4 of Lee and Baik (2017).


Note that the original Pyrcel model does not require a separation between the size bins of aerosols and those of droplets, as it tracks the temporal evolution of the former into the latter regardless of their categories (i.e., bins change their sizes over time as they grow). This serves a purpose if condensational growth is the only process included, whereas it raises a problem when collision–coalescence is included as (i) the method by Bott (1998) requires logarithmically equidistant mass grids/bins that are fixed in size, and also (ii) the turbulent collision kernel and efficiency are not available for very small particles (i.e., r < 5 μm) as collision–coalescence plays a minor role for them. We have therefore modified the original model so that droplets are separated on logarithmically equidistant mass bins, following xk+1 = αxk in Bott [1998, their Eq. (4)], where xk is the mass of the kth bin; these 300 droplet bins have an arbitrarily chosen scaling factor of 5.7 (i.e., α = 2^(1/5.7)), which means that the mass doubles after every 5.7 bins. A mass distribution function g [g m−3 (ln r)−1] was calculated by gk = 3xk²nk, where xk is a droplet mass in bin k and nk is its number distribution function, following Eq. (2) in Bott (1998). After condensational growth, the mass of newly sized droplets is distributed to one or two bins according to the Courant number [Eq. (13) in Bott 1998]. Once the parcel supersaturation exceeds aerosols’ critical supersaturation and the radius exceeds 1 μm, the mass of those aerosol particles gets transferred to 300 droplet bins that go through collision–coalescence besides condensation. It should be noted that while the calculations by Bott (1998) are mass conserving, there is some inevitable mass loss or gain upon the transfer of mass from “aerosol” bins to “droplet” bins; the original Pyrcel model, which only considers condensation, rigorously “tracks” aerosol growth by using moving size bins (i.e., all the bins increase in size as a parcel rises) while the number concentration in each bin remains unchanged. These bins cannot be directly used for collision–coalescence, as fixed and logarithmically equidistant grids are required in Bott (1998). Therefore, when the mass of activated aerosols gets transferred to fixed grids, there is some mass gain/loss to conserve the number of droplets. Given the fine bin widths of the 300 droplet bins, however, the mass loss/gain is expected to be small.
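A minimal Python sketch (not the modified Pyrcel code) of the fixed, logarithmically equidistant mass grid and the mass distribution function described above is given below; the mass of the smallest bin, x0, is left as a free parameter rather than a value taken from the model.

    import numpy as np

    def droplet_mass_grid(x0, n_bins=300, doubling=5.7):
        # x_{k+1} = alpha * x_k with alpha = 2^(1/5.7), so the mass doubles
        # after every 5.7 bins.
        alpha = 2.0 ** (1.0 / doubling)
        return x0 * alpha ** np.arange(n_bins)      # bin masses

    def mass_distribution(x_k, n_k):
        # g_k = 3 xk^2 nk, following Eq. (2) of Bott (1998), where n_k is the
        # number distribution function in bin k.
        return 3.0 * x_k**2 * n_k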

In our parcel-model simulations, where the initial saturation ratio S was set to 99%, 10 variables characterized the initial conditions: temperature T, pressure P, vertical velocity w, eddy dissipation rate ε, and three variables that determine the initial lognormal size distribution (a number concentration N, a geometric mean radius, and a geometric standard deviation σ) for each of two types of aerosols, namely, sulfate (hygroscopicity κsul = 0.54, 100 bins) and sea salt (κsea = 1.2, 40 bins). Note that w and ε are constant throughout a simulation.

To select 1000 sets of values of these 10 variables for running 1000 simulations, we utilized the Latin hypercube sampling method (LHS; McKay et al. 1979). In this method, the cumulative distribution function (CDF) of each variable is evenly split into 1000 ranges, one CDF value is randomly sampled from each of the 1000 CDF ranges, and the variable value corresponding to each sampled CDF value is taken as one sample; in this fashion 1000 values are sampled per variable. As an example, Fig. 2 shows how five samples are selected for each variable; the magenta values are randomly selected CDF values and the orange values are the variable values with the corresponding CDF values. We assumed a uniform distribution (e.g., Fig. 2a) for T and P within the ranges of 5°–20°C and 800–1000 hPa, respectively, an exponential distribution (e.g., Fig. 2b) for w, and truncated normal distributions (e.g., Fig. 2c) for the other seven parameters. These distributions and ranges were arbitrarily chosen so that commonly observed conditions for a warm cloud base are well represented. Note that the use of LHS assumes that each of the 10 variables follows its own distribution function independently of the other parameters. This assumption may disregard some covariance among parameters that may exist in reality.
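The following Python sketch illustrates the LHS procedure for a few of the variables (not the sampling code actually used); the exact parameters of the exponential and truncated normal distributions shown here are assumptions for illustration.

    import numpy as np
    from scipy import stats

    def lhs_column(dist, n, rng):
        # Split the CDF into n equal slices, draw one uniform value per slice,
        # shuffle to decorrelate the variables, and invert the CDF.
        u = (np.arange(n) + rng.uniform(size=n)) / n
        rng.shuffle(u)
        return dist.ppf(u)

    rng = np.random.default_rng(0)
    n = 1000
    T = lhs_column(stats.uniform(loc=278.15, scale=15.0), n, rng)    # 5-20 degC, uniform
    P = lhs_column(stats.uniform(loc=800.0, scale=200.0), n, rng)    # 800-1000 hPa, uniform
    w = lhs_column(stats.expon(scale=1.0), n, rng)                   # updraft (m s-1), assumed scale
    eps = lhs_column(stats.truncnorm(a=0.0, b=3.0, loc=0.0, scale=200.0), n, rng)  # cm2 s-3, assumed
    samples = np.column_stack([T, P, w, eps])    # one row of initial conditions per simulation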

Fig. 2.

Examples of how five samples are chosen by the LHS method for variables with different distribution functions: (left) temperature with a uniform distribution, (center) updraft velocities with an exponential distribution, and (right) eddy dissipation rates with a truncated normal distribution. Blue and green lines show the cumulative (left axis) and probability (right axis) distribution functions, respectively. The LHS method evenly splits a CDF into the number of samples needed (i.e., five in this example), as indicated by the black divisions in the figure. Within each CDF range, a random value is chosen (magenta, y axis), which determines the sample value (orange, x axis).


A parcel keeps rising for 30 min after passing the water saturation level (S = 1.0), unless the parcel reaches T = −12°C, in which case the simulation is terminated earlier. This temperature limit is imposed because ice nucleation processes, which are not included in our parcel-model simulations, may actively take place below −12°C in reality. Within the 30 min, raindrop formation may or may not take place, depending on the aerosol and dynamical conditions. All of these simulations, however, provide important data to train the ML models for a variety of conditions.

b. Machine learning

Each of the 1000 parcel-model simulations outputs dynamical and droplet variables every second, for up to 30 min. As a result, we obtained roughly three-quarters of a million (i.e., 777 656) sets of variables in total, which were split into a training set (∼60%), a validation set (∼7%), and a testing set (∼33%).

The ML models that we trained have a multilayer perceptron structure (MLP; MLPRegressor or MLPClassifier in the scikit-learn Python library; https://scikit-learn.org/), shown in Fig. 3; the values of the input features are passed to the “neurons” in the next hidden layer after being multiplied by “weight” values. The passed values are summed at each of the neurons in the hidden layer, although a neuron is “turned off” (i.e., gets a value of 0) if the sum is negative, owing to the activation function called the rectified linear unit (ReLU; Nair and Hinton 2010). Similarly, the resultant values are passed to the neurons in the next hidden layer, whose values are in turn passed to the single output neuron. As is clear from this structure, the trained MLP reduces to a short sequence of weighted sums and piecewise-linear activations, which was relatively easy to implement in a Fortran-based atmospheric model later on. A stochastic gradient-based optimization algorithm (Adam; Kingma and Ba 2014) was used for optimizing the weights, with a constant learning rate of 0.001. As we used early stopping, the training iterations were terminated if the validation score (i.e., r2) did not improve by 10−4 or more for five consecutive iterations before reaching the maximum of 100 iterations. The size of each training batch was 200, which is the default size in the Python code used.
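For clarity, the scikit-learn configuration implied by the description above can be sketched as follows (a minimal example, not the training script itself); X_train and y_train stand for the normalized input features and a regression target such as the logarithm of dqr/dt.

    from sklearn.neural_network import MLPRegressor

    model = MLPRegressor(
        hidden_layer_sizes=(20, 20),   # two hidden layers of 20 neurons each (see below)
        activation="relu",
        solver="adam",
        learning_rate="constant",
        learning_rate_init=0.001,
        batch_size=200,
        max_iter=100,
        early_stopping=True,
        tol=1e-4,                      # stop if the validation score improves by < 1e-4
        n_iter_no_change=5,            # ... for five consecutive iterations
    )
    # model.fit(X_train, y_train)
    # r2 = model.score(X_test, y_test)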

Fig. 3.

The MLP structure of the emulators in this study. The number at each neuron is an intercept parameter that gets added at the end of previous calculations, and the colors of the lines show the weights that previous neurons get multiplied by. These values are specifically for emulator 1 that predicts rain mass production rates dqr/dt.


In this study, values from 20 input features were passed to two hidden layers, each of which has 20 neurons. This structure was chosen partly because its simplicity facilitated the subsequent implementation in WRF and partly because of its high prediction scores. The 20 input features are T; P; S; ε; w; droplet mass (g kg−1) within the radius ranges of 1–3, 3–5, 5–10, 10–20, 20–30, 30–40, 40–50, 50–100, 100–200, 200–1000, 1000–1500, 1500–2000, and >2000 μm; and the numbers of cloud droplets ≥ 50 μm (r = 50–100 μm) and of certain sizes of raindrops (r = 200–1000 μm) (kg−1). Through analyses of correlations among input features, the number of input features was reduced from 53 to 20, as many of the 53 input features (i.e., 5 dynamical variables + qc, qr, nc, and nr each in 12 bins) are strongly correlated with each other. Even with this reduction of input features, the prediction score for dqr/dt remained higher than 0.99, while a significant reduction of computational cost can be expected upon the implementation in WRF later on. Table 1 lists the input features and their minimum and maximum values. As is clear from the table, we define cloud droplets as drops with radius r < 100 μm and raindrops as those with r ≥ 100 μm. This threshold value was chosen based on Rogers and Yau (1989), although it varies among studies. Note that values less than 10−30 were all set to 10−30 as smaller values are not accurately read in when the emulators are implemented in the Fortran-based WRF later on. Aside from these constrained values and the minimum T of −12°C mentioned above, the minimum and maximum values in Table 1 were either the smallest/largest quantities reached in the 1000 Pyrcel simulations (e.g., droplet variables) or the values largely determined by the initial conditions chosen by LHS (e.g., T, ε). We took the logarithm of the latter 15 input features. All input features were normalized by subtracting the minimum value and dividing the result by the data range (maximum minus minimum) before they were used for training the ML models. The targets are a rain mass production rate dqr/dt, a change rate of rain number concentrations dNr/dt, and a change rate of cloud droplet number concentrations dNc/dt, all only through the processes accounted for in the Pyrcel model. Since dNr/dt and dNc/dt can be either positive or negative, we first utilized an MLP classifier to predict a two-class outcome (+1: positive, −1: negative) and subsequently used an MLP regressor for predicting the logarithm of the absolute values. If dqr/dt was smaller than 10−15, all three rates were set to 10−15, which was arbitrarily chosen according to the minimum allowed rates in the microphysics schemes described in the next subsection. Therefore, five ML models (two classifiers and three regressors) were trained for the three targets (dqr/dt, dNr/dt, and dNc/dt) with the same sample data and the same MLP structure.
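A minimal Python sketch of the preprocessing and of the sign/magnitude split described above is given below (not the training code itself); the array names, the boolean mask log_mask marking the 15 log-transformed features, and the per-feature limits x_min and x_max (taken from the training data, cf. Table 1) are placeholders.

    import numpy as np

    FLOOR = 1.0e-30     # values below this are clipped, as described above

    def preprocess(X_raw, log_mask, x_min, x_max):
        X = np.maximum(X_raw, FLOOR)             # clip extremely small values
        X = np.where(log_mask, np.log10(X), X)   # logarithm of the 15 log-scaled features
        return (X - x_min) / (x_max - x_min)     # min-max normalization per feature

    def split_signed_target(dndt):
        # A classifier predicts the sign; a regressor predicts the log of the magnitude.
        sign = np.where(dndt >= 0.0, 1, -1)
        magnitude = np.log10(np.maximum(np.abs(dndt), FLOOR))
        return sign, magnitude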

Table 1.

Input features and their minimum and maximum values in the training dataset.


c. Implementation in WRF

The emulators were hard-coded in Fortran as explicit sequences of weighted sums and activations defined by the trained weights and intercept parameters. Note that the absolute values of some weights were smaller than 10−30, in which case these weight values were set to zero in order to avoid incorrect reading of the extremely small values by Fortran. This modification had only a negligible impact on the prediction skills of the emulators, as shown later. This ML Fortran code was implemented in the two-moment Morrison scheme (Morrison et al. 2009) and also in the two-moment Thompson scheme (Thompson et al. 2008) in the Weather Research and Forecasting (WRF) Model, version 3.8.1 (Skamarock et al. 2008). As the cloud droplet number concentration is fixed in both microphysics schemes (set to 100 cm−3), only three of the emulators were used, for predicting dqr/dt and dNr/dt. The necessary input features, listed in Table 1, were calculated from the cloud droplet and raindrop size distributions in the microphysics schemes. Using these values, the hard-coded emulators calculated dqr/dt and dNr/dt. In our method, therefore, the emulators do not directly modify the size distributions, but only do so indirectly via the changes in dqr/dt and dNr/dt.
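As an illustration of what the hard-coded emulators evaluate at run time, a minimal numpy sketch of the two-hidden-layer forward pass is given below (the actual implementation is in Fortran); W1, b1, W2, b2, w_out, and b_out stand for the trained weights and intercepts exported from scikit-learn (model.coefs_ and model.intercepts_), with weights of magnitude below 10−30 set to zero as described above.

    import numpy as np

    def emulator_forward(x, W1, b1, W2, b2, w_out, b_out):
        # x: the 20 normalized input features of Table 1.
        h1 = np.maximum(0.0, x @ W1 + b1)    # first hidden layer, ReLU
        h2 = np.maximum(0.0, h1 @ W2 + b2)   # second hidden layer, ReLU
        return h2 @ w_out + b_out            # single output neuron (e.g., log of dqr/dt)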

In the Morrison scheme, the emulator for dqr/dt (emulator 1) replaced the sum of autoconversion (PRC in the code) and droplet accretion by rain (PRA), whereas the emulators for dNr/dt (emulators 2 and 3) replaced the sum of autoconversion (NPRC1), droplet accretion by rain (NPRA), and raindrop self-collection/breakup (NRAGG). The upper limit on ice number concentration that was originally in place in the Morrison scheme was commented out. In the Thompson scheme, emulator 1 replaced the sum of autoconversion (prr_wau) and droplet collection by rain (prr_rcw), and emulators 2 and 3 replaced the sum of autoconversion (pnr_wau) and raindrop self-collection (pnr_rcr). These variables are summarized in Table 2. Note that condensation was not replaced, as these schemes utilize saturation adjustment. In both schemes, the emulators were utilized only when the input features fell within the trained ranges, which are listed in Table 1; outside these ranges, the original parameterizations for the above microphysical processes were used.

Table 2.

Summary of microphysical processes replaced by the emulators in the WRF simulations.


It should be noted that the direct implementation of the Python-based Lagrangian-grid parcel model into the Fortran-based Eulerian WRF Model was not done in the current work, as it is technically beyond the scope of this study. Therefore, the emulators were compared with the original parcel-model simulations outside of the WRF Model (i.e., offline) instead, as will be shown in section 3b. The results need to be interpreted with the caveat that the high performance of an ML model offline may not necessarily lead to a high performance online within another parent model, as was shown by Seifert and Rasp (2020). Although Seifert and Rasp (2020) used two or more ML models within one prognostic equation, which is different from this study, their major finding was that the rigorous calculations by emulators may not outperform those by an existing parameterization, most likely because the parent two-moment model inherently does not describe the initial droplet evolution well in some cases and requires additional physically based assumptions, which may not necessarily be included in data-based emulators. This point needs to be kept in mind throughout this study.

We have run WRF simulations for two idealized cases in order to compare the ML-based simulations with the original runs.

The first idealized case is a simulation of shallow convection, originally provided to WRF users as a test case for large-eddy simulations in the boundary layer (em_les). The model top is at 2 km, and the microphysical processes involved are predominantly warm-rain processes. The horizontal resolution is set to 100 m with 39 vertical levels over a 99 × 99 grid. Turbulent motions are triggered by random perturbations in the lower troposphere. We have run four simulations in total with different microphysics schemes: original Morrison, ML-based Morrison, original Thompson, and ML-based Thompson.

The second idealized case simulates a supercell storm, which is also publicly provided to WRF users (em_quarter_ss). The simulations are initialized with the vertical profiles of temperature and winds from Weisman and Klemp (1982). To trigger the convection, a temperature perturbation is added in the lower troposphere at the center of the domain. Owing to the initial wind profile with a quarter-circular vertical wind shear, the convection turns into a supercell that eventually splits into two. This quick test case was run for 3 h, even though the system starts leaving the domain after about 2 h due to the open boundary conditions. The detailed settings of the simulations follow the defaults: the horizontal resolution is 2 km with 40 vertical levels, 41 grid points in both the x and y directions, no Coriolis force is included, and radiation is turned off. As in the shallow convection case, four simulations were run for this case as well.

In both cases, the turbulent dissipation rate ε passed to the emulators was set to a constant 200 cm2 s−3, because this value would otherwise need to be calculated from the turbulent kinetic energy, which is neither calculated in nor passed to the microphysics schemes.

3. Results

This section presents the results and findings from the 1000 parcel-model simulations (section 3a), ML procedures (section 3b), and those from the idealized WRF simulations (section 3c).

a. Parcel-model simulations

The inclusion of the collision–coalescence process in the Pyrcel model can significantly broaden the size distribution of cloud droplets in some cases. Figure 4 shows an example evolution of droplet size distributions over time in the Pyrcel model starting from identical conditions, where one simulation (Fig. 4, top) includes only condensational growth and the other (Fig. 4, bottom) additionally includes collision–coalescence. It is evident that raindrops (i.e., r ≥ 100 μm) form in the latter simulation within 20 min or so, whereas small droplets in the condensation-only simulation have a hard time growing into larger drops and keep accumulating around the same size over time. Thus, the combination of the two processes, condensation and collision–coalescence, is indispensable for rapid warm-rain formation. The initial conditions for the 1000 parcel-model simulations, characterized by the 10 variables (section 2a), were chosen by the LHS method and follow the predefined distributions well (Fig. 5): a uniform distribution for T (Fig. 5a) and P (Fig. 5b), an exponential distribution for w (Fig. 5c), and truncated normal distributions for the other seven variables (Figs. 5d–j), with the binned distributions of both the training (magenta) and testing (green) data closely following the predefined probability distribution functions (PDFs; red). This indicates a successful sampling of initial-condition values according to the given distributions for all 10 variables. In these 1000 simulations, a wide range of rain production rates dqr/dt was obtained from wide ranges of the 20 input features (see Fig. A1 for their distributions).

Fig. 4.

Example evolution of drop size distributions with (top) condensation only (i.e., the original Pyrcel model) and (bottom) both condensation and collision–coalescence. The initial conditions are identical for the two.


Fig. 5.

Cumulative (blue, left axis) and probability (red, right axis) distribution functions of 10 variables that characterize the initial conditions of the parcel-model simulations: (a) temperature T (K), (b) pressure P (Pa), (c) updraft speed w (m s−1), (d) eddy dissipation rate ε (cm2 s−3), (e) number of sulfate aerosols Nsulfate (cm−3), (f) number of sea salt aerosols Nseasalt (cm−3), (g) mean radius of sulfate aerosols rsulfate (μm), (h) mean radius of sea salt aerosols rseasalt (μm), (i) standard deviation of the sulfate size distribution σsulfate (unitless), and (j) standard deviation of the sea salt size distribution σseasalt (unitless). Magenta and green distributions in the background show the actual probability distribution functions of training and testing datasets, respectively.


b. Machine learning

Using the wide range of rain production rates calculated by the parcel model, emulator 1 was trained for dqr/dt, emulators 2 and 3 for the signs and absolute values of dNr/dt, respectively, and emulators 4 (signs) and 5 (absolute values) for dNc/dt. All of the emulators exhibit prediction scores or accuracies of 0.985 or higher for the testing datasets when compared to the parcel-model simulations: here a score refers to a coefficient of determination R2 for the prediction of absolute values (i.e., emulators 1, 3, and 5), whereas an accuracy is used for evaluating the prediction of signs (i.e., emulators 2 and 4). Figure 6a shows the comparison of predicted and calculated dqr/dt, where the highest frequency of occurrence is found along the x = y line, indicating an excellent performance of emulator 1, with a prediction score of 0.998. As a reference, predictions for the same testing datasets by other commonly used bulk parameterizations, three of which include only autoconversion while the other two take into account both autoconversion and accretion, are shown in Figs. 6b–f. It is clear from this figure that the prediction of dqr/dt by emulator 1 is by far the closest to the results of the bin model, even though the parameterizations in Figs. 6b–d are only for the autoconversion process rather than all of the droplet growth processes; those in Figs. 6e and 6f include both autoconversion and accretion. All of the parameterizations in Figs. 6b–f had a hard time predicting very small values, and the three parameterizations in Figs. 6b–d predicted the autoconversion rate to be zero (or 10−15) even when the bin model predicted a wide range of rates, as indicated by the colored vertical line at x = 10−15. The predictions of the signs of dNr/dt are very accurate (0.985 in accuracy), as shown in Fig. 6g, and so are the predictions of their absolute values (0.996 in R2). As a result, the overall prediction of dNr/dt shows a high accuracy as well (Fig. 6h). Although not used for the implementation in WRF in the next subsection, emulators 4 and 5 for dNc/dt also exhibit high scores of 0.992 (in accuracy for signs) and 0.989 (in R2 for absolute values), respectively. Note that almost identical scores were attained even when some weights, whose absolute values are smaller than 10−30, were set to zero in order to avoid passing extremely small values to the WRF Fortran code, where they cannot be accurately read in.

Fig. 6.

Frequency of occurrence (%) of predicted (x axis) and calculated (y axis) dqr/dt (g kg−1 s−1) whose predictions were made by (a) emulator 1, (b) Kessler (1969), (c) Manton and Cotton (1977), (d) Liu and Daum (2004), (e) Khairoutdinov and Kogan (2000), and (f) Seifert and Beheng (2001) with υ = 0 for the testing dataset. For (b)–(d), the mass of cloud droplets (i.e., r ≥ 1 μm) was used for the calculations. For (e) and (f), masses of both cloud droplets and raindrops were needed, and the separation between the two was set at 40 μm, following Seifert and Beheng (2001). (g) Raw counts of (from left to right) true positive (T+), true negative (T−), false positive (F+), and false negative (F−) of the emulator prediction of the signs of droplet number change rates dNr/dt (kg−1 s−1) for the testing dataset. (h) As in (a), but for dNr/dt. Note the difference in the axis ranges as dNr/dt can be negative: the blue range represents T−, yellow for F−, green for F+, and red for T+. Some weight parameters, whose absolute values are smaller than 10−30, were set to zero in the emulators’ predictions in this figure.


Although emulator 1 predicts dqr/dt the most closely to the Pyrcel calculations, Fig. 6 shows that all the other parameterizations predict dqr/dt values that are larger by orders of magnitude; for instance, when the emulator predicts dqr/dt to be ∼10−10, the Kessler parameterization may predict it to be on the order of ∼10−3. This difference stems from the fact that the Pyrcel model is a Lagrangian-grid model in which the cloud evolution within a parcel is tracked or followed, whereas the other parameterizations are typically used in Eulerian models, such as WRF, in which the cloud evolution at a spatially fixed point is calculated. Grid point values in Eulerian models typically represent average values of multiple or numerous individual parcels that mix and coexist within a certain space. It is therefore understandable that a significantly wider range of dqr/dt is predicted by the emulator, particularly on the smaller end (the larger end is limited by the amount of water vapor in the atmosphere). To use the Lagrangian-grid-based emulator in the Eulerian WRF Model, in the following subsection we “activate” the emulator only when the predicted dqr/dt exceeds 10−6 g kg−1 s−1. Below this threshold, the original calculation of dqr/dt by the WRF parameterization is used. This threshold value was chosen because the majority of the parameterization-predicted dqr/dt values (Figs. 6b–f; x axis) fall above it.
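The activation logic can be summarized by the following minimal sketch (in Python for readability; the actual implementation is in the Fortran microphysics code), where x_min and x_max are the trained ranges of Table 1 and original_rate is the rate from the scheme's conventional parameterization.

    DQR_THRESHOLD = 1.0e-6   # g kg-1 s-1

    def select_rain_production(features, x_min, x_max, emulator_rate, original_rate):
        # Use the emulator only when all inputs lie within the trained ranges
        # and the predicted rate exceeds the activation threshold.
        in_range = all(lo <= f <= hi for f, lo, hi in zip(features, x_min, x_max))
        if in_range and emulator_rate > DQR_THRESHOLD:
            return emulator_rate
        return original_rate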

c. WRF simulations with ML-based parameterizations

The simulated evolution and characteristics of the clouds clearly depend on the microphysics scheme. Figures 7a and 7e show the temporal evolution of the surface rainfall in the shallow convective case, which remains extremely low in the Morrison run (Fig. 7a, blue) compared to the amount in the Thompson run (Fig. 7e, blue). As these are idealized simulations, comparison with observations is not possible, and hence which microphysics scheme is more accurate cannot be assessed here; doing so would require real-data simulations of real cases. However, it is clear that the use of different microphysics schemes, which signifies the use of completely different sets of parameterizations for tens of microphysical processes in clouds, influences the cloud characteristics quite significantly, while the overall development of the modeled cloud itself is largely controlled by the dynamical conditions, which were identical for all the simulations.

Fig. 7.

(a) Time evolution of domain-mean precipitation rate (mm h−1, solid, left axis) and accumulated precipitation (mm, dashed, right axis) in the simulations with the original Morrison scheme (blue) and the Morrison scheme using the emulators (red) for the shallow convective case. (b) Horizontal distribution of 3-h accumulated precipitation (mm) with the original Morrison scheme, (c) the same with the emulators, and (d) the difference between the two. (e)–(h) As in (a)–(d), but for the simulations using the Thompson scheme.


When the ML-based parameterization is used in those schemes, surface rainfall is produced immediately (Figs. 7c,g), likely owing to the turbulence-driven enhancement of collision–coalescence among small droplets that is expected from the use of the turbulent collision kernel (Chen et al. 2016, 2018a) and ε of 200 cm2 s−3. In the Thompson simulation with the ML parameterization, the rainfall commences earlier (Fig. 7e) than in Morrison-ML (Fig. 7a). It is probable that this difference in precipitation timing between the Morrison and Thompson runs stems from the different calculations of shape parameters for the droplet size distributions in the two schemes, which may have led to different rain formation rates for a given cloud mass. Figure 8 shows horizontally averaged liquid cloud and rain mass concentrations for all four simulations. In both schemes, a reduction in cloud mass (Figs. 8a–c,g–i) and an increase in rain mass (Figs. 8d–f,j–l) are seen when the ML-based parameterization is used. These results suggest that the ML-based parameterization facilitated raindrop formation in the shallow convection simulations. When it comes to the accuracy of the results, however, the evaluation requires a comparison with observations, as stated above.

Fig. 8.

Time evolutions of horizontal mean (a),(b),(g),(h) cloud and (d),(e),(j),(k) rain mass concentrations (g m−3) in (a),(d) the original Morrison, (b),(e) the modified Morrison, (g),(j) the original Thompson, and (h),(k) the modified Thompson simulations, for the shallow convective case. (right) The differences between the left and center columns.


As for the supercell case, the results are more complex. Unlike the first case, the cloud development in this case involves ice-phase processes that play a crucial role in the storm evolution, invigoration, and resultant surface rainfall. Therefore, changes in the droplet-to-rain processes introduced by the ML parameterization may not be reflected in surface rainfall as straightforwardly as in the first case. Figure 9 shows the temporal evolution of the rainfall and the horizontal distributions of the total accumulated rainfall. Although a reduction of rainfall is seen in the Morrison run with the ML parameterization (Fig. 9a), the difference is very small in the Thompson runs until the storm starts leaving the simulation domain at around 2 h (Fig. 9e). When the horizontal pattern of the surface rainfall is compared, an overall reduction is seen in the Morrison run (Figs. 9b–d), whereas both a reduction and an increase are seen in the Thompson run (Figs. 9f–h).

Fig. 9.

As in Fig. 7, but for the supercell case.


We now look more closely into the different responses of the simulated clouds to the ML-based parameterization. Figure 10 shows the horizontally averaged mean mass concentrations of hydrometeors in the Morrison runs. According to this figure, the response of the modeled supercell to the ML parameterization is relatively straightforward: liquid cloud mass drastically increases (Figs. 10a–c), which is likely related to the reduction in rain mass (Figs. 10d–f) as fewer cloud droplets are converted to raindrops. The cloud droplets that do not rain out via the warm-rain process remain in the cloud system and then freeze at higher altitudes in the atmosphere, which explains the increased ice mass (Figs. 10j–l). This change in ice mass, however, does not seem to impact larger frozen hydrometeors as much (Figs. 10m–r). Thus, the reduction in the amount of surface rainfall in the Morrison scheme with the ML parameterization (Fig. 9) can be explained by the slower conversion of cloud droplets into raindrops. This tendency is opposite to what was found in the first idealized case, which highlights the case dependence of the impacts of the ML-based parameterization.

Fig. 10.

Time evolutions of horizontal mean concentrations of (a),(b) cloud mass (g m−3), (d),(e) rain mass (g m−3), (g),(h) raindrop number (L−1), (j),(k) ice mass (g m−3), (m),(n) snow mass (g m−3), and (p),(q) graupel mass (g m−3) in (left) the original Morrison simulation and (center) the Morrison simulation with the emulators for the supercell case. (right) The differences between the left and center columns.


As for the Thompson scheme, the response of the supercell is more complex. Figure 11, equivalent to Fig. 10 but for the Thompson runs, shows that liquid cloud and ice masses are barely impacted by the ML-based parameterization (Figs. 11a–c,j–l). On the other hand, raindrop number concentrations are substantially reduced by the use of the ML parameterization (Fig. 11i). Likely due to this reduction in raindrop number concentrations, graupel mass is reduced (Figs. 11p,r), even though this reduction in graupel is not simply reflected in the surface rainfall (note the difference in the temporal pattern in Figs. 11f,r). As mentioned earlier, the rainfall seems to decrease and increase in different parts of the supercell with the Thompson scheme (Figs. 9f–h). Such complex structural changes cannot be captured by the horizontal averages in Fig. 11. Thus, the reduced raindrop number and graupel mass likely play different roles in different parts of the storm, resulting in different responses of the surface rainfall.

Fig. 11.

As in Fig. 10, but for the simulations with the original and modified Thompson schemes.


Since the motivation for utilizing such an ML method is to obtain the results of bin-model calculations at a high computational efficiency, the computational costs required by these ML-based schemes are noted here. For the shallow convective case (3 h, Δt = 1 s), the total simulation time was approximately 18 min with ∼0.10 s per time step on average in the original simulations (both the Morrison and Thompson schemes), which increased to approximately 35 and 45 min of total simulation time and 0.19 and 0.25 s per time step on average in the Morrison-ML and Thompson-ML runs, respectively. In the supercell case (3 h, Δt = 12 s), the total run time was only a few minutes in all the simulations, although the time per time step increased from ∼0.14 s (original Morrison and Thompson) to 0.25 s (Morrison-ML) and 0.30 s (Thompson-ML). Note that these specific lengths of time depend on the simulation configurations and the number of computational processors used. Therefore, the computational time increased by a factor of about 1.7–1.9 (Morrison-ML) and 2.1–2.6 (Thompson-ML) compared to the original simulations. Considering the computational cost of utilizing a bin-microphysics scheme, even for replacing only a few microphysical processes, this increase in computational time is reasonable, especially over a domain that includes hundreds or thousands of grid points. Means to further reduce this computational burden will be sought in future studies.

4. Conclusions

The present study has applied ML methods to derive parameterizations of cloud microphysical processes, particularly those for raindrop formation. The microphysical dataset came from a set of 1000 bin-model simulations of these processes in a parcel model where the activation of aerosols was explicitly calculated. These simulations output drop size distributions and the resultant raindrop formation rates (dqr/dt and dNr/dt) in rising air parcels every second. Five ML models have been trained in total: one for dqr/dt, two for dNr/dt, and two for dNc/dt. The emulators attained high prediction scores of ≥0.985 in replicating the offline (i.e., outside of WRF) bin-model calculations in Pyrcel. The resultant emulators for dqr/dt and dNr/dt have then been tested in WRF Model simulations with prescribed Nc, where the conventional calculations of raindrop formation were replaced by the emulators in a case of shallow convection and in a supercell case. From these simulations, we have found that 1) raindrop formation was enhanced by the emulators in warm clouds, particularly under the shallow convective condition, and 2) the response of deep convection to the emulators was heavily scheme-dependent, as the changes in the warm part of the cloud can lead to different chains of microphysical rate changes in different schemes, especially in the cold part of the cloud, which contributes significantly to the overall surface rainfall.

Through this study, it has been demonstrated that emulators can mimic the calculations of bin models and can be implemented into a weather model at a much lower computational cost than bin models. If more microphysical processes are represented by emulators, the overall accuracy of the simulated cloud characteristics may increase. Some microphysical processes, however, may not be well represented even by bin-model calculations and require further process-based studies before the application of ML (e.g., secondary ice production). The accuracy of the simulations with the ML-based calculations against observations needs to be evaluated through real-data WRF simulations in the future.

Acknowledgments.

This work was done as a part of the Make Our Planet Great Again project. The work was funded by l’Agence Nationale de la Recherche (ANR) of France under the Programme d’Investissements d’Avenir (ANR-18-MOPGA-003 EUROACE) and co-funded by the French region of Occitanie. This work was granted access to the high-performance computing (HPC) resources of l’Institut du Développement et des Ressources en Informatique Scientifique (IDRIS) provided by le Grand Équipement National de Calcul Intensif (GENCI) of France, under the allocations for Grants A0110110967 and A0090110967. The computation of the work was also performed using the HPC resources of the French regional supercomputer CALMIP under a grant for Project P18025. We appreciate the technical support provided by D. Rothenberg for running the Pyrcel model. We also thank S. Chen for the discussion on the turbulent collision kernel and efficiencies in Chen et al. (2016, 2018a). Last, we thank the three anonymous reviewers for their constructive comments and suggestions that helped us significantly improve the paper.

Data availability statement.

The Pyrcel model is publicly available on its website at https://pyrcel.readthedocs.io/en/latest/index.html. The WRF Model is also publicly available on the University Corporation for Atmospheric Research (UCAR) website at https://www2.mmm.ucar.edu/wrf/users/download/get_source.html. The specific modifications made to the source codes of the Pyrcel model as well as the modified codes of the WRF Model and the WRF simulation data are available at https://doi.org/10.5281/zenodo.7919545.

APPENDIX

Input Features from the Training and Testing Datasets

This appendix provides an additional figure (Fig. A1) that is not essential but would complement readers’ understanding of the article.

Fig. A1.

PDFs of input features from the training (blue) and testing (red) datasets. (f)–(t) Samples smaller than 10−30 were all set to 10−30 as the Fortran code was not able to accurately read in smaller values; the numbers on the plots show the percentages of values smaller than 10−30. Note that (d) updraft w and (e) eddy dissipation rate ε show slightly different distributions from those in Figs. 5c and 5d, even though they remain constant throughout a simulation. This is because each of the 1000 simulations produced a different number of output samples, depending on how long it lasted; therefore, these PDFs differ slightly from those of the sampled initial values in Figs. 5c and 5d.


REFERENCES

  • Beard, K. V., 1976: Terminal velocity and shape of cloud and precipitation drops aloft. J. Atmos. Sci., 33, 851–864, https://doi.org/10.1175/1520-0469(1976)033<0851:TVASOC>2.0.CO;2.
  • Bott, A., 1998: A flux method for the numerical solution of the stochastic collection equation. J. Atmos. Sci., 55, 2284–2293, https://doi.org/10.1175/1520-0469(1998)055<2284:AFMFTN>2.0.CO;2.
  • Brenowitz, N. D., and C. S. Bretherton, 2018: Prognostic validation of a neural network unified physics parameterization. Geophys. Res. Lett., 45, 6289–6298, https://doi.org/10.1029/2018GL078510.
  • Chen, S., P. Bartello, M. K. Yau, P. A. Vaillancourt, and K. Zwijsen, 2016: Cloud droplet collisions in turbulent environment: Collision statistics and parameterization. J. Atmos. Sci., 73, 621–636, https://doi.org/10.1175/JAS-D-15-0203.1.
  • Chen, S., M. K. Yau, and P. Bartello, 2018a: Turbulence effects of collision efficiency and broadening of droplet size distribution in cumulus clouds. J. Atmos. Sci., 75, 203–217, https://doi.org/10.1175/JAS-D-17-0123.1.
  • Chen, S., M. K. Yau, P. Bartello, and L. Xue, 2018b: Bridging the condensation–collision size gap: A direct numerical simulation of continuous droplet growth in turbulent clouds. Atmos. Chem. Phys., 18, 7251–7262, https://doi.org/10.5194/acp-18-7251-2018.
  • Chiu, J. C., C. K. Yang, P. J. van Leeuwen, G. Feingold, R. Wood, Y. Blanchard, F. Mei, and J. Wang, 2021: Observational constraints on warm cloud microphysical processes using machine learning and optimization techniques. Geophys. Res. Lett., 48, e2020GL091236, https://doi.org/10.1029/2020GL091236.
  • Gentine, P., M. Pritchard, S. Rasp, G. Reinaudi, and G. Yacalis, 2018: Could machine learning break the convection parameterization deadlock? Geophys. Res. Lett., 45, 5742–5751, https://doi.org/10.1029/2018GL078202.
  • Gettelman, A., D. J. Gagne, C.-C. Chen, M. W. Christensen, Z. J. Lebo, H. Morrison, and G. Gantos, 2021: Machine learning the warm rain process. J. Adv. Model. Earth Syst., 13, e2020MS002268, https://doi.org/10.1029/2020MS002268.
  • Hall, W. D., 1980: A detailed microphysical model within a two-dimensional dynamic framework: Model description and preliminary results. J. Atmos. Sci., 37, 2486–2507, https://doi.org/10.1175/1520-0469(1980)037<2486:ADMMWA>2.0.CO;2.
  • Herman, G. R., and R. S. Schumacher, 2018: Money doesn’t grow on trees, but forecasts do: Forecasting extreme precipitation with random forests. Mon. Wea. Rev., 146, 1571–1600, https://doi.org/10.1175/MWR-D-17-0250.1.
  • Hsieh, W. C., H. Jonsson, L.-P. Wang, G. Buzorius, R. C. Flagan, J. H. Seinfeld, and A. Nenes, 2009: On the representation of droplet coalescence and autoconversion: Evaluation using ambient cloud droplet size distributions. J. Geophys. Res., 114, D07201, https://doi.org/10.1029/2008JD010502.
  • Igel, A. L., H. Morrison, S. P. Santos, and M. van Lier-Walqui, 2022: Limitations of separate cloud and rain categories in parameterizing collision-coalescence for bulk microphysics schemes. J. Adv. Model. Earth Syst., 14, e2022MS003039, https://doi.org/10.1029/2022MS003039.
  • Kamangir, H., W. Collins, P. Tissot, and S. A. King, 2020: A deep-learning model to predict thunderstorms within 400 km² south Texas domains. Meteor. Appl., 27, e1905, https://doi.org/10.1002/met.1905.
  • Kessler, E., 1969: On the Distribution and Continuity of Water Substance in Atmospheric Circulations. Meteor. Monogr., No. 32, Amer. Meteor. Soc., 84 pp.
  • Khairoutdinov, M., and Y. Kogan, 2000: A new cloud physics parameterization in a large-eddy simulation model of marine stratocumulus. Mon. Wea. Rev., 128, 229–243, https://doi.org/10.1175/1520-0493(2000)128<0229:ANCPPI>2.0.CO;2.
  • Kingma, D. P., and J. Ba, 2014: Adam: A method for stochastic optimization. arXiv, 1412.6980, https://doi.org/10.48550/ARXIV.1412.6980.
  • Kogan, Y. L., and A. Belochitski, 2012: Parameterization of cloud microphysics based on full integral moments. J. Atmos. Sci., 69, 2229–2242, https://doi.org/10.1175/JAS-D-11-0268.1.
  • Krasnopolsky, V. M., M. S. Fox-Rabinovitz, and A. A. Belochitski, 2013: Using ensemble of neural networks to learn stochastic convection parameterizations for climate and numerical weather prediction models from data simulated by a cloud resolving model. Adv. Artif. Neural Syst., 2013, 485913, https://doi.org/10.1155/2013/485913.
  • Lagerquist, R., A. McGovern, and T. Smith, 2017: Machine learning for real-time prediction of damaging straight-line convective wind. Wea. Forecasting, 32, 2175–2193, https://doi.org/10.1175/WAF-D-17-0038.1.
  • Lee, H., and J.-J. Baik, 2017: A physically based autoconversion parameterization. J. Atmos. Sci., 74, 1599–1616, https://doi.org/10.1175/JAS-D-16-0207.1.
  • Liu, Y., and P. H. Daum, 2004: Parameterization of the autoconversion process. Part I: Analytical formulation of the Kessler-type parameterizations. J. Atmos. Sci., 61, 1539–1548, https://doi.org/10.1175/1520-0469(2004)061<1539:POTAPI>2.0.CO;2.
  • Long, A. B., 1974: Solutions to the droplet collection equation for polynomial kernels. J. Atmos. Sci., 31, 1040–1052, https://doi.org/10.1175/1520-0469(1974)031<1040:STTDCE>2.0.CO;2.
  • Manton, M. J., and W. R. Cotton, 1977: Formulation of approximate equations for modeling moist deep convection on the mesoscale. Colorado State University Atmospheric Science Paper 266, 73 pp., https://api.mountainscholar.org/server/api/core/bitstreams/8d9cfb59-0a6e-4c89-a27c-c36322917e29/content.
  • McGovern, A., K. L. Elmore, D. J. Gagne II, S. E. Haupt, C. D. Karstens, R. Lagerquist, T. Smith, and J. K. Williams, 2017: Using artificial intelligence to improve real-time decision-making for high-impact weather. Bull. Amer. Meteor. Soc., 98, 2073–2090, https://doi.org/10.1175/BAMS-D-16-0123.1.
  • McKay, M. D., R. J. Beckman, and W. J. Conover, 1979: A comparison of three methods for selecting values of input variables in the analysis of output from a computer code. Technometrics, 21, 239–245, https://doi.org/10.1080/00401706.1979.10489755.
  • Morrison, H., G. Thompson, and V. Tatarskii, 2009: Impact of cloud microphysics on the development of trailing stratiform precipitation in a simulated squall line: Comparison of one- and two-moment schemes. Mon. Wea. Rev., 137, 991–1007, https://doi.org/10.1175/2008MWR2556.1.
  • Mostajabi, A., D. L. Finney, M. Rubinstein, and F. Rachidi, 2019: Nowcasting lightning occurrence from commonly available meteorological parameters using machine learning techniques. npj Climate Atmos. Sci., 2, 41, https://doi.org/10.1038/s41612-019-0098-0.
  • Nair, V., and G. E. Hinton, 2010: Rectified linear units improve restricted Boltzmann machines. ICML’10: Proc. 27th Int. Conf. on Machine Learning, Haifa, Israel, ACM, 807–814, https://dl.acm.org/doi/10.5555/3104322.3104425.
  • O’Gorman, P. A., and J. G. Dwyer, 2018: Using machine learning to parameterize moist convection: Potential for modeling of climate, climate change, and extreme events. J. Adv. Model. Earth Syst., 10, 2548–2563, https://doi.org/10.1029/2018MS001351.
  • Rasp, S., M. S. Pritchard, and P. Gentine, 2018: Deep learning to represent subgrid processes in climate models. Proc. Natl. Acad. Sci. USA, 115, 9684–9689, https://doi.org/10.1073/pnas.1810286115.
  • Rodríguez Genó, C. F., and L. Alfonso, 2022: Parameterization of the collision–coalescence process using series of basis functions: COLNETv1.0.0 model development using a machine learning approach. Geosci. Model Dev., 15, 493–507, https://doi.org/10.5194/gmd-15-493-2022.
  • Rogers, R. R., and M. K. Yau, 1989: A Short Course in Cloud Physics. 3rd ed. Elsevier, 304 pp.
  • Rothenberg, D., and C. Wang, 2016: Metamodeling of droplet activation for global climate models. J. Atmos. Sci., 73, 1255–1272, https://doi.org/10.1175/JAS-D-15-0223.1.
  • Rüttgers, M., S. Lee, S. Jeon, and D. You, 2019: Prediction of a typhoon track using a generative adversarial network and satellite images. Sci. Rep., 9, 6057, https://doi.org/10.1038/s41598-019-42339-y.
  • Seifert, A., and K. D. Beheng, 2001: A double-moment parameterization for simulating autoconversion, accretion and selfcollection. Atmos. Res., 59–60, 265–281, https://doi.org/10.1016/S0169-8095(01)00126-0.
  • Seifert, A., and S. Rasp, 2020: Potential and limitations of machine learning for modeling warm-rain cloud microphysical processes. J. Adv. Model. Earth Syst., 12, e2020MS002301, https://doi.org/10.1029/2020MS002301.
  • Shima, S., K. Kusano, A. Kawano, T. Sugiyama, and S. Kawahara, 2009: The super-droplet method for the numerical simulation of clouds and precipitation: A particle-based and probabilistic microphysics model coupled with a non-hydrostatic model. Quart. J. Roy. Meteor. Soc., 135, 1307–1320, https://doi.org/10.1002/qj.441.
  • Skamarock, W. C., and Coauthors, 2008: A description of the Advanced Research WRF version 3. NCAR Tech. Note NCAR/TN-475+STR, 113 pp., https://doi.org/10.5065/D68S4MVH.
  • Thompson, G., P. R. Field, R. M. Rasmussen, and W. D. Hall, 2008: Explicit forecasts of winter precipitation using an improved bulk microphysics scheme. Part II: Implementation of a new snow parameterization. Mon. Wea. Rev., 136, 5095–5115, https://doi.org/10.1175/2008MWR2387.1.
  • Weisman, M. L., and J. B. Klemp, 1982: The dependence of numerically simulated convective storms on vertical wind shear and buoyancy. Mon. Wea. Rev., 110, 504–520, https://doi.org/10.1175/1520-0493(1982)110%3C0504:TDONSC%3E2.0.CO;2.
  • Xue, Y., L.-P. Wang, and W. W. Grabowski, 2008: Growth of cloud droplets by turbulent collision–coalescence. J. Atmos. Sci., 65, 331–356, https://doi.org/10.1175/2007JAS2406.1.
  • Zheng, G., X. Li, R.-H. Zhang, and B. Liu, 2020: Purely satellite data–driven deep learning forecast of complicated tropical instability waves. Sci. Adv., 6, eaba1482, https://doi.org/10.1126/sciadv.aba1482.