## 1. Introduction

Clouds are a key component of the climate system, with influences on both Earth’s radiative balance and its hydrological cycle. Moreover, the effects and properties of clouds are among the most significant uncertainties in global climate models (e.g., Stephens 2005; Stephens et al. 2010; Flato et al. 2013). To validate cloud models, observational data are needed. Global datasets are most efficiently obtained with satellites, which can provide measurements from most of Earth using the same instrument.

The *CloudSat* satellite (Stephens et al. 2008), operating since 2006, is a platform dedicated to cloud observations, with a 94-GHz cloud radar (Tanelli et al. 2008) as its sole instrument. Using the radar, *CloudSat* has the capability to profile the vertical structure of clouds. Moreover, *CloudSat* orbits in the A-Train constellation along with the *Cloud–Aerosol Lidar and Infrared Pathfinder Satellite Observations* (*CALIPSO*) and *Aqua* spacecraft. Both *Aqua* and *CALIPSO* include instruments that complement the capabilities of *CloudSat*. In particular, the Moderate Resolution Imaging Spectroradiometer (MODIS) on *Aqua* (King et al. 1992) can passively measure various cloud optical properties, while the Cloud–Aerosol Lidar with Orthogonal Polarization (CALIOP) on *CALIPSO* (Winker et al. 2009) can obtain a signal from thin clouds that *CloudSat* cannot detect (Mace and Zhang 2014), and can aid in determination of the cloud phase (Hu et al. 2009).

The cloud water content (CWC), which gives the mass of cloud water per unit volume, is arguably the most fundamental intrinsic property of clouds. It is also produced by climate models, making it a quantity that is potentially readily comparable between models and observations. The CWC can be subdivided into the liquid water content and the ice water content. Although retrievals of ice water content using radar are hindered by the large variety of ice-crystal shapes and structures, cloud liquid water can be modeled in a relatively straightforward manner as spherical scatterers that are small enough for the Rayleigh approximation (van de Hulst 1957) to be valid. Consequently, it is easier to compute the radar cross sections for cloud water than it is for cloud ice. The retrievals in the liquid case are nevertheless complicated by the fact that the CWC is distinct from the total water content, which also includes precipitating water. Precipitating water is often *not* produced by climate models that have diagnostic precipitation schemes, which instantaneously rain out water that is considered to have precipitated out of the cloud. In contrast, radar measurements are more sensitive to precipitating water than to cloud water because of the larger drop size; this complicates the retrieval of CWC from radar bins in which both cloud and precipitation are present and necessitates data filtering if only cloud water is to be retrieved.

Many spaceborne observations of the column-integrated cloud liquid water path (LWP) exist. These are frequently derived from either passive microwave or visible/near-infrared imagers. These observations agree fairly well in regions of stratocumulus (Bennartz 2007) but differ by approximately a factor of 2 in regions of shallow cumulus (Seethala and Horváth 2010). This disagreement is largely correlated with the presence of precipitation, as observed by *CloudSat*, suggesting that the passive microwave is the more severely limited observation approach (Lebsock and Su 2014). In parallel, Christensen et al. (2013) have shown that *CloudSat*-only retrievals have many biases, including missed detection of clouds and misinterpretation of precipitation water as cloud water. These results suggest that combinations of visible/near-infrared observations with cloud radar observations would allow for retrievals of CWC that combine the integral constraint on the LWP from the visible reflectance with the vertical cloud boundaries and precipitation detection by the radar.

Profiles of CWC provide a constraint on models that is more comprehensive than that provided by the commonplace observations of the cloud water path. In particular, vertical profiles of cloud help to constrain processes related to parameterized mixing in the model that result in the growth of the planetary boundary layer from a shallow well-mixed layer capped by stratocumulus to a deeper decoupled layer populated by cumulus.

In this paper, we identify shortcomings of the current *CloudSat* cloud liquid water content product (“2B-CWC-RVOD”; Wood 2008; Austin et al. 2009) and describe an improved retrieval that combines the *CloudSat* radar reflectivity with MODIS observations of cloud optical depth. Using data-quality filters to ensure that cases with only liquid cloud water are measured, we validate the retrieval results against independent observations. We use the *CloudSat* path-integrated attenuation (PIA) as an independent reference measurement that is closely related to the cloud liquid water path, and we show that the PIA obtained from our retrieval agrees well, in the climatological sense, with the measured PIA. We also examine the effect of various assumptions on the algorithm performance and compare the retrieved LWP with MODIS optical-only retrievals.

## 2. Data

The key data used in the CWC retrieval are the *CloudSat* 94-GHz radar reflectivity and the *Aqua* MODIS cloud optical depth. The radar reflectivity is given in the release-04 *CloudSat* “2B-GEOPROF” data product (Marchand et al. 2008). The cloud optical depth and its uncertainty are included in the MODIS collection-5.1 cloud product (Platnick et al. 2003), which we have collocated to the *CloudSat* column coordinates. Since both *CloudSat* and *Aqua* are on orbit in the A-Train constellation, the collocation can be performed with a time separation of mere seconds. The horizontal resolutions of the instruments are also similar: 1.4 km × 1.7 km for *CloudSat* and 1 km for MODIS. Nearest-neighbor collocation is used, which may result in location offsets that approach 0.5 km. Parallax effects on sample volume for liquid clouds are generally small because the clouds are near the Earth surface. For the purposes of collocation, we assume that MODIS and *CloudSat* sample an equivalent column, and we accept the presence of a collocation error.

We use several other data products to support the algorithm. The cloud phase is obtained from the *CloudSat*–*CALIPSO* “2B-CLDCLASS-LIDAR” product (Sassen et al. 2008), and the *CloudSat* “2C-PRECIP-COLUMN” (Haynes et al. 2009) is used to screen for precipitating clouds. The temperature-dependent water refractive index is computed using the model of Rosenkranz (2015) with *CloudSat*-collocated temperature data from the European Centre for Medium-Range Weather Forecasts (ECMWF) reanalysis, found in the *CloudSat* “ECMWF-AUX” product. The current 2B-CWC-RVOD product (Austin et al. 2009) is also used for comparison purposes.

For validation purposes, we derived the path-integrated attenuation from the normalized surface cross section from the *CloudSat* 2B-GEOPROF product. The ocean surface wind speeds used in this analysis are also derived from the ECMWF reanalysis.

## 3. Algorithm

### a. Retrieval

The algorithm used to retrieve the cloud water content is based on the work of Austin and Stephens (2001), with a number of improvements. We use the same underlying optimal-estimation framework (Rodgers 2000), with some modifications to the formulation of the retrieval.

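We assume a lognormal size distribution for the radius *r* of cloud droplets. The form of the size distribution *N*(*r*) is

$$N(r) = \frac{N_T}{\sqrt{2\pi}\,\sigma_{\log}\,r}\exp\left[-\frac{(\ln r - \ln r_g)^2}{2\sigma_{\log}^2}\right],$$

where *N*_{T} is the total number concentration, *r*_{g} (= exp〈ln*r*〉) is the geometric mean radius (the angle brackets denote the expected value), and *σ*_{log} (= std[ln*r*], where std[] indicates the standard deviation) is the geometric standard deviation.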
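The moments of a lognormal size distribution have a closed form, which is what makes the later expressions for reflectivity, optical depth, and water content tractable. The following is a minimal sketch, not the authors' code, that checks the closed form numerically. The function names are hypothetical, and the input values are the marine a priori means and the *σ*_{log} value quoted later in the text, used here purely for illustration.

```python
import numpy as np

def lognormal_size_distribution(r, n_t, r_g, sigma_log):
    """N(r): droplet number density per unit radius (m^-3 m^-1)."""
    r = np.asarray(r, dtype=float)
    norm = n_t / (np.sqrt(2.0 * np.pi) * sigma_log * r)
    return norm * np.exp(-np.log(r / r_g) ** 2 / (2.0 * sigma_log ** 2))

def analytic_moment(k, n_t, r_g, sigma_log):
    """Closed form of the k-th moment, the integral of r**k * N(r) over r."""
    return n_t * r_g ** k * np.exp(0.5 * (k * sigma_log) ** 2)

def numeric_moment(k, n_t, r_g, sigma_log, n_points=20001):
    """Trapezoidal estimate of the same moment on a log-spaced radius grid."""
    r = np.logspace(-8, -3, n_points)  # 0.01 um to 1 mm, in metres
    y = r ** k * lognormal_size_distribution(r, n_t, r_g, sigma_log)
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(r)))

# Illustrative inputs only (SI units): 74 cm^-3, 6.55 um, sigma_log = 0.38.
N_T, R_G, SIGMA_LOG = 74e6, 6.55e-6, 0.38
for k in (0, 2, 3, 6):
    ratio = numeric_moment(k, N_T, R_G, SIGMA_LOG) / analytic_moment(k, N_T, R_G, SIGMA_LOG)
    print(f"moment {k}: numeric / analytic = {ratio:.6f}")
```

The zeroth moment recovers *N*_{T}, while the second, third, and sixth moments are the ones that enter the optical depth, the water content, and the Rayleigh reflectivity, respectively.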

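Under the Rayleigh approximation, the unattenuated radar reflectivity of this size distribution is

$$Z = \frac{|K_w|^2}{|K_{\mathrm{ref}}|^2}\int_0^\infty (2r)^6 N(r)\,dr = 64\,\frac{|K_w|^2}{|K_{\mathrm{ref}}|^2}\,N_T\,r_g^6\exp\left(18\sigma_{\log}^2\right),$$

where *K*_{w} = (*m*_{w}^{2} − 1)/(*m*_{w}^{2} + 2), *m*_{w} is the complex refractive index of water, and *K*_{ref} is the value of *K* assumed in the radar equation (for *CloudSat*, |*K*_{ref}|^{2} = 0.75). The factor |*K*_{w}|^{2}/|*K*_{ref}|^{2} is needed because at 94 GHz the complex refractive index of water depends significantly on the temperature but this dependence is not accounted for in the *CloudSat* 2B-GEOPROF reflectivity (Tanelli et al. 2008).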

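In the same approximation, the absorption coefficient is

$$\sigma_{\mathrm{abs}} = \frac{8\pi^2}{\lambda}\,\mathrm{Im}\{K_w\}\int_0^\infty r^3 N(r)\,dr = \frac{8\pi^2}{\lambda}\,\mathrm{Im}\{K_w\}\,N_T\,r_g^3\exp\left(\tfrac{9}{2}\sigma_{\log}^2\right),$$

where *λ* is the wavelength and Im{} indicates the imaginary part of a complex number. For a nadir-pointing radar, the attenuated reflectivity is

$$Z_{\mathrm{att}}(z) = Z(z)\exp\left[-2\int_z^{z_{\mathrm{top}}}\sigma_{\mathrm{abs}}(z')\,dz'\right],$$

where *z* is the altitude and *z*_{top} is the altitude of the top of the column. In this paper, we denote the logarithmic reflectivity, expressed in units of reflectivity decibels (dB*Z*), as

$$Z_{\mathrm{dB}} = 10\log_{10}\frac{Z}{Z_0},$$

where *Z*_{0} = 1 mm^{6} m^{−3}.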
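The radar forward model described above (Rayleigh reflectivity of a lognormal distribution, two-way attenuation by absorption, conversion to dB*Z*) can be sketched in a few lines. This is an illustrative sketch under stated assumptions rather than the retrieval code: the refractive index `M_W` is a hypothetical round number and not a value from the Rosenkranz (2015) model, the 240-m bin height is chosen only for the example, and the half-bin treatment of attenuation within each bin is an assumption.

```python
import numpy as np

Z0 = 1e-18                         # 1 mm^6 m^-3 expressed in SI units (m^3)
K_REF_SQ = 0.75                    # |K_ref|^2 quoted for CloudSat in the text
WAVELENGTH = 299_792_458.0 / 94e9  # m, for a 94-GHz radar

def dielectric_factor(m_w):
    """K_w = (m^2 - 1) / (m^2 + 2) for a complex refractive index m_w."""
    return (m_w ** 2 - 1.0) / (m_w ** 2 + 2.0)

def rayleigh_dbz(n_t, r_g, sigma_log, k_w):
    """Unattenuated reflectivity (dBZ) of a lognormal droplet population."""
    z = (64.0 * np.abs(k_w) ** 2 / K_REF_SQ
         * n_t * r_g ** 6 * np.exp(18.0 * sigma_log ** 2))
    return 10.0 * np.log10(z / Z0)

def absorption_coefficient(n_t, r_g, sigma_log, k_w):
    """Rayleigh absorption coefficient (m^-1) of the same population."""
    return (8.0 * np.pi ** 2 / WAVELENGTH * np.imag(k_w)
            * n_t * r_g ** 3 * np.exp(4.5 * sigma_log ** 2))

def attenuated_dbz(n_t, r_g, sigma_log, k_w, bin_height):
    """Attenuated dBZ per bin; r_g is ordered from the column top downward."""
    r_g = np.asarray(r_g, dtype=float)
    sigma_abs = absorption_coefficient(n_t, r_g, sigma_log, k_w)
    # One-way path to each bin centre: all bins above plus half of the
    # bin's own absorption (an assumption made for this sketch).
    path = (np.cumsum(sigma_abs) - 0.5 * sigma_abs) * bin_height
    return rayleigh_dbz(n_t, r_g, sigma_log, k_w) - 20.0 / np.log(10.0) * path

M_W = 3.0 + 1.7j  # hypothetical refractive index of water, for illustration
K_W = dielectric_factor(M_W)
print(rayleigh_dbz(74e6, 6.55e-6, 0.38, K_W))
print(attenuated_dbz(74e6, [6.55e-6] * 3, 0.38, K_W, bin_height=240.0))
```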

The cloud optical depth *τ* is computed by integrating the optical extinction efficiency *σ*_{ext,opt} at the geometric optics limit through the column as

$$\tau = \int_0^{z_{\mathrm{top}}}\sigma_{\mathrm{ext,opt}}(z)\,dz, \qquad \sigma_{\mathrm{ext,opt}} = 2\pi\int_0^\infty r^2 N(r)\,dr = 2\pi N_T\,r_g^2\exp\left(2\sigma_{\log}^2\right).$$

If we combine these expressions, take the logarithm of *Z*, and make the necessary logarithm base change and unit conversions (all units in this paper are SI unless otherwise mentioned), we get the attenuated logarithmic reflectivity (dB*Z*):

$$Z_{\mathrm{dB}}(z) = \frac{10}{\ln 10}\left[\ln N_T + 6\ln r_g + 18\sigma_{\log}^2 + \ln\left(64\,\frac{|K_w|^2}{|K_{\mathrm{ref}}|^2}\right)\right] + 180 - \frac{20}{\ln 10}\int_z^{z_{\mathrm{top}}}\sigma_{\mathrm{abs}}(z')\,dz'.$$

Apart from the attenuation term, this equation is linear with respect to ln*N*_{T}, ln*r*_{g}, and *σ*_{log}^{2}. We therefore formulate the retrieval such that ln*N*_{T} and ln*r*_{g} will be in the state vector. In this case, the only nonlinearities arise from the attenuation term and from the variation of *K*_{w} with altitude. A further benefit of using the logarithmic variables is that the distributions of the variables correspond better to the assumptions of the algorithm. Specifically, the mathematical formulation of the optimal-estimation algorithm assumes that the variables with which it operates are normally distributed. The normal distribution is unbounded, however, and thus allows negative values for quantities such as *N*_{T} and *r*_{g}, for which such values are clearly unphysical. By operating with ln*N*_{T} and ln*r*_{g}, we ensure that the corresponding linear values are always positive. Moreover, by doing this we are effectively assuming that the prior and posterior distributions of these variables are lognormal. Not only is this assumption justified by the central limit theorem (Dekking et al. 2005) for multiplicative quantities, but it has also been found to be close to the truth in nature, in many cases. The lognormal distribution has been found to be a good fit for closely related quantities in many studies since at least the work of Biondini (1976), López (1977), and Houze and Cheng (1977). For the rainfall rate, Kedem and Chiu (1987) also provide theoretical support for this distribution. It is worth remarking that this use of the *lognormal probability distribution* for the distribution of the retrieved variables is distinct from the use of a *lognormal size distribution* for the cloud droplets.

For the optical depth, we similarly operate with ln*τ* instead of *τ*. This is given by

$$\ln\tau = \ln\left[2\pi\int_0^{z_{\mathrm{top}}} N_T\,r_g^2\exp\left(2\sigma_{\log}^2\right)dz\right].$$

[…] *h*, we get an equation that is linear with respect to ln*r*_{g}, ln*N*_{T}, and […]. We assume that *N*_{T} is constant in a given column and that *σ*_{log} is determined climatologically, using the value of *σ*_{log} = 0.38 from Miles et al. (2000). This is in contrast to the current 2B-CWC-RVOD algorithm, which retrieves *N*_{T}, *r*_{g}, and *σ*_{log} for each bin, assisted by a priori assumptions. The state retrieved by our algorithm is then

$$\mathbf{x} = \left[\ln r_{g,1}, \ln r_{g,2}, \ldots, \ln r_{g,n}, \ln N_T\right]^{\mathrm{T}}.$$

[…] *N* values are retrieved for *N* + 1 measurements, making it overly dependent on the a priori values in finding a solution.
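To make the structure of the retrieval concrete, the following sketch runs a Gauss-Newton optimal-estimation iteration in the form given by Rodgers (2000) on a toy version of the problem: one ln*N*_{T} plus one ln*r*_{g} per bin in the state, and one reflectivity per bin plus ln*τ* in the measurement vector. It is a simplified illustration and not the authors' implementation. The forward model here ignores attenuation and the *K*_{w} factor, the Jacobian is computed by finite differences, and the bin height, the state ordering, and all function names are assumptions.

```python
import numpy as np

SIGMA_LOG = 0.38
DB = 10.0 / np.log(10.0)

def forward(x, bin_height=240.0):
    """Toy forward model: x = [ln N_T, ln r_g,1..n] -> [dBZ_1..n, ln tau].
    Attenuation and the K_w factor are deliberately ignored in this sketch."""
    ln_nt, ln_rg = x[0], x[1:]
    dbz = DB * (np.log(64.0) + ln_nt + 6.0 * ln_rg + 18.0 * SIGMA_LOG ** 2) + 180.0
    ln_tau = (np.log(2.0 * np.pi * bin_height) + ln_nt + 2.0 * SIGMA_LOG ** 2
              + np.log(np.sum(np.exp(2.0 * ln_rg))))
    return np.append(dbz, ln_tau)

def jacobian(f, x, eps=1e-6):
    """Forward-difference Jacobian of f at x."""
    f0 = f(x)
    jac = np.empty((f0.size, x.size))
    for j in range(x.size):
        dx = np.zeros_like(x)
        dx[j] = eps
        jac[:, j] = (f(x + dx) - f0) / eps
    return jac

def optimal_estimation(y, s_e, x_a, s_a, n_iter=20):
    """Gauss-Newton iteration for a Gaussian prior (x_a, s_a) and noise s_e."""
    x = x_a.copy()
    s_e_inv, s_a_inv = np.linalg.inv(s_e), np.linalg.inv(s_a)
    for _ in range(n_iter):
        k = jacobian(forward, x)
        # Posterior covariance, linearized about the current iterate.
        s_hat = np.linalg.inv(k.T @ s_e_inv @ k + s_a_inv)
        x = x_a + s_hat @ k.T @ s_e_inv @ (y - forward(x) + k @ (x - x_a))
    return x, s_hat

# Synthetic three-bin column with hypothetical "true" values.
x_true = np.log(np.array([100e6, 8e-6, 10e-6, 9e-6]))
x_prior = np.log(np.array([74e6, 6.55e-6, 6.55e-6, 6.55e-6]))
s_e = np.diag([0.1 ** 2] * 3 + [0.01 ** 2])
x_hat, s_hat = optimal_estimation(forward(x_true), s_e, x_prior, np.eye(4))
print(np.exp(x_hat))
```

With *n* + 1 unknowns and *n* + 1 measurements the toy problem is well determined, so the result is driven by the measurements rather than by the prior.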

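From the retrieved *N*_{T} and *r*_{g}, the CWC (denoted as *l*) can then be computed as

$$l = \frac{4}{3}\pi\rho_w N_T\,r_g^3\exp\left(\tfrac{9}{2}\sigma_{\log}^2\right),$$

where *ρ*_{w} is the density of liquid water.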
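As a worked example, the CWC follows from the third moment of the lognormal size distribution multiplied by the mass of a unit-radius water sphere. The sketch below assumes SI inputs and a liquid water density of 1000 kg m^{−3}; the function names are hypothetical, and the inputs are the a priori means quoted later in the text, used only to show the order of magnitude.

```python
import numpy as np

RHO_WATER = 1000.0  # kg m^-3, assumed density of liquid water

def cloud_water_content(n_t, r_g, sigma_log=0.38):
    """CWC (kg m^-3) from number concentration (m^-3) and geometric mean radius (m)."""
    return (4.0 / 3.0) * np.pi * RHO_WATER * n_t * r_g ** 3 * np.exp(4.5 * sigma_log ** 2)

def liquid_water_path(cwc_profile, bin_height):
    """Column-integrated LWP (kg m^-2) for bins of equal height (m)."""
    return float(np.sum(cwc_profile) * bin_height)

cwc = cloud_water_content(74e6, 6.55e-6)
print(f"CWC = {cwc * 1e3:.3f} g m^-3")
print(f"LWP = {liquid_water_path([cwc] * 3, 240.0):.4f} kg m^-2")
```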

### b. Error estimates

In addition to the input variables, the optimal-estimation algorithm requires error estimates in the form of the covariance matrix of normally distributed random variables. The errors reflect both uncertainties in the instrument and in the measurement forward model.

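Because the retrieval operates with logarithmic variables, we need to convert the error estimates of, for example, *τ* into those of ln*τ*. For this purpose, the following equations (Pal et al. 2006) for the lognormal distribution of a random variable *x* become useful:

$$\langle x\rangle = \exp\left(\langle\ln x\rangle + \tfrac{1}{2}\mathrm{var}[\ln x]\right), \tag{14}$$

$$\mathrm{var}[x] = \langle x\rangle^2\left(\exp\left(\mathrm{var}[\ln x]\right) - 1\right). \tag{15}$$

These can be solved for 〈ln*x*〉 and var[ln*x*] when 〈*x*〉 and var[*x*] are known (here, var[] indicates the variance).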

For the MODIS optical depth, the error estimates are available in the MODIS data. Since these errors are for *τ*, we convert to the estimate and error of ln*τ* by interpreting the MODIS optical depth as 〈*τ*〉 and the optical depth uncertainty as std[*τ*] and finding 〈ln*τ*〉 and var[ln*τ*] using Eqs. (14) and (15).
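The conversion between linear and logarithmic moments uses the standard mean and variance relations of the lognormal distribution and can be written in both directions. The sketch below is illustrative; the function names are hypothetical, and the optical depth of 10 with an uncertainty of 2 is an invented example rather than a MODIS value.

```python
import numpy as np

def to_log_moments(mean, std):
    """Return (<ln x>, var[ln x]) for a lognormal x with the given mean and std."""
    var_ln = np.log(1.0 + (std / mean) ** 2)
    mean_ln = np.log(mean) - 0.5 * var_ln
    return mean_ln, var_ln

def from_log_moments(mean_ln, var_ln):
    """Inverse conversion: return (<x>, std[x]) from the moments of ln x."""
    mean = np.exp(mean_ln + 0.5 * var_ln)
    std = mean * np.sqrt(np.exp(var_ln) - 1.0)
    return mean, std

# Invented example: optical depth 10 with an uncertainty (std) of 2.
mean_ln_tau, var_ln_tau = to_log_moments(10.0, 2.0)
print(mean_ln_tau, var_ln_tau)
print(from_log_moments(mean_ln_tau, var_ln_tau))  # round trip back to (10, 2)
```

The same two functions cover both uses in the text: converting measured or a priori values into the logarithmic space of the retrieval, and converting the posterior ln*N*_{T} and ln*r*_{g} back into linear means and standard deviations.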

For the *CloudSat* radar reflectivity, the radar instrument error *δ*_{Z,inst} can be constructed as a function of the reflectivity, because it is mainly affected by the radar sampling error and hardware noise. This measurement noise [in decibel (dB) units] can be approximately expressed as […]

The error in the radar forward model also needs to be considered. In the retrieval, we assume that the lognormal distribution width *σ*_{log} is held fixed. In reality, not only does *σ*_{log} vary, but the form of the cloud droplet size distribution may also deviate from lognormal. Some drizzle may also remain regardless of the data filtering (see section 3d). Thus, the dominant error is not that of the scattering and radiative transfer model but rather is that arising from the uncertainty in the assumptions regarding the droplet size distribution.

While the uncertainty in *σ*_{log} can be treated explicitly, the error due to deviations from the lognormal form (e.g., with multimodal size distributions) is particularly difficult to handle analytically. Instead, we adopted a more empirical approach, analyzing the results from large eddy simulations (LES) that use bin microphysics to simulate the droplet evolution. The LES model (Matheou and Chung 2014), coupled to a bin microphysical model (Suzuki et al. 2010), was used to produce a simulation of shallow precipitating cumulus.

The simulated scenario was from the Rain in Cumulus over the Ocean (RICO) experiment (Rauber et al. 2007), using the composite atmospheric conditions outlined by van Zanten et al. (2011). Further details on the fidelity of the LES model used for simulations of RICO can be found in Matheou et al. (2011). The domain size was 51.2 × 51.2 × 4.0 km^{3} with a spatial resolution of 100 × 100 × 40 m^{3}. The bin microphysics scheme had 30 logarithmically spaced radius bins spanning 3–3000 *μ*m. The radar reflectivity was computed using a Mie-scattering (van de Hulst 1957) model coupled to a time-dependent two-stream radiative transfer code (Hogan and Battaglia 2008).

The simulated fields were first averaged to the scale of a *CloudSat* radar bin (1.4 × 1.7 × 0.5 km^{3}). Then, we studied the variability of the reflectivity for near-constant *N*_{T} and *r*_{g} by randomly sampling data points and computing the standard deviation of the reflectivities associated with the nearest 10 neighboring points in the *N*_{T}–*r*_{g} space (the results were overall mostly insensitive to the exact number of nearest neighbors used). The same −15-dB*Z* reflectivity filtering criterion was used as in the retrieval (see section 3d). The average standard deviation obtained from this analysis (3.05 dB) is used as the forward-model error estimate *δ*_{Z,fm} for the reflectivity. Since the forward-model errors and instrument errors can be reasonably assumed to be independent, the total reflectivity error is given by

$$\delta_Z = \sqrt{\delta_{Z,\mathrm{inst}}^2 + \delta_{Z,\mathrm{fm}}^2}. \tag{17}$$

The current *CloudSat* 2B-CWC-RVOD algorithm also includes an absolute radar calibration uncertainty of 2 dB (Wood 2008), but because the calibration errors are not independent between radar bins they cannot be validly included into the error used on the diagonal of the error covariance matrix. In any case, the forward-model uncertainty is the dominant term in Eq. (17).
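The nearest-neighbor variability estimate used for the forward-model error can be sketched as follows. This is an assumption-laden illustration, not the analysis code: the text does not state the distance metric, so the sketch measures distance in standardized (ln*N*_{T}, ln*r*_{g}) space, and the function name, sample count, and synthetic data are all hypothetical.

```python
import numpy as np

def knn_reflectivity_spread(log_nt, log_rg, z_db, n_samples=200, k=10, seed=0):
    """Mean standard deviation of dBZ among the k nearest neighbours of
    randomly sampled points in (ln N_T, ln r_g) space."""
    rng = np.random.default_rng(seed)
    pts = np.column_stack([log_nt, log_rg])
    # Assumption: standardize both axes so that neither dominates the distance.
    pts = (pts - pts.mean(axis=0)) / pts.std(axis=0)
    z_db = np.asarray(z_db, dtype=float)
    picks = rng.choice(len(pts), size=min(n_samples, len(pts)), replace=False)
    spreads = []
    for i in picks:
        dist_sq = np.sum((pts - pts[i]) ** 2, axis=1)
        nearest = np.argsort(dist_sq)[:k]  # includes the sampled point itself
        spreads.append(np.std(z_db[nearest]))
    return float(np.mean(spreads))

# Synthetic demonstration: reflectivity scatter of 3 dB unrelated to the state.
rng = np.random.default_rng(1)
demo_nt = rng.normal(size=3000)
demo_rg = rng.normal(size=3000)
demo_z = rng.normal(scale=3.0, size=3000)
print(knn_reflectivity_spread(demo_nt, demo_rg, demo_z))
```

The resulting forward-model term would then be combined with the instrument term in quadrature, as in Eq. (17).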

The output from the retrieval algorithm also includes an error estimate. The algorithm gives the posterior distributions for ln*N*_{T} and ln*r*_{g,i} for *i* = 1, 2, …, *n*. From these, we can recover the estimated number concentration as 〈*N*_{T}〉 and geometric mean radii of the size distribution as 〈*r*_{g,i}〉, as well as the corresponding error estimates std[*N*_{T}] and std[*r*_{g,i}], with Eqs. (14) and (15).

### c. A priori distributions

As a Bayesian method, optimal estimation requires a priori probability distributions representing the prior information about the retrieved variables. Because optimal estimation assumes Gaussian distributions for the variables, our choice of using the logarithmic variables in the retrieval means that the a priori distributions of *r*_{g} and *N*_{T} are lognormal.

The a priori distributions used in the current 2B-CWC-RVOD algorithm are motivated by the mean and standard deviation values from Miles et al. (2000). The mean values derived from such measurements can be expected to converge reasonably close to the global mean, and, accordingly, we adopted these mean values, 〈*r*_{g}〉 = 6.55 *μ*m and 〈*N*_{T}〉 = 74 cm^{−3}, into our own a priori distributions. We used the values given for marine environments since our primary focus is on clouds over the ocean.

On the other hand, estimates of the standard deviation are strongly dependent on the amount of data present in each sample. The standard deviations listed by Miles et al. (2000, their Table 3) give the variability between the average values from various airborne datasets. These average values are already aggregated from a large number of measurements, and thus they represent sampling times and volumes that are vastly different from those of the *CloudSat* radar. Therefore, we believe that the standard deviations given by Miles et al. (2000) or other studies should not be used directly to constrain radar remote sensing retrievals unless such analyses are specifically designed to estimate the instantaneous variability in a volume the size of the radar bin.

Rigorously analyzing the available microphysical data for standard deviations appropriate for *CloudSat* radar bins would likely require an entire study of its own. In the absence of such analyses, we relaxed the a priori standard deviations considerably from the Miles et al. (2000) values. The retrievals shown in this paper were run using std[*r*_{g}] = 10 *μ*m and std[*N*_{T}] = 150 cm^{−3}, using Eqs. (14) and (15) to obtain the expectation and standard deviation of the logarithm. Sensitivity tests using different values of the standard deviation for the a priori distributions showed that the standard deviation can be increased or decreased considerably without large changes to the distribution of the retrieved values. This indicates that the measurements can effectively constrain the retrieval. In section 5a we further discuss the effects of this choice on the results.

### d. Data filtering

In the current study, we concentrate on validating the liquid water content retrieval rather than building a liquid water climatological database (which will be treated in a later study). Thus, we use fairly strict data-filtering criteria to ensure that the data used do, in fact, originate from nonprecipitating liquid water clouds.

In the analysis, a vertical column of data is considered only if the column

- contains at least one radar bin with a reflectivity defined,
- has a MODIS optical depth defined,
- has no bins classified as a phase other than water in the 2B-CLDCLASS-LIDAR data product (this includes filtering out mixed-phase clouds),
- is not classified as precipitating in the 2C-PRECIP-COLUMN product,
- has no bins in which the radar reflectivity exceeds −15 dB*Z* (this is to eliminate drizzle and overhanging precipitation not flagged by 2C-PRECIP-COLUMN), and
- has a MODIS solar zenith angle smaller than 45°.
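The filtering criteria above amount to a per-column boolean mask. The sketch below shows one way to express them with NumPy arrays. The argument names are hypothetical and do not correspond to field names in the *CloudSat* or MODIS products, and undefined values are assumed to be stored as NaN.

```python
import numpy as np

def column_mask(dbz, tau, is_liquid_only, is_precipitating, solar_zenith,
                max_dbz=-15.0, max_zenith=45.0):
    """Boolean mask of columns accepted by the filters.

    dbz: (n_columns, n_bins) reflectivity in dBZ, NaN where undefined.
    tau: (n_columns,) MODIS optical depth, NaN where undefined.
    is_liquid_only, is_precipitating: (n_columns,) boolean flags.
    solar_zenith: (n_columns,) MODIS solar zenith angle in degrees.
    """
    dbz = np.asarray(dbz, dtype=float)
    has_radar = np.any(np.isfinite(dbz), axis=1)
    has_tau = np.isfinite(np.asarray(tau, dtype=float))
    # NaN compares as False, so undefined bins never trip the drizzle screen.
    with np.errstate(invalid="ignore"):
        below_limit = ~np.any(dbz > max_dbz, axis=1)
    return (has_radar & has_tau & np.asarray(is_liquid_only, dtype=bool)
            & ~np.asarray(is_precipitating, dtype=bool) & below_limit
            & (np.asarray(solar_zenith, dtype=float) < max_zenith))

# Five invented columns: only the first passes every criterion.
demo_dbz = np.array([[-30.0, -20.0], [np.nan, np.nan], [-30.0, -10.0],
                     [-25.0, np.nan], [-30.0, -20.0]])
demo_mask = column_mask(demo_dbz, [5.0, 5.0, 5.0, np.nan, 5.0],
                        [True] * 5, [False] * 5, [30.0, 30.0, 30.0, 30.0, 60.0])
print(demo_mask)
```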

## 4. Validation

### a. Path-integrated attenuation

Under the Rayleigh approximation, the absorption coefficient *σ*_{abs} is equal to the cloud water content *l* multiplied by a constant and the factor Im{*K*_{w}}, which is nearly constant. Consequently, under these assumptions the two-way microwave path-integrated attenuation (PIA; dB), given as

$$\mathrm{PIA} = \frac{20}{\ln 10}\int_0^{z_{\mathrm{top}}}\sigma_{\mathrm{abs}}(z)\,dz,$$

is nearly linearly related to the LWP.

[…] (*CloudSat* 2B-GEOPROF-LIDAR) are taken as a subset from the data. The clear-sky surface cross section is calculated as the mean of the observed values within this data segment. The observed PIA due to condensed hydrometeors is calculated as […]

[…] PIA_{obs} is thus […] PIA_{obs} is occasionally negative.
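The near-linear relation between the modeled PIA and the cloud water can be illustrated with a short calculation. This sketch assumes Rayleigh absorption by liquid droplets only, for which the constant relating *σ*_{abs} to *l* is 6π/(*λρ*_{w}); the function name and the value Im{*K*_{w}} = 0.18 are illustrative assumptions, not values taken from the retrieval.

```python
import numpy as np

C_LIGHT = 299_792_458.0  # m s^-1

def modeled_pia(cwc_profile, bin_height, im_k_w, frequency=94e9, rho_water=1000.0):
    """Two-way PIA (dB) from a CWC profile (kg m^-3) with equal bin heights (m).

    im_k_w is Im{K_w}; it may be a scalar or one value per bin."""
    wavelength = C_LIGHT / frequency
    sigma_abs = (6.0 * np.pi / (wavelength * rho_water)
                 * np.asarray(im_k_w, dtype=float)
                 * np.asarray(cwc_profile, dtype=float))
    return float(20.0 / np.log(10.0) * np.sum(sigma_abs) * bin_height)

# Invented column: LWP = 0.1 kg m^-2 spread evenly over two 250-m bins.
demo_profile = np.array([2.0e-4, 2.0e-4])
demo_pia = modeled_pia(demo_profile, 250.0, im_k_w=0.18)
print(f"PIA = {demo_pia:.3f} dB")
```

Under these assumptions the PIA scales with the LWP at a little over 9 dB per kg m^{−2}, with the temperature dependence entering only through Im{*K*_{w}}.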

The surface reference technique performs best when the wind speed is sufficiently high to determine the ocean wave structure, and thus the surface radar cross section, effectively. Furthermore, a large enough sample of clear-sky pixels near the cloudy pixels is needed to properly estimate the surface cross section. To ensure that only valid PIA measurements are used, we only use points at which the wind speed is higher than 5 m s^{−1}, the cloud fraction in a sample is smaller than 0.9, and the mean distance from the data point to the reference clear-sky columns is at most 20 *CloudSat* columns. This restricts the availability of reference data significantly, but, with a 4-yr dataset, we were still able to gather a large number (≈670 000) of points with both a valid retrieval and a valid measured PIA.

### b. PIA joint distributions

We compared the PIA estimated from the algorithm forward model (PIA_{mod}) against that observed with the surface reference technique (PIA_{obs}). Figure 1a shows the logarithm of the joint probability distribution function (PDF) of the modeled and observed PIAs (the logarithm of the PDF is shown so as to give better contrast in regions where the value of the PDF is low).

It is clear that significant scatter is present in the joint distribution; there is inevitably noise in both the observed and modeled values. While the noise of PIA_{mod} is difficult to estimate, there is considerable scatter in PIA_{obs} when PIA_{mod} ≈ 0; this gives us an estimate of the standard deviation of PIA_{obs} as approximately 0.30 dB, close to the estimate given for the PIA noise in section 4a. The mean PIA_{obs} is close to zero in the PIA_{mod} ≈ 0 region, indicating that the surface reference estimate of the PIA is nearly unbiased, at least for small LWPs. Note that this near-zero bias is obtained only if the negative PIA values are retained in the dataset; filtering out the negative values would have caused an artificial bias in the mean.

On average, the algorithm achieves an excellent match between the modeled and observed PIAs throughout nearly the full range of values. The agreement holds for PIA values from 0 to roughly 3.2 dB, a range that covers almost all of the data points, which are heavily concentrated in the low-PIA region.

In Figs. 1b and 1c, we examine the effect of the filtering on the data. The results in those figures show that relaxing either the MODIS solar-zenith-angle filtering or the maximum-reflectivity filtering results in somewhat worse correspondence between PIA_{mod} and PIA_{obs} for the higher values. The bias at higher reflectivities appears to be due to drizzle in the clouds that violates the algorithm assumptions about the size distribution: since drizzle increases *Z*, which is dependent on the sixth moment of the size distribution, the PIA (which is proportional to the third moment) also increases when *τ* (proportional to the second moment) is held constant. In section 4c, we discuss the bias due to the zenith angle.

In Fig. 1d, we make the same comparison using all data filters but using the current *CloudSat* 2B-CWC-RVOD product instead. It is clear from this figure that the new PIA (and consequently the LWP) estimate is an improvement over the currently available product. The current product reproduces the average PIA very well up to approximately 0.8 dB, but beyond that the curve flattens, showing that its PIA is often overestimated.

### c. Geographical PIA comparison

In addition to the joint distributions, it is also instructive to examine how the modeled PIA and observed PIA compare in different regions. To this end, Fig. 2 illustrates the bias of the modeled PIA relative to the surface reference observations. The average bias is collected for 5° × 5° grid boxes, and the data are shown if there are at least 200 valid comparisons in a given box (this limit eliminates most overland points, which have few valid PIA observations because of the limitations of the surface reference technique). Figure 2a shows that, while there is a slight consistent negative bias in the PIA, as can be seen also from Fig. 1a, there are no distinctive spatial patterns in the bias, and accordingly the algorithm appears to work robustly throughout the globe.
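The gridded bias described above can be computed with two weighted two-dimensional histograms. The following sketch is illustrative only; the function name and the synthetic inputs are hypothetical, and the bias is defined here as modeled minus observed PIA.

```python
import numpy as np

def gridded_mean_bias(lat, lon, pia_mod, pia_obs, box=5.0, min_count=200):
    """Mean of (pia_mod - pia_obs) in box x box degree cells.

    Cells with fewer than min_count comparisons are set to NaN."""
    lat_edges = np.arange(-90.0, 90.0 + box, box)
    lon_edges = np.arange(-180.0, 180.0 + box, box)
    diff = np.asarray(pia_mod, dtype=float) - np.asarray(pia_obs, dtype=float)
    count, _, _ = np.histogram2d(lat, lon, bins=[lat_edges, lon_edges])
    total, _, _ = np.histogram2d(lat, lon, bins=[lat_edges, lon_edges], weights=diff)
    bias = np.full(count.shape, np.nan)
    valid = count >= min_count
    bias[valid] = total[valid] / count[valid]
    return bias

# Synthetic demonstration: 300 comparisons in one cell, 50 in another.
demo_lat = np.concatenate([np.full(300, 2.0), np.full(50, 42.0)])
demo_lon = np.concatenate([np.full(300, 3.0), np.full(50, -118.0)])
demo_obs = np.full(350, 0.30)
demo_mod = demo_obs - 0.05
demo_bias = gridded_mean_bias(demo_lat, demo_lon, demo_mod, demo_obs)
print(demo_bias.shape, demo_bias[18, 36])
```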

Figure 2b shows that a slight bias is introduced at the midlatitudes and, to a lesser extent, in the subsident subtropical regions when the reflectivity limit is raised to 0 dB*Z*. A similar, but more distinctive, pattern is seen in Fig. 2c, which demonstrates the need for the MODIS zenith-angle filtering. The PIA bias in this comparison appears to depend only on the latitude. There is no obvious way for *CloudSat* measurements to depend significantly on latitude, because the instrument is active and always nadir pointing. Thus, the most likely explanation is a bias, which manifests at high solar angles, in the MODIS cloud optical depth. The smooth change in the bias as a function of latitude suggests that one could compensate for this error with a relatively simple correction. The bias of the current 2B-CWC-RVOD algorithm, shown in Fig. 2d, is much larger than with our new algorithm; this current algorithm appears to produce large positive biases in all areas, consistent with Fig. 1d.

## 5. Results and discussion

### a. Retrieved distributions

From the analyses in section 4, it is evident that the new algorithm is an improvement over the current 2B-CWC-RVOD product. The origin of the large differences between the two datasets is not immediately clear from those comparisons, however.

More detailed analysis reveals that the 2B-CWC-RVOD retrievals are underdetermined and overly constrained by restrictive a priori distributions in the Bayesian retrieval. In Fig. 3 we demonstrate this by examining the differences at a lower level, in the retrieved microphysical values *r*_{g} and *N*_{T}. Our retrieval is easily seen to permit a much wider range of microphysical values than the current algorithm, which is constrained to a fairly narrow range.

The shape of the global distributions from our algorithm is very close to lognormal for *r*_{g} and *N*_{T}: the Kolmogorov–Smirnov distance (Dekking et al. 2005) between the sample distribution and the corresponding normal distribution is 0.014 for ln*r*_{g} and is 0.010 for ln*N*_{T}. The retrieved distributions are also meaningfully shifted from the a priori distributions of section 3c, giving 〈*r*_{g}〉 = 9.45 *μ*m, std[*r*_{g}] = 2.62 *μ*m, 〈*N*_{T}〉 = 80.7 cm^{−3}, and std[*N*_{T}] = 187 cm^{−3}. The global distribution of *N*_{T} is only slightly different from the prior we used, but that of *r*_{g} has shifted significantly, with a mean 1.4 times the a priori mean and a standard deviation only 0.40 times the a priori value. The shift of the distribution for *r*_{g} shows that the algorithm obtains most of its information from the measurements rather than being constrained by the a priori distribution.
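The Kolmogorov–Smirnov distance used above is the largest gap between the empirical distribution function of a sample and a reference distribution function. A minimal sketch follows, assuming that the reference normal distribution takes its mean and standard deviation from the sample itself (the text does not specify how the reference parameters were chosen); the function name and the synthetic data are hypothetical.

```python
import math
import numpy as np

def ks_distance_to_normal(sample):
    """KS distance between a sample and the normal distribution that has the
    sample's own mean and standard deviation."""
    x = np.sort(np.asarray(sample, dtype=float))
    n = x.size
    mu, sd = x.mean(), x.std()
    cdf = np.array([0.5 * (1.0 + math.erf((v - mu) / (sd * math.sqrt(2.0))))
                    for v in x])
    # The empirical CDF jumps at each sorted value; check both sides of the jump.
    upper = np.arange(1, n + 1) / n - cdf
    lower = cdf - np.arange(0, n) / n
    return float(max(upper.max(), lower.max()))

# Synthetic demonstration with a lognormal sample.
rng = np.random.default_rng(0)
demo_sample = rng.lognormal(mean=0.0, sigma=1.0, size=20000)
print(ks_distance_to_normal(np.log(demo_sample)))  # small: ln x is normal
print(ks_distance_to_normal(demo_sample))          # large: x itself is not
```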

Further evidence that the algorithm is constrained by the measurements can be found in Fig. 4, in which the cloud optical depth *τ*_{mod} modeled from the retrieved values is compared with the MODIS-observed value *τ*_{obs}. The occasional occurrence of MODIS optical depths of *τ*_{obs} = 100, the maximum value permitted by the MODIS retrieval algorithm, is evident from this figure. There also appears to be some deviation from the 1:1 line at high values of *τ*, but in these cases the error of *τ* reported by MODIS is usually very large, giving our algorithm more freedom to choose the appropriate value within the margin of error. Furthermore, although we plot the logarithm of the joint distribution to show where the outliers are, using the logarithm also understates the dominance of small *τ* values in the joint distribution. In fact, over 97% of the distribution weight is in the region with *τ*_{mod} < 30 and *τ*_{obs} < 30, where the correspondence is almost exactly 1:1. The most likely explanation for the systematic difference at high *τ* is the inadequate handling of drizzle, some of which may remain despite the attempts to filter out precipitation. The algorithm would explain the high reflectivity with a large *r*_{g}, which in turn would reduce the optical depth. Improved treatment of drizzle in a later version of the algorithm is expected to alleviate this issue.

### b. Comparison with MODIS LWP

In addition to examining improvements over the current *CloudSat* CWC retrieval, it is also instructive to compare the LWP retrievals with those obtained using only the MODIS instrument. One can find these values in the MODIS collection-5.1 cloud product. This comparison is shown in Fig. 5.

Two different comparisons were made because of two issues with the MODIS data. First, the MODIS cloud-phase classification sometimes disagrees with the *CloudSat*/*CALIPSO* cloud-phase product. Second, even when the MODIS product classifies the cloud as liquid, retrievals sometimes yield the maximum permitted values of optical depth (*τ* = 100) and cloud effective radius (*r*_{e} = 30 *μ*m). Figure 5a shows the joint distribution of our retrieved LWP and the MODIS LWP with these data points removed, and Fig. 5b displays the data for all data points at which our CWC algorithm made a valid retrieval. The mean curves (gray lines in Fig. 5) show that, in both cases, our algorithm produces slightly lower LWP values than does the corresponding MODIS LWP retrieval. This difference is more pronounced in the comparison in which the questionable MODIS data points are included. In Fig. 5a, our algorithm consistently produces LWPs that are roughly 25% smaller on average, and Fig. 5b shows a relative difference that is also approximately 25% for MODIS LWP smaller than 0.2 kg m^{−2} but increases sharply at higher LWPs. We remark that the mean value lines in Fig. 5 are affected by the ordering of the variables: had we instead taken the mean of the MODIS LWP for each *CloudSat*-retrieved LWP, the path of the lines would be different. The correlation coefficients shown in Fig. 5 are independent of the ordering, however, and indicate the same result: the correlation degrades when the questionable data points are added to the comparison.

The comparisons of the retrieved PIA with the observed PIA do not show an offset on the order of 25%, suggesting that the MODIS LWP is overestimated. Because our retrieval is constrained by the MODIS optical depth, this implies an overestimate of the cloud effective radius by the MODIS algorithm. There is, in fact, strong evidence that MODIS overestimates the effective radius when using the 2.1-*μ*m-wavelength retrieval as a result of spatial heterogeneity effects (Zhang and Platnick 2011; Zhang et al. 2012). Painemal and Zuidema (2011) also showed that MODIS effective radius overestimates in situ observations in marine stratocumulus by 15%–20%. We experimented with deriving the LWP instead from the effective radius retrieved from the MODIS 3.7-*μ*m channel (not shown); this reduces the bias, but only by approximately 2%, which suggests that the inhomogeneity does not wholly explain the difference.

It is debatable whether Fig. 5a or Fig. 5b should be considered the more appropriate comparison. A misclassification of the phase by MODIS should not affect the retrieved *τ* severely since *τ* is an optical variable that is not strongly dependent on the cloud phase assumed in the MODIS retrieval algorithm, varying only because the asymmetry factors of ice and liquid particles are different. This means that if a phase disagreement is due to an incorrect classification by MODIS, the optical depth can still likely be used in our retrieval. In contrast, the misclassification would affect the MODIS LWP, since it is a microphysical variable that is derived by using phase-dependent assumptions. A phase misclassification by the *CloudSat*/*CALIPSO* product would affect our retrieval, but the favorable comparison of the PIA values in section 4b suggests that there is not a large influence on the data from such errors. According to Marchant et al. (2016), the MODIS cloud-phase classification algorithm has been rewritten for MODIS Collection 6, with substantial improvements. We plan to make the transition to Collection 6 in the future, which may improve the comparison shown in Fig. 5b. Our retrieval uses only the cloud optical depth, however, which is expected to be largely unaffected by the revision from Collection 5.1 to Collection 6.

## 6. Conclusions

In this paper, we have presented a revised method to retrieve the cloud liquid water content from the combined observations of *CloudSat* radar reflectivity and *Aqua* MODIS cloud optical depth. We found that the current *CloudSat* cloud water content product (2B-CWC-RVOD) is underconstrained and overly dependent on a priori distributions. We addressed this problem by returning to and refining the algorithm of Austin and Stephens (2001). We performed the retrieval on the 2007–10 dataset of collocated *CloudSat* and MODIS observations.

The retrieval algorithm is validated by comparing the modeled and observed microwave path-integrated attenuation. The PIA modeled from the retrieved product compares well, on average, to that obtained using the surface reference technique. This indicates that the model accurately determines the column liquid water path, which is nearly linearly related to the PIA. Comparisons with the current *CloudSat* cloud water content product (2B-CWC-RVOD) and with the MODIS LWP suggest that the new algorithm is an improvement over the current *CloudSat* product and that its results also differ from the MODIS LWP.
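
The near-linear relation between the PIA and the liquid water path can be sketched as follows, assuming the Rayleigh regime, in which absorption by cloud droplets is proportional to the liquid water content. The symbols $\kappa$ (mass absorption coefficient of liquid water at 94 GHz), $T$ (temperature), and $\bar{\kappa}$ (its column-mean value), as well as the factor of 2 for the two-way radar path, are assumptions introduced here for illustration.

$$
\mathrm{PIA} = 2\int_0^{\infty} \kappa\big(T(z)\big)\,\mathrm{LWC}(z)\,dz
\;\approx\; 2\,\bar{\kappa}\,\mathrm{LWP},
\qquad
\mathrm{LWP} = \int_0^{\infty} \mathrm{LWC}(z)\,dz .
$$

In this sketch, the relation departs from strict linearity only through the temperature dependence of $\kappa$ along the column.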

Because the algorithm that we have presented here performs well with nonprecipitating clouds, it can be extended to a more general algorithm for multi-instrument cloud and precipitation retrievals. Further developments of the algorithm are planned to add precipitating water, ice clouds, and snow into the retrieval framework. The retrieval framework, which uses the optimal-estimation method, can also be expanded in a fairly straightforward fashion to include additional instruments, such as additional radar frequencies, lidars, and microwave radiometers. The accuracy of retrievals of other types of hydrometeors can also be validated using the PIA-based technique presented in this work, with the likely exception of ice clouds, which generally do not attenuate 94-GHz radiation appreciably. For ice clouds, various methods that are methodologically compatible with our approach already exist (e.g., Austin et al. 2009; Delanoë and Hogan 2010).

The new CWC algorithm is generic and can be applied to other observations besides those from the A-Train. For example, the Earth Clouds, Aerosol and Radiation Explorer (EarthCARE; Illingworth et al. 2015), a collaboration between the European Space Agency and the Japan Aerospace Exploration Agency, is scheduled to launch in 2018. The EarthCARE instrumentation includes a 94-GHz radar, a cloud–aerosol lidar, and a multispectral imager. Thus, the algorithm presented here is directly applicable to EarthCARE data, either as a whole or as potential improvements to the retrieval algorithms currently under development. It can also be used for ground-based measurements that provide the necessary data.

We show the mean (excluding zero values) retrieved LWP in Fig. 6. While this figure demonstrates that the algorithm can operate over all regions of Earth, it also raises an important caveat regarding the data at the current stage. We adopted fairly strict data-filtering criteria to ensure that the algorithm was, in fact, operating on columns of cloud liquid water. These criteria should eliminate precipitation and ice clouds from the validation studies presented in this paper. However, the filters also introduce a climatological sampling bias, since the data points do not include cloud water in the presence of precipitation, overlaid by ice clouds, or in mixed-phase clouds. For example, increasing the reflectivity threshold to 0 dB*Z* increases the total number of samples by 26%. As such, the retrieved data are, at their current stage, suited for climatological studies only when they are compared with data to which the same filtering criteria are applied. For model–observation comparisons, such filters can be implemented by using a radar simulator to model the reflectivity values. Data from our retrievals can also be valuable for comparisons in specific regions, such as those dominated by marine stratocumulus, where ice and precipitation rarely occur. Furthermore, even when the maximum reflectivity limit is raised from −15 to 0 dB*Z*, the average error of the algorithm is modest for all but the largest liquid water paths. Thus, the accuracy of the algorithm may also be acceptable in the presence of light precipitation. Our near-term development goal is to further improve the ability of the algorithm to function in the presence of precipitation, thus permitting the use of the algorithm as the basis of a global vertically resolved CWC climatological description.
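
The effect of the maximum-reflectivity filter on the sample count can be illustrated with a minimal Python sketch. It covers only the reflectivity criterion (not the phase or other filters), and the function names, the data layout (one list of dB*Z* values per column), and the example profiles are all hypothetical; this is not the processing code used for the retrievals.

```python
from typing import List, Sequence


def passes_reflectivity_filter(profile_dbz: Sequence[float],
                               max_dbz: float = -15.0) -> bool:
    """Return True if the column is non-empty and no radar bin exceeds max_dbz."""
    return len(profile_dbz) > 0 and max(profile_dbz) <= max_dbz


def select_columns(profiles: Sequence[Sequence[float]],
                   max_dbz: float = -15.0) -> List[Sequence[float]]:
    """Keep only the columns that pass the maximum-reflectivity filter."""
    return [p for p in profiles if passes_reflectivity_filter(p, max_dbz)]


def relative_sample_increase(profiles: Sequence[Sequence[float]],
                             strict_dbz: float = -15.0,
                             relaxed_dbz: float = 0.0) -> float:
    """Fractional growth in sample count when the threshold is relaxed."""
    n_strict = len(select_columns(profiles, strict_dbz))
    n_relaxed = len(select_columns(profiles, relaxed_dbz))
    if n_strict == 0:
        raise ValueError("no columns pass the strict threshold")
    return (n_relaxed - n_strict) / n_strict


# Synthetic reflectivity profiles (dBZ), one list per column; illustrative only.
example_profiles = [
    [-30.0, -20.0],   # passes both thresholds
    [-25.0, -10.0],   # passes only the relaxed threshold
    [-18.0, 5.0],     # rejected by both thresholds
    [-22.0, -16.0],   # passes both thresholds
]
```

Applying the same function to simulated reflectivities from a radar simulator would be one way to impose identical sampling on model output before a comparison.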

## Acknowledgments

We are grateful to Simone Tanelli and Joseph Hardin for valuable discussions. The research of JL, ML, and GS was carried out at the Jet Propulsion Laboratory, California Institute of Technology, under contract with NASA and was supported by the *CloudSat* project.

## REFERENCES

Austin, R. T., and G. L. Stephens, 2001: Retrieval of stratus cloud microphysical parameters using millimeter-wave radar and visible optical depth in preparation for *CloudSat*: 1. Algorithm formulation. *J. Geophys. Res.*, **106**, 28 233–28 242, doi:10.1029/2000JD000293.

Austin, R. T., A. J. Heymsfield, and G. L. Stephens, 2009: Retrieval of ice cloud microphysical parameters using the CloudSat millimeter-wave radar and temperature. *J. Geophys. Res.*, **114**, D00A23, doi:10.1029/2008JD010049.

Bennartz, R., 2007: Global assessment of marine boundary layer cloud droplet number concentration from satellite. *J. Geophys. Res.*, **112**, D02201, doi:10.1029/2006JD007547.

Biondini, R., 1976: Cloud motion and rainfall statistics. *J. Appl. Meteor.*, **15**, 205–224, doi:10.1175/1520-0450(1976)015<0205:CMARS>2.0.CO;2.

Cho, H.-M., and Coauthors, 2015: Frequency and causes of failed MODIS cloud property retrievals for liquid phase clouds over global oceans. *J. Geophys. Res. Atmos.*, **120**, 4132–4154, doi:10.1002/2015JD023161.

Christensen, M. W., G. L. Stephens, and M. D. Lebsock, 2013: Exposing biases in retrieved low cloud properties from CloudSat: A guide for evaluating observations and climate data. *J. Geophys. Res. Atmos.*, **118**, 12 120–12 131, doi:10.1002/2013JD020224.

Dekking, F. M., C. Kraaikamp, H. P. Lopuhaä, and L. E. Meester, 2005: *A Modern Introduction to Probability and Statistics: Understanding Why and How.* Springer, 488 pp.

Delanoë, J., and R. J. Hogan, 2010: Combined *CloudSat*–*CALIPSO*–MODIS retrievals of the properties of ice clouds. *J. Geophys. Res.*, **115**, D00H29, doi:10.1029/2009JD012346.

Flato, G., and Coauthors, 2013: Evaluation of climate models. *Climate Change 2013: The Physical Science Basis*, T. F. Stocker et al., Eds., Cambridge University Press, 741–866.

Greenwald, T. J., 2009: A 2 year comparison of AMSR-E and MODIS cloud liquid water path observations. *Geophys. Res. Lett.*, **36**, L20805, doi:10.1029/2009GL040394.

Haynes, J. M., T. S. L’Ecuyer, G. L. Stephens, S. D. Miller, C. Mitrescu, N. B. Wood, and S. Tanelli, 2009: Rainfall retrieval over the ocean with spaceborne W-band radar. *J. Geophys. Res.*, **114**, D00A22, doi:10.1029/2008JD009973.

Hogan, R., and A. Battaglia, 2008: Fast lidar and radar multiple-scattering models. Part II: Wide-angle scattering using the time-dependent two-stream approximation. *J. Atmos. Sci.*, **65**, 3636–3651, doi:10.1175/2008JAS2643.1.

Houze, R. A., Jr., and C.-P. Cheng, 1977: Radar characteristics of tropical convection observed during GATE: Mean properties and trends over the summer season. *Mon. Wea. Rev.*, **105**, 964–980, doi:10.1175/1520-0493(1977)105<0964:RCOTCO>2.0.CO;2.

Hu, Y., and Coauthors, 2009: *CALIPSO*/CALIOP cloud phase discrimination algorithm. *J. Atmos. Oceanic Technol.*, **26**, 2293–2309, doi:10.1175/2009JTECHA1280.1.

Illingworth, A. J., and Coauthors, 2015: The EarthCARE satellite: The next step forward in global measurements of clouds, aerosols, precipitation, and radiation. *Bull. Amer. Meteor. Soc.*, **96**, 1311–1332, doi:10.1175/BAMS-D-12-00227.1.

Kedem, B., and L. Chiu, 1987: On the lognormality of rain rate. *Proc. Natl. Acad. Sci. USA*, **84**, 901–905, doi:10.1073/pnas.84.4.901.

King, M. D., Y. J. Kaufman, W. P. Menzel, and D. Tanre, 1992: Remote sensing of cloud, aerosol, and water vapor properties from the Moderate Resolution Imaging Spectrometer (MODIS). *IEEE Trans. Geosci. Remote Sens.*, **30**, 2–27, doi:10.1109/36.124212.

Lebsock, M., and H. Su, 2014: Application of active spaceborne remote sensing for understanding biases between passive cloud water path retrievals. *J. Geophys. Res. Atmos.*, **119**, 8962–8979, doi:10.1002/2014JD021568.

Lebsock, M., T. S. L’Ecuyer, and G. L. Stephens, 2011: Detecting the ratio of rain and cloud water in low-latitude shallow marine clouds. *J. Appl. Meteor. Climatol.*, **50**, 419–432, doi:10.1175/2010JAMC2494.1.

López, R. E., 1977: The lognormal distribution and cumulus cloud populations. *Mon. Wea. Rev.*, **105**, 865–872, doi:10.1175/1520-0493(1977)105<0865:TLDACC>2.0.CO;2.

Mace, G. G., and Q. Zhang, 2014: The *CloudSat* radar-lidar geometrical profile product (RL-GeoProf): Updates, improvements, and selected results. *J. Geophys. Res. Atmos.*, **119**, 9441–9462, doi:10.1002/2013JD021374.

Marchand, R., G. G. Mace, T. Ackerman, and G. Stephens, 2008: Hydrometeor detection using *Cloudsat*—An Earth-orbiting 94-GHz cloud radar. *J. Atmos. Oceanic Technol.*, **25**, 519–533, doi:10.1175/2007JTECHA1006.1.

Marchant, B., S. Platnick, K. Meyer, G. T. Arnold, and J. Riedi, 2016: MODIS Collection 6 shortwave-derived cloud phase classification algorithm and comparisons with CALIOP. *Atmos. Meas. Tech.*, **9**, 1587–1599, doi:10.5194/amt-9-1587-2016.

Matheou, G., and D. Chung, 2014: Large-eddy simulation of stratified turbulence. Part II: Application of the stretched-vortex model to the atmospheric boundary layer. *J. Atmos. Sci.*, **71**, 4439–4460, doi:10.1175/JAS-D-13-0306.1.

Matheou, G., D. Chung, L. Nuijens, B. Stevens, and J. Teixeira, 2011: On the fidelity of large-eddy simulation of shallow precipitating cumulus convection. *Mon. Wea. Rev.*, **139**, 2918–2939, doi:10.1175/2011MWR3599.1.

Miles, N. L., J. Verlinde, and E. E. Clothiaux, 2000: Cloud droplet size distributions in low-level stratiform clouds. *J. Atmos. Sci.*, **57**, 295–311, doi:10.1175/1520-0469(2000)057<0295:CDSDIL>2.0.CO;2.

Painemal, D., and P. Zuidema, 2011: Assessment of MODIS cloud effective radius and optical thickness retrievals over the southeast Pacific with VOCALS-REx in situ measurements. *J. Geophys. Res.*, **116**, D24206, doi:10.1029/2011JD016155.

Pal, N., C. Jin, and W. K. Lim, 2006: *Handbook of Exponential and Related Distributions for Engineers and Scientists.* Chapman and Hall/CRC, 339 pp.

Platnick, S., M. King, S. Ackerman, W. Menzel, B. Baum, J. Riedi, and R. Frey, 2003: The MODIS cloud products: Algorithms and examples from *Terra*. *IEEE Trans. Geosci. Remote Sens.*, **41**, 459–473, doi:10.1109/TGRS.2002.808301.

Rauber, R. M., and Coauthors, 2007: Rain in shallow cumulus over the ocean: The RICO campaign. *Bull. Amer. Meteor. Soc.*, **88**, 1912–1928, doi:10.1175/BAMS-88-12-1912.

Rodgers, C. D., 2000: *Inverse Methods for Atmospheric Sounding—Theory and Practice*. Series on Atmospheric, Oceanic and Planetary Physics, Vol. 2, World Scientific, 256 pp., doi:10.1142/9789812813718.

Rosenkranz, P. W., 2015: A model for the complex dielectric constant of supercooled liquid water at microwave frequencies. *IEEE Trans. Geosci. Remote Sens.*, **53**, 1387–1393, doi:10.1109/TGRS.2014.2339015.

Sassen, K., Z. Wang, and D. Liu, 2008: Global distribution of cirrus clouds from *CloudSat*/*Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observations* (*CALIPSO*) measurements. *J. Geophys. Res.*, **113**, D00A12, doi:10.1029/2008JD009972.

Seethala, C., and Á. Horváth, 2010: Global assessment of AMSR-E and MODIS cloud liquid water path retrievals in warm oceanic clouds. *J. Geophys. Res.*, **115**, D13202, doi:10.1029/2009JD012662.

Stephens, G. L., 2005: Cloud feedbacks in the climate system: A critical review. *J. Climate*, **18**, 237–273, doi:10.1175/JCLI-3243.1.

Stephens, G. L., and Coauthors, 2008: *CloudSat* mission: Performance and early science after the first year of operation. *J. Geophys. Res.*, **113**, D00A18, doi:10.1029/2008JD009982.

Stephens, G. L., and Coauthors, 2010: Dreary state of precipitation in global models. *J. Geophys. Res.*, **115**, D24211, doi:10.1029/2010JD014532.

Suzuki, K., T. Nakajima, T. Y. Nakajima, and A. P. Khain, 2010: A study of microphysical mechanisms for correlation patterns between droplet radius and optical thickness of warm clouds with a spectral bin microphysics cloud model. *J. Atmos. Sci.*, **67**, 1126–1141, doi:10.1175/2009JAS3283.1.

Tanelli, S., S. L. Durden, E. Im, K. S. Pak, D. G. Reinke, P. Partain, J. M. Haynes, and R. T. Marchand, 2008: *CloudSat*’s cloud profiling radar after two years in orbit: Performance, calibration, and processing. *IEEE Trans. Geosci. Remote Sens.*, **46**, 3560–3573, doi:10.1109/TGRS.2008.2002030.

van de Hulst, H. C., 1957: *Light Scattering by Small Particles*. J. Wiley and Sons, 470 pp.

van Zanten, M. C., and Coauthors, 2011: Controls on precipitation and cloudiness in simulations of trade-wind cumulus as observed during RICO. *J. Adv. Model. Earth Syst.*, **3**, M06001, doi:10.1029/2011MS000056.

Winker, D. M., M. A. Vaughan, A. Omar, Y. Hu, K. A. Powell, Z. Liu, W. H. Hunt, and S. A. Young, 2009: Overview of the *CALIPSO* mission and CALIOP data processing algorithms. *J. Atmos. Oceanic Technol.*, **26**, 2310–2323, doi:10.1175/2009JTECHA1281.1.

Wood, N., 2008: Level 2B radar-visible optical depth cloud water content (2B-CWC-RVOD) process description document. NASA Tech. Rep., 26 pp. [Available online at http://www.cloudsat.cira.colostate.edu/sites/default/files/products/files/2B-CWC-RVOD_PDICD.P_R04.20081023.pdf.]

Zhang, Z., and S. Platnick, 2011: An assessment of differences between cloud effective particle radius retrievals for marine water clouds from three MODIS spectral bands. *J. Geophys. Res.*, **116**, D20215, doi:10.1029/2011JD016216.

Zhang, Z., A. S. Ackerman, G. Feingold, S. Platnick, R. Pincus, and H. Xue, 2012: Effects of cloud horizontal inhomogeneity and drizzle on remote sensing of cloud droplet effective radius: Case studies based on large-eddy simulations. *J. Geophys. Res.*, **117**, D19208, doi:10.1029/2012JD017655.