## Abstract

This paper describes the daytime cloud optical and microphysical properties (DCOMP) retrieval for the Pathfinder Atmosphere’s Extended (PATMOS-x) climate dataset. Within PATMOS-x, DCOMP is applied to observations from the Advanced Very High Resolution Radiometer and employs the standard bispectral approach to estimate cloud optical depth and particle size. The retrievals are performed within the optimal estimation framework. Atmospheric-correction and forward-model parameters, such as surface albedo and gaseous absorber amounts, are obtained from numerical weather prediction reanalysis data and other climate datasets. DCOMP is set up to run on sensors with similar channel settings and has been successfully exercised on most current meteorological imagers. This quality makes DCOMP particularly valuable for climate research. Comparisons with the Moderate Resolution Imaging Spectroradiometer (MODIS) collection-5 dataset are used to estimate the performance of DCOMP.

## 1. Introduction

The Pathfinder Atmosphere’s Extended (PATMOS-x) dataset is an atmospheric climate dataset generated by the National Atmospheric and Ocean Administration (NOAA). It derives atmospheric and surface parameters from nearly 30 years of observations taken by NOAA’s Advanced Very High Resolution Radiometer (AVHRR), located on the various Polar Operational Environmental Satellite spacecraft. AVHRR sensors are also included on the European Organisation for the Exploitation of Meteorological Satellites (EUMETSAT) MetOp series slated to continue until 2020. Therefore, the AVHRR record should span over 40 years when completed, making it one of the most valuable satellite-based data records for cloud climatological descriptions. One of the more critical components of PATMOS-x is the daytime cloud optical and microphysical properties (DCOMP) algorithm, which generates estimates of cloud optical thickness, cloud effective radius, and ice/water path during daylight conditions.

Cloud optical thickness and the particle size distribution can be used to describe almost completely the radiative properties of a cloud in the solar spectral region. It is therefore difficult to overestimate the importance of their spatial and temporal variations to the studies of climate change and to the evaluation of climate models. Particle size distribution can adequately be expressed as the cloud effective radius, which is defined by the ratio of the integral over all droplet volumes to the integral over all droplet surface areas.

The use of visible and near-infrared observations to estimate cloud optical thickness and effective radius goes back several decades (Hansen and Pollack 1970). These techniques were demonstrated using aircraft measurements (Foot 1988; Twomey and Cocks 1982). Satellite retrievals of these properties for limited cases began not long after these measurements were available from satellites (Arking and Childs 1985; Nakajima and King 1990; Platnick and Valero 1995). The first global survey of these properties over a limited time range was provided by Han et al. (1994). Heidinger (2003) describes the initial application of these techniques to a long-term global AVHRR radiance dataset. Roebeling et al. (2006) derived these properties from a geostationary imager for climate applications. Descriptions about the application of these techniques to Moderate Resolution Imaging Spectroradiometer (MODIS) observations are given by Platnick et al. (2003) and Minnis et al. (2011a). DCOMP builds upon the long heritage of this retrieval technique by attempting to apply it to multiple sensors in a physically consistent manner to generate cloud climate data records that are useful to the community.

DCOMP was developed with support from the NOAA Geostationary Operational Environmental Satellite—R Series (GOES-R) Algorithm Working Group (AWG) to be the official algorithm for the Advanced Baseline Imager (ABI). Descriptive technical details for the DCOMP algorithm for GOES-ABI are provided in the corresponding algorithm technical basis document (ATBD; Walther et al. 2011). During this project, the algorithm was designed to run on the Spinning Enhanced Visible and Infrared Imager (SEVIRI) on Meteosat Second Generation satellites, MODIS, and the current GOES imagers to be tested with proxy data. The algorithms differ only in spectral channels; the underlying assumptions, forward models, and retrieval techniques are invariant. This quality can be particularly valuable for intercomparison studies or for the generation of climatological time series and will be shown in section 4. The spectral information provided by the AVHRR is limited relative to that of other sensors, such as the GOES-R ABI, but the AVHRR instruments do provide the needed spectral observations to run the DCOMP algorithm.

DCOMP is written as a generic algorithm tool that is applicable to all sensors that measure any one of the three channel pairs (0.6–1.6, 0.6–2.2, and 0.6–3.75 *μ*m). The sensors that have been used to date with DCOMP are AVHRR, SEVIRI, GOES, MODIS, and the Advanced Along-Track Scanning Radiometer. Being based on optimal estimation, DCOMP makes use of the uncertainty estimates of the input parameters and the forward model and then propagates these uncertainties to estimates for the retrieved parameters. It is written modularly to allow for easy exchange of algorithm parts as warranted for improved performance or for sensitivity studies. In addition, the overall philosophy of DCOMP is to make full use of the best available auxiliary data as opposed to the use of static climatological estimates. For example, DCOMP uses profiles of ozone (O_{3}) and water vapor (H_{2}O) taken from numerical weather prediction (NWP) datasets and employs high-spatial-resolution estimates of surface reflectance and surface emissivity. The sources of auxiliary data are listed in Table 1. DCOMP also incorporates state-of-the-art scattering models during the generation of the lookup tables described below.

The goal of this paper is to demonstrate the specific application of DCOMP to the AVHRR and to validate its performance relative to an accepted standard. This paper will also point out the strengths and weaknesses of these properties derived from the AVHRR in the context of PATMOS-x and thereby facilitate their proper usage as climate data records.

## 2. Forward model

The retrieval is based on simultaneous measurements of a visible channel with conservative scattering and a shortwave infrared (IR) weakly absorbing channel. The method is based on previous work (Nakajima and King 1990; King 1987) and has been demonstrated in a number of remote sensing studies (e.g., Nakajima and Nakajima 1995; Roebeling et al. 2006). In this section, we will briefly outline the techniques and approximations used in simulating the reflectance. This process is referred to as the forward model. Following sections will show how this forward model is used in retrieval and will demonstrate the performance of DCOMP in comparison with other datasets. Full technical details are provided in the DCOMP ATBD (Walther et al. 2011).

### a. Basic considerations

The reflectance *R*(*θ*_{0}, *θ*, Δ*ϑ*) is a function of incoming and outgoing zenith angles *θ*_{0}, *θ*, and the relative azimuth angle Δ*ϑ*. It is defined as the ratio of reflected light radiance *L*(*θ*_{0}, *θ*, Δ*ϑ*) to the incoming radiance represented in the atmosphere by spectral solar irradiance at the top of atmosphere *F*_{0}(*θ*_{0}) multiplied by *μ*_{0} (the cosine of solar zenith angle: cos*θ*_{0}):

With the addition of a lower reflecting surface, the channel reflectance for the case of a scattering cloud at the top of atmosphere is

where *R _{c}* is the cloud-reflectance function,

*R*

_{TOA}is the apparent reflectance at the satellite level,

*A*is the albedo of an underlying Lambertian surface,

*t*is the cloud-transmittance function in the downward and upward directions,

_{c}*τ*is the cloud optical thickness,

*r*is the effective radius, and

_{e}*S*is the spherical albedo of a cloud. This equation is valid only for fully cloud covered pixels. We neglect to write the angular dependencies for the remainder of this article for better readability. Cloud transmission for the incoming and outgoing path will be written as

*t*

_{c}_{,0}and

*t*, respectively.

_{c}Radiative transfer models are capable of computing the functions *R _{c}*(

*τ*,

*r*),

_{e}*t*(

_{c}*τ*,

*r*), and

_{e}*S*(

*τ*,

*r*) under the assumption of a plane-parallel homogeneous cloud. The surface albedo

_{e}*A*is assumed to be known. Thus, Eq. (2) has two unknowns and should be solvable when applied to a system of two sufficiently orthogonal observation channels.

Equation (2) neglects thermal source terms from the atmosphere and the surface and other atmospheric processes above and below the cloud, however. The latter include scattering processes by air molecules and aerosol, as well as gaseous absorption, and have to be corrected. The atmospheric correction is applied to the measured radiance and can be decoupled from the inversion scheme and performed for layers above and below the cloud separately if the cloud altitude is known. In essence, it adjusts the measured reflectance at the sensor above the top of the atmosphere to that which would be measured at the top of the cloud and adjusts the surface albedo to account for extinction below the cloud. After transforming *R*_{TOA} to the apparent reflectance at the top of cloud *R*_{TOC} and introducing effective surface albedo *A*_{υ}, we have

The left-hand side of this equation is the measured top-of-the-atmosphere reflectance corrected for above-cloud transmission and scattering. The atmospheric-correction scheme will be described in more detail in section 2b. The forward-model calculations do not have to consider atmospheric correction, and they simulate top-of-cloud reflectance with an extinction-free atmosphere between cloud and surface.

If DCOMP is applied to sensor channels around 3.8 *μ*m (such as AVHRR channel 3b, SEVIRI channel 4, GOES channel 2, or MODIS channel 20), terrestrial emission must be taken into account. The term *R*_{TOC} in Eq. (3) depicts the solar reflectance. For these channels, however, the measured radiance is a sum of backscattered solar radiation and a terrestrial-emission part.

The latter is the top-of-cloud radiance *I*_{TOC} that is emitted from the surface and atmosphere. This can be expressed as

where *I _{a}* is the radiance contribution from layers above the cloud height

*H*,

*I*

_{clr}is the clear-sky radiance emitted by the surface,

*ɛ*is the cloud emissivity,

_{c}*t*is the cloud transmission, and

_{c}*B*(

*T*) is the Planck function at the cloud-top temperature

_{c}*T*. Cloud emissivity

_{c}*ɛ*(

_{c}*τ*,

*r*) and cloud transmission

_{e}*t*(

_{c}*τ*,

*r*) are computed by the radiative transfer model and stored in lookup tables. They depend on the state vector so that they are optimized during the retrieval loop. The other unknown variables in Eq. (4) are obtained from auxiliary data as follows. The new National Centers for Environmental Prediction (NCEP) Climate Forecast System Reanalysis (CFSR; Kalnay et al. 1990; Saha et al. 2010; Sela 1980) provides realistic atmospheric profiles of temperature and water vapor on global regular 0.5° and 2.5° grids from 1979 onward. We employ a fast IR radiative transfer code, the pressure-layer fast algorithm for atmospheric transmittances (PFAAST; Hannon et al. 1996), with the input from these reanalyses, combined with surface emissivity values obtained from a MODIS-based database (Seemann et al. 2008), to compute assumed clear-sky radiance and transmission profiles.

_{e}Cloud-top temperature and cloud-top height are obtained from the AWG cloud-height retrieval algorithm (ACHA), which is also part of the PATMOS-x retrieval scheme, in which it runs in advance of DCOMP.

The terrestrial radiance *I* can subsequently be transformed into an equivalent reflectance value *R _{e}* as follows:

where *d* is the Earth–sun distance in astronomic units and *F*_{0} is the channel-specific solar constant normalized to the average Earth–sun distance using the Kurucz solar irradiance database (Kurucz 1995). The solar part of the measured reflectance is extracted by

Through including the terrestrial emissivity observed at top of cloud, Eq. (3) becomes an equation for each measurement channel:

This is in fact a general equation for any channel. For the range from visible to near infrared, the amount of radiance that comes from surface or atmospheric emission is negligible, and *R _{e,}*

_{TOC}becomes zero.

Now all formulations are prepared to proceed to the inversion problem, which aims to find the pair of cloud properties (*τ*, *r _{e}*) that satisfies Eq. (7). At the end of this process the retrieval will find a pair of the cloud properties (

*τ*,

*r*) that satisfy the equations. The problem is not analytically solvable because of the complicated nature of radiative transfer. The solution is retrieved by an iterative optimization process that will be explained in more detail in section 3.

_{e}### b. Atmospheric correction

Atmospheric correction in DCOMP is done in a two-level scheme, separated into above-cloud and below-cloud corrections. The first part handles all processes above the cloud to compute the reflectance that would be observable at the top of the cloud.

The reflectance measured at the sensor *R*_{TOA} is transformed to a cloud-top reflectance *R*_{TOC} as

where *t _{a}* is the two-way atmospheric transmission above the cloud and

*R*

_{SC}represents all single-scattered components from the atmosphere above the cloud, that is, including the photons that are scattered before or after reflection at the cloud top. It is only assumed to be nonzero for Rayleigh scattering in the visible channel around 0.6

*μ*m where we use the scheme introduced for MODIS by Wang and King (1997). We use a full-column atmosphere Rayleigh optical depth of 0.044. The actual Rayleigh optical depth of the air column above the cloud is then computed with cloud pressure, assuming that the Rayleigh optical depth scales linearly with pressure. Aerosol scattering is neglected in DCOMP, since the knowledge about quantity is very uncertain and the assumed impact for most cases is low. The transmission

*t*is calculated by considering scattering and absorbing processes in the visible channel and from absorption for both channels.

_{a}Atmospheric effects below cloud are taken into account by virtually decreasing the surface albedo. We assume that the impact from scattering extinction below the cloud is low. The virtual surface albedo parameter *A*_{υ} is defined from the surface albedo *A* as

where *t _{b}* is the two-way atmospheric transmission between the cloud and surface. Since we assume a diffuse radiation field below the cloud, we compute the transmission for the zenith path between cloud and surface (air mass is equal to 2). The surface albedo is taken from the MODIS white-sky albedo dataset (Moody et al. 2008). DCOMP uses the 16-day climatological values averaged from a 5-yr period (2000–04).

Significant contributors to the extinction include absorption by water vapor, stratospheric ozone, and other atmospheric gases and scattering in the visible channel by air molecules and aerosol. Water vapor and ozone exhibit considerable temporal and spatial variability in the atmosphere. Using a default value for the entire globe would lead to considerable error. Ozone amount can vary by ±50% (from about 200 to 350 Dobson units), and water vapor can vary by more than an order of magnitude. We use NCEP reanalysis data as a best estimate for ozone and water vapor profiles instead of climatological averages. This dataset provides profiles of amount of ozone and water vapor on a 2.5° grid.

Atmospheric transmission values for ozone and water vapor are computed by simplified algorithms that are based on forward simulations. Version 4 of the Moderate Resolution Atmospheric Radiance and Transmission Model and Code (MODTRAN v4; Berk et al. 1998) was used to compute regression coefficients of ozone and water vapor transmission as a function of the absorber amount *M*. The regression scheme for gas transmission *T* is expressed as

Table 2 provides the coefficients *a* for water vapor and ozone for four current sensors. The unit of *M* for ozone is the Dobson unit and the unit for water vapor is millimeters in a column. We included standard values of the amount of other gases in the simulations of water vapor transmission so that carbon dioxide, methane, and so on are also taken into account. The basis for the simulations was 1000 realistic atmospheric observations extracted from the Thermodynamic Initial Guess Retrieval database (Chevallier et al. 2000). Those data cover a wide range of atmospheric conditions. For instance, given the geometrical variations of DCOMP measurements, a high number of viewing and solar zenith angles were used in the MODTRAN simulations. As a result, we used 5000 different situations to obtain three coefficients from linear regression analysis.

The majority of the ozone is in the stratosphere and thus above the cloud. Vertical variability can therefore be neglected. Absorption by ozone occurs primarily in the visible channel. The uncertainty in the transmission is low. The source of the ozone absorber amount is the NWP results from the Global Forecast System model reanalysis (Sela 1980). This dataset is updated every 6 h and is interpolated to the time of observation. An uncertainty of less than 5% leads to an expected absolute error in transmission values for a thin ozone layer (at 280 Dobson units, corresponding to a transmission value of 0.98 for an airmass factor of 2) of ~0.01 and for a thick ozone layer of ~0.008.

Water vapor correction is more uncertain for two reasons. First, water vapor is much more variable in space and time. Second, because water vapor exists in the troposphere, we have to use cloud-height information together with an assumed vertical profile to extract the total water vapor above the cloud. Cloud height is obtained from ACHA, which runs before DCOMP in the common processing scheme (Heidinger et al. 2010). Water vapor transmission error may have an additional error that comes from the uncertainty in the cloud height.

The bottom row of Fig. 1 shows the transmission coefficients for water vapor and ozone for the example case. The true-color image and the cloud-top pressure and cloud-type results as retrieved by the PATMOS-x ACHA retrieval that are shown in the top row of Fig. 1 support the discussions. Eight of the 12 cloud types distinguished in the PATMOS-x dataset are present in the top-right panel. The four reddish and yellow color classes (overshooting: maroon, overlaying cirrus: red, cirrus: orange, and ice: yellow) depict cloud pixels for which DCOMP applies ice-phase tables. All others require the liquid-water-phase table. Water vapor transmission is mainly a function of cloud-top pressure (blues and greens are lower pressure) if the water vapor is homogenously distributed. In contrast to ozone transmission, uncertainty in cloud-top pressure leads to a higher uncertainty in water vapor transmission. Water vapor transmission is in a range from approximately 0.97 (blues) to 1 (reds) for the visible channel at 0.6 *μ*m and is between 0.8 (blues) and 1 (reds) in the shortwave IR channel at 3.75 *μ*m. Ozone transmission ranges from approximately 0.9 (blues) at the edge of the swath to 0.95 (greens) for minimal air mass in the center.

### c. Generation of lookup tables

The previous section describes how the cloud-reflectance and cloud-transmittance values are used in conjunction with atmospheric and surface terms to simulate the total atmospheric signals used in DCOMP. The actual forward model used during the DCOMP processing is based on lookup tables (LUTs) in which the cloud reflectance, cloud transmittance, cloud spherical albedo, and cloud emissivity are precomputed. These LUTs serve as the basis for the forward model during the inversion process. They were designed to cover the entire range of possible conditions as functions of microphysical properties and the geometrical constellation. Because the scattering characterizations of water and solid particles differ greatly, the LUTs are generated for ice- and water-phase clouds separately. DCOMP decides between the phases using the PATMOS-x cloud-type algorithm (Pavolonis et al. 2005).

A doubling–adding forward radiative transfer model (RTM) simulates the cloud-reflectance function *R _{c}*, the cloud transmission

*t*, and the cloud spherical albedo

_{c}*S*for a single-layered plane-parallel homogeneous cloud as a function of optical depth and effective radius, as well as geometrical constellation as expressed by solar and observation zenith and azimuth angles. It requires single-scattering phase functions for water and ice particles. Water particles were assumed to be spherical droplets at all wavelengths, and Mie scattering (see, e.g., Wiscombe 1980) was assumed for the inference of scattering and absorption properties.

For a given droplet size distribution and optical constants of water, a Mie code returns extinction and scattering coefficients and the scattering phase function, which describes the angular distribution of scattered light in a single-scattering event. The droplet size distribution is approximated by the modified gamma–Hansen function, which is determined by the effective radius and the dispersion *r _{a}* about the effective radius and normalization constant. The dispersion is set to 0.1. The size range of the distribution is limited in the computations here to 0–80

*μ*m.

The situation for ice clouds is more complex because the particles are generally not spherical. In fact, the assumption of spherical particle shapes for the ice phase leads to substantial errors (Mishchenko et al. 1996). Ice clouds are assumed to be composed of a mixture of habits consisting of droxtals (primarily for the smallest particles in a size distribution), hollow and solid columns, plates, 3D bullet rosettes, and aggregates. The hollow bullet rosettes also have significantly different scattering/absorption properties than those of solid bullet rosettes. The current distribution of habits as a function of particle size is prescribed by Baum et al. (2005).

Using LUTs for forward-model calculations causes an error due to interpolation assumptions, which would not be necessary with direct use of the radiative transfer model. The latter is not feasible, though, because it would considerably increase computation time. It is also clear that the more computations are stored in the LUTs the lower the interpolation error is but also the higher the computer memory usage is. We conducted an analysis in which LUTs of different sizes were compared with results directly from a radiative transfer model without any interpolation. In the end, we found an LUT design for which the maximal error due to linear interpolation is less than 1% in the outputs of the RTM for all four parameters (reflectance, spherical albedo, transmission, and emissivity). A consequence of this optimization is that the retrieval becomes less sensitive to the interpolation scheme. The resulting LUTs have 45 equally binned entries for sensor and solar zenith from 0° to 88°, 45 angles with finer spacing in the backward direction for relative azimuth difference (with 5° steps between 0° and 170° and 1° steps between 170° and 180°), 29 cloud-optical-depth entries from −0.6 to 2.2 evenly distributed in log_{10} space, and nine effective radii from 0.4 to 2.0 in log_{10} space as well. Log_{10} space was used for the LUT optical-depth and effective-radius dimensions because reflectance and transmittance are more linear in log_{10} space than in linear space. The increased linearity in log_{10} space reduced the number of optical depths and effective radii required.

## 3. Retrieval method

Optimal estimation (OE) has been proven to be an accurate inversion technique for deriving properties from satellite measurements. It employs prior knowledge of likely solutions and measurement error and forward-model error. A mathematical description of the OE technique can be found in Rodgers (2000). The OE techniques were widely used in connection with cloud algorithms (e.g., Sayer et al. 2011; Watts et al. 2011).

The basis of OE rests on the assumption of Gaussian error statistics for background and observation error estimates. This leads to the requirement of a preferable Gaussian distribution of the state vector **x**. Distributions of optical depth and effective radii are often skewed, however, with a high number of low values and a wide range of possible high values. To account for this, we transform the state-vector elements in logarithmic space with a decimal base for the inversion. The other benefit of using logarithmic transformation rather than linear spacing is that the variation with reflectance is more linear, which will lead to a more-Gaussian error distribution of the input vector.

Optimal estimation basically seeks to find a solution for the state vector **x** with the minimum average error over some classes of estimators. In DCOMP these estimators are the comparison of the measurement vector **y**, a two-element vector with visible and shortwave IR channel reflectance, with the output forward model *F*(**x**, **b**) and the comparison of **x** with prior knowledge **x**_{a} under consideration of their uncertainties. The OE for a nonlinear problem, such as we have in DCOMP, requires an iterative search for the solution. The OE aims to minimize the cost function *J*, accounting for measurement errors and prior knowledge, and is of the following form:

where **x**_{a} is the a priori state vector, **b** is the forward-model parameter vector, and and are the error covariance matrixes of the observation and the a priori, respectively. The forward-model parameter vector **b** is not retrieved, but its uncertainties have to be considered in the retrieval. The forward operator *F*(**x**, **b**) aims to calculate the observation vector **y** from an atmospheric state vector **x** and forward-model parameter vector **b** according to Eq. (7):

Although not retrieved, the forward-model parameters defined in **b** have an impact on the retrievals, and their uncertainties have to be taken into account. The forward-model vector elements in DCOMP are identified as the surface albedo, scattering properties, surface emissivity, and the water vapor profiles used to compute the terrestrial radiance.

Both the measurement and the prior knowledge are quantities with assumed uncertainties that are described by covariance matrices. If we consider the a priori assumption as a virtual measurement, then the relative weight of the uncertainty, on the one hand, of the a priori assumption and, on the other hand, of the measurement error together with forward-model error determines the solution. In the extreme case there is no trust in the measurements at all ( becomes very small), and the retrieval has its minimal cost value at the a priori values. In this case the retrieval has no information content at all and our confidence in the solution is as large as our a priori knowledge. In contrast, if we do not have any trustworthy a priori knowledge, the solution is reached when forward calculations and observation are in the range of the observation covariance.

A proper implementation of DCOMP requires meaningful estimates of a priori values of the possible state of the atmosphere **x**_{a} and their uncertainties described by . The latter is a matrix for which each diagonal element is the assumed error variance of each element **x**_{a}. These uncertainties are assumed to be independent from each other in DCOMP so that the off-diagonal elements can be set to zero.

The prior independent knowledge of cloud optical depth (COD) and cloud effective radius (REF) is limited. Because almost all estimates of COD and REF from other algorithms are based on similar approaches, there is no robust a priori value except an approximate range for all clouds over the globe. In general, it is assumed that water clouds have a maximum effective radius size of up to 40 *μ*m, with an average of about 10 *μ*m. Bigger liquid cloud particles would start to precipitate and are unrealistic in the atmosphere. Ice particles may have effective radii of up to 100 *μ*m, with a higher average than water. According to the findings of several publications (e.g., Han et al. 1994; Parol et al. 1991), we set the a priori values for the water effective radius to 10 *μ*m (1.0 in log_{10} space) and for ice to 18 *μ*m (1.3 in log_{10} space). The uncertainty of these assumptions is high. Therefore, we set the standard deviations for the a priori of COD and REF to 100%, which corresponds approximately to 1.0 in log_{10} space. The a priori for optical depth is set to 10 for both phases with an uncertainty of 1.0 in log_{10} space (approximately 100%). These settings account for the weak prior knowledge of these values.

We did not explicitly account for errors that occur when our assumption of a single-layer plane-parallel cloud that fully spans the pixel is violated. These errors include the presence of 3D effects and multilayer and/or mixed-phase clouds. It is also important to note that these error sources also affect many of the existing climatological databases on which we based our a priori values. We accounted for these effects empirically by increasing our uncertainty of the a priori constraints beyond what the climatological datasets tell us. We did not attempt to account for these errors in our forward-model uncertainty estimates, nor can we be certain of their Gaussian behavior. Therefore, our error estimates are only valid in the context of our assumed cloud.

As described above, the observation vector is represented by the reflectance values at the top of cloud. Thus, the uncertainty in **y** is determined by the uncertainty of the measurement at the satellite level itself, with consideration of channel noise and calibration error, and the uncertainty of the atmospheric correction above the cloud. The uncertainties of each atmospheric-correction component are set in an error covariance matrix for atmospheric-correction components . This matrix is added to the error covariance matrix of the observation vector as

where denotes the error covariance of the measurement including channel noise and calibration. The dimensions of the error covariance matrix are *n* × *n*, where *n* is equal to the number of components of atmospheric correction considered. It has nonzero values only on the main diagonal and consists of the variance of the assumed uncertainty of water vapor amount, ozone amount, and cloud-top height. The kernel operator describes the sensitivity of **y** to changes in the components of atmospheric correction and has to be determined for every pixel. The kernel has the dimensions *m* × *n*, with *m* equal to the number of observation channels. The resulting errors in both channels are correlated because uncertainty in water vapor amount has an impact on transmission in both channels. Values of are set as relative error (in percent).

According to (Rodgers 2000), the forward-model errors can be added to the observation error covariance as

where is the error covariance of the measurement, is the observation covariance employed during the iteration process, and is the error covariance matrix of the forward model. The kernel describes the sensitivity of **y** to changes in the forward-model parameters. The sensitivity of the forward-model parameters may differ for each atmospheric state so that varies at each iteration during the optimization.

The error covariance matrix of the forward-model parameters is set by the uncertainty of the elements of **b**. It is clear that the uncertainty of the surface characterizations is highly variable globally and seasonally. As an example, the surface albedo of a homogenous surface without considerable vegetation is supposed to be better known than the surface albedo of a fresh-snow ground layer. We estimated the quality of our surface knowledge by spatial and temporal homogeneity tests of the surface data. In addition, we set a high value of uncertainty for snow and sea ice.

The matrix determines the sensitivity of the forward model. The elements include the partial derivatives of the forward model with respect to the forward-model parameters.

The corresponding elements of regarding the surface albedo uncertainty are, according Eq. (12),

This equation shows that the impact of surface-albedo uncertainty on the retrieval is higher the thinner the cloud is. The partial derivative of the emissivity element ∂*F*(**x**)/∂*ɛ* is not calculated analytically but is found by applying the PFAAST model with varying emissivity. The uncertainty and values have to be computed at each iteration step since state vector **x** varies. We estimated these uncertainties in **y** caused by the radiative transfer simulations to be 5% of the computed reflectance and added it directly to .

We already mentioned forward-model errors, which cannot be described in the same manner. The estimate of is only correct if the forward-model error is relatively small. Figure 2 illustrates the observation uncertainties for the same example case as in Fig. 1. The top-row images show the elements of the observation error covariance matrix determined in the last iteration before the solution was found. The left image in Fig. 2 shows expected error in reflectance values in the visible channel (VIS). As stated before, the error covariance matrix includes forward-model errors as well. In general we assume that the forward-model error for ice clouds is higher because of higher uncertainty in ice-habit distribution and hence in phase function. This is even clearer in the shortwave (near-) IR (NIR) channel, for which the atmospheric-correction error is small. The correlation is particularly high for low clouds, for which it is more likely that water vapor has an impact above cloud transmission in both channels, and the other factors are low.

Once the retrieval is completed, we use the error covariance of the state to determine the uncertainty of the solution **x**. The uncertainty in both products may be correlated, but this is not necessarily the case. As an example, the visible channel has low information over very bright snow or sea ice surfaces. This leads to a high uncertainty of cloud optical depth but not necessarily to higher uncertainty in the REF product. The correlations are stored in the off-diagonal elements of .

The bottom row of Fig. 2 shows the solution error covariance matrix results, similar to those of . The corresponding maps of COD and REF are shown in Figs. 5 and 6, which are described in section 5 below. We transformed the diagonal elements, which are in log_{10} space during retrieval, into measures of relative error for better interpretability. The expected error in COD reaches values of more than 50% for thick ice clouds. Liquid clouds have an error from about 10% over large areas. The error in REF shows large differences between thick ice clouds with greater droplet size (top-left image in Fig. 6, described below). The solution uncertainties are weakly correlated.

## 4. Uncertainty considerations

This section seeks to make use of the uncertainty propagation of DCOMP to understand the importance of individual components of the retrieval. We use in the following also the term “error” as a synonym of uncertainty. In this sense the error should not be seen as a measure of the deviation from the truth but as an expectation of the value range.

The solution error covariance values provide measures of the total error budget for both products of the retrieval. Whereas this represents the estimated error in physical units of the parameters, we seek in this section to identify the contributions of individual retrieval components to the total error variance.

Figure 3 illustrates the transformation of the observation vector into the retrieved state vector, with corresponding error estimates. The solid black lines show the channel reflectance as a function of COD (upright isolines) and REF (horizontal isolines). The errors are illustrated as ellipsoids, which increase in size at each step. First, the reflectance at the top of the atmosphere is transferred to reflectance at the top of the cloud, with corresponding uncertainty ellipses used in measurement error covariance matrices for two examples.

Figure 4 shows the relative error of COD and REF as functions of the properties themselves separated for ice (solid lines) and water (dashed lines) phases. Liquid-phase cloud properties have in general lower uncertainty, mainly as a result of less uncertainty in phase function in the forward-model calculations. The uncertainty in optical depth is lowest (under 5%) for midthick clouds between 5 and 20. Thick clouds show high errors in COD [more than 8% (20%) for water (ice) clouds with optical thickness greater than 80] because of saturation effects. Thin clouds are affected by surface-albedo uncertainty. The uncertainty in COD is also sensitive to the particle size, although it is not clear from these plots whether the occurrence of small cloud droplets is correlated with thin clouds.

The bottom-left image of Fig. 4 illustrates a thin-cloud problem of the effective radii retrieval. The expected performance of REF is low for thin clouds, with an uncertainty of 20% for water clouds and 40% for ice clouds.

A different simplified way to evaluate an error budget is to view the total variance of the retrieval as the sum of a set of variances created independently by different uncertainty components. This includes some difficulties for DCOMP, because the components may be correlated. For a rough estimate we will apply this method here, however.

For DCOMP the following components can be identified and their magnitudes can be studied. The magnitude of the relative contributions of the different error components for an example scene from DCOMP-MODIS is summarized in Table 3.

The individual components are described in detail below:

Sensor noise and calibration error in the visible channel at 0.6

*μ*m are considered—these are random errors purely introduced by sensor characteristics and calibration method. We assume that channel noise (signal-to-noise ratio is 128 for MODIS) and calibration error sum up to about 4%. The uncertainties are assumed to be uncorrelated between both channels. For this particular scene, the main amount of the overall uncertainty (three quarters) in the cloud optical thickness comes from the uncertainty in the channel reflectance.Sensor noise and calibration error in the shortwave IR channel at 3.75

*μ*m are also considered.Surface albedo in the visible channel is considered. The uncertainty of surface albedo is estimated by analyzing local spatial and temporal standard deviations of the MODIS white-sky albedo (Roman et al. 2009). The uncertainty is higher in regions and seasons with highly variable vegetation or over fresh snow surfaces. The uncertainty of surface albedo has a large impact on the error budget of the effective radius. It can be explained by a high number of thin clouds in this particular scene, for which small changes in the visible reflectance have a large impact on effective radius.

Surface albedo in the shortwave IR channel has a smaller impact than does the surface albedo in the visible channel. The reason for this is the lower transmittance of the cloud because of water absorption.

Water vapor and cloud-top pressure uncertainty estimates have only a small impact on the retrieval errors.

Ozone absorption is only considered in the visible channel, so that the relative importance for COD is expected.

Uncertainty of clear-sky transmittance includes the uncertainty of the NWP temperature and humidity profiles, as well as the surface emissivity.

RTM uncertainty includes the uncertainty in the computation of cloud-reflectance, cloud-transmittance, and spherical-albedo functions, as well as the error in the interpolation scheme in the lookup tables. The uncertainty of 5% is a rough estimate.

Because the uncertainty estimates of the input data and the corresponding uncertainty values depend on an individual scene, they have to be interpreted carefully and should not be seen as a general estimate of DCOMP error. We intend to publish a further article that will contain a deeper analysis of uncertainty values. With regard to the relative weights of the different errors, however, a few general conclusions can be made:

The error budget of COD is dominated by the error of channel noise and sensor calibration of the visible channel. It accounts for three-quarters of the total error budget of COD.

The only other significant error sources in the COD budget are the uncertainty in surface albedo in the visible channel and ozone absorption. At 10% and 4.5%, these values are low in comparison with the channel noise and calibration error, however.

The error budget of REF is spread among the individual contributions to a greater degree. Uncertainties of both reflectance measurements are around 15%. The impact of VIS surface albedo uncertainty (at 32%) is surprisingly high, especially when compared with the contribution of the shortwave IR albedo (with only 7%). This result is caused by a high number of thin clouds in this scene. Transmission in the VIS channel is higher relative to longer-wavelength channels so that the impact is also bigger according to Eq. (15).

COD retrieval seems to be robust against uncertainties of atmospheric and surface characterizations. More-exact auxiliary data or improved atmospheric-correction methods will not significantly decrease the uncertainty of the COD product. It is much more important to decrease the calibration uncertainty. In contrast, it is worth improving the accuracy of several components for a better retrieval result for REF.

These are only preliminary considerations. A more-thorough analysis should include the separation of different cloud scenarios, such as thick and thin clouds; low, middle, and high clouds; surface (land, sea, and snow); and individual regions. These error estimates can only be a reliable measure if the forward-model assumptions are true within the assumed error range. As an example, if the atmosphere is not a one-layer cloud atmosphere or the cloud phase is falsely determined, the error is much bigger and cannot be estimated with this technique. Errors that are caused by wrong assumptions of the forward model cannot be analyzed in the same way by using solution error covariance. Examples of this kind of error can be wrong assumptions of cloud phase or the number of cloud layers in the column.

## 5. Performance estimate

We already stated that one advantage of the DCOMP scheme is that it is able to run on several sensors. This section seeks to show that the results are consistent for some current sensors. For this purpose, we picked two example scenes with a good spatiotemporal match for MODIS on *Aqua*, *NOAA-18*, and *GOES-11* (for a scene over the North Pacific Ocean) and for MODIS on *Terra*, *MetOp-A*, and SEVIRI (for the second scene, which is off the Namibian coast). We compare these results with the output of the retrievals of the MODIS Atmospheres Science Team, which also generates estimates of cloud optical and microphysical properties using the algorithms referred to as MYD06 for the *Aqua* satellite and MOD06 for the *Terra* satellite (Platnick et al. 2003). The MODIS products have been used extensively in the published literature and are regarded as climate-quality products. Although they cannot be regarded as an entirely independent data source for DCOMP validation because DCOMP is based upon the same sensor as well as on a similar bispectral retrieval approach, it is very valuable to use these products for evaluation of DCOMP, especially because a pixel-based comparison is possible.

The first scene is located off the Californian coast around 2200 UTC on yearday 218 in 2006. We resampled all data to a regular 0.05° grid, which corresponds to approximately 5 km, to include AVHRR and *GOES-11* in the comparison. The time shift between the sensors does not exceed 10 min for all pixels. Channels used for this example are channels 1 and 2 for *GOES-11*, channels 1 and 3b for AVHRR on *NOAA-18*, and channels 1 and 20 for MODIS on *Aqua*. This corresponds to the 0.6/3.8-*μ*m channel combination, which has a terrestrial radiance component to be taken into account. Observation zenith angles are different for the three sensors.

Figures 5 and 6 show the mapped results for optical thickness and effective radius, respectively. The general patterns are very similar, especially for COD. The MYD06 product shows fewer cloudy pixels, presumably because of rejection of cloud-edge pixels for microphysical retrieval in the MODIS collection-5 dataset.

Figure 7 shows the pixel-by-pixel comparisons of DCOMP on MODIS retrieval with the MYD06 products, with DCOMP on *GOES-11*, and with DCOMP on AVHRR *NOAA-18* as scatterplots. Note the logarithmic scaling of the number of occurrences for each histogram bin. The COD comparison of DCOMP on MODIS with MYD06 exhibits an almost perfect match, with a correlation factor of 0.97 and a bias of less than 1. The corresponding comparison of REF has less correlation but still shows good agreement, with a bias of 0.63 *μ*m and a correlation of 0.89. One possible source of this bias is the simulation of the 3.75-*μ*m reflectance. MYD06 uses a thermal subtraction method (Nakajima and Nakajima 1995) to estimate the 3.75-*μ*m reflectance. This method ignores thermal emission above the cloud and makes other assumptions that differ from the approach used by DCOMP. Also, systematic differences in the assumed cloud-top temperature could cause these discrepancies. Both DCOMP and MYDO6 rely on upstream algorithms to provide this information. We conclude from this analysis that DCOMP and MYD06 algorithms, when using the 3.75-*μ*m channel, agree to within the expected levels of uncertainty present in these complex physical retrievals.

The images for the comparison with the other sensors show much more scatter but still have low bias values. Imperfect temporal and spatial matching between the data of different sensors may cause this. The REF comparisons show some more differences such as some small-particle detections of DCOMP MODIS for which, in contrast, AVHRR and MYD06 have values up to 30 *μ*m.

Figure 8 illustrates the results for comparisons of PATMOS-x on MODIS/*Terra* with MOD06 and SEVIRI over the South Atlantic Ocean (the coordinate range for longitude is from 0° to 20°W, and for latitude it is from 30°S to 0°) for the 0.6/1.6-*μ*m channel combination (SEVIRI channels ⅓; MODIS channels ⅙). Again, there is good agreement for COD comparisons, although one issue appears here clearly—in particular, for the MOD06 image. There are two scatter areas with correlated pixels outside the one-to-one line; those are assumed to be wrongly detected because of incorrectly classified cloud phase. Although the comparison of COD with SEVIRI is good, the effective radius retrieved from SEVIRI is too low in comparison with MODIS (bias of 2.23 *μ*m).

Although these issues require more effort for improvement, it has also been shown that DCOMP is in general able to provide consistent instantaneous results for different sensors with varying observation geometries. Figure 9 seeks to analyze whether these results are also robust in time. Both time series use 2 yr of DCOMP data from MODIS *Aqua* and AVHRR *NOAA-18* for a subregion of the North Pacific (longitude from −140° to −120° and latitude from 15° to 35°) and compare them with MYD06 products. All three datasets were retrieved around 1330 local time. We computed the daily average over all valid observations and plotted them with dot symbols in Fig. 9. The solid line depicts the 16-day smoothing average over these daily averages. The optical thickness shows a surprisingly good fit between the datasets. In contrast, the REF time series exhibits a permanent bias between PATMOS-x MODIS and MYD06 (~2–3 *μ*m) and PATMOS-x AVHRR (~−1 *μ*m).

The discrepancy between the two MODIS-based retrievals is surprising since the pixel-based comparisons seem to be in a good agreement. Although the results have not been thoroughly checked, we assume that the differences of the sampling of optically thin clouds in MYD06 and DCOMP and the dominance of the far tail of the REF distribution in MYD06 cause the differences in the averaged products. This is also in agreement with the findings of Minnis et al. (2011b), who also compared their cloud products with MYD06.

These intercomparisons do not, of course, provide quantitative assessments of the true accuracy.

## 6. Conclusions

This article was primarily written to provide a reference for a substantial part of the long-term climate time series of PATMOS-x. We have revealed all significant components of the algorithm. Although the general approach is not unique, DCOMP introduces some new methods for atmospheric correction and in the inversion scheme. The validation and comparison studies proved that the results are in good agreement with data from other observations. DCOMP is capable of processing a number of current and future sensors. The results seem to be consistent for all sensors.

The software and processing technique might also be of interest. All DCOMP retrievals have one common FORTRAN90 software code, which is embedded in the Clouds from AVHRR Extended (CLAVR-x) software framework. This is a fast retrieval package able to run large datasets for climate purposes in a relatively short time. As an example, DCOMP on one full-disk *GOES-11* image with approximately 16 million pixels takes about 16 s on a Dell, Inc., R515 workstation. This enables us to generate and provide climate data on short notice. CLAVR-x also provides DCOMP with all needed sensor and auxiliary data. It recognizes from which sensor the level-1b data come and assigns the appropriate lookup tables to the retrieval scheme. The LUT design is identical for each sensor. Thus, the only difference for individual sensors lies in the population of the tables and the gaseous transmission coefficients with respect to the different filter functions. All other methods described in this article are identical.

In this way, DCOMP is able to combine the advantages of the different observation geometry and coverage. Whereas polar-orbiting sensors provide good global coverage, the diurnal cycle can be sufficiently observed by the SEVIRI or GOES geostationary satellites. Also, DCOMP provides a full error propagation, which leads to physically based uncertainty estimates of the final products.

Future activities should extend the error-budget discussions to individual situations. This will help to identify the really significant uncertainty sources, which may help to improve DCOMP by choosing the best data and uncertainty measures. Examples are the inclusion of more highly resolved (0.5°) NWP from the CFSR, which has recently become available from 1979 to the present. Other potential improvements are the consideration of higher uncertainty in singularities in the phase function, such as rainbow peaks in the radiative transfer.

## Acknowledgments

The authors are grateful to Ping Yang and Shougou Ding of Texas A&M University for preparing and providing the lookup-table data for ice clouds. The work was supported by the GOES-R AWG program of NOAA/NESDIS. The authors appreciate the very valuable comments from three reviewers. The views, opinions, and findings contained in this report are those of the authors and should not be construed as an official NOAA or U.S. government position, policy, or decision.