The naive Bayesian methodology has been applied to the challenging problem of cloud detection with NOAA’s Advanced Very High Resolution Radiometer (AVHRR). An analysis of collocated NOAA-18/AVHRR and Cloud–Aerosol Lidar and Infrared Pathfinder Satellite Observations (CALIPSO)/Cloud-Aerosol Lidar with Orthogonal Polarization (CALIOP) observations was used to automatically and globally derive the Bayesian classifiers. The resulting algorithm used six Bayesian classifiers computed separately for seven surface types. Relative to CALIPSO, the final results show a probability of correct detection of roughly 90% over water, deserts, and snow-free land; 82% over the Arctic; and below 80% over the Antarctic. This technique is applied within the NOAA Pathfinder Atmospheres–Extended (PATMOS-x) climate dataset and the Clouds from AVHRR Extended (CLAVR-x) real-time product generation system. Comparisons of the PATMOS-x results with those from the International Satellite Cloud Climatology Project (ISCCP) and the Moderate Resolution Imaging Spectroradiometer (MODIS) indicate close agreement, with zonal mean differences in cloud amount being less than 5% over most zones. Most areas of difference coincided with regions where the Bayesian cloud mask reported elevated uncertainties. The ability to report uncertainties is a critical component of this approach.
While cloud remote sensing is a field that is decades old, achieving the levels of accuracy needed for climate applications remains a challenge. Particularly, confident statements about decadal trends and variability of global cloudiness are still missing (Foster et al. 2010). In addition, as identified in the fourth Intergovernmental Panel on Climate Change (IPCC) report, much of the variability among climate predictions is driven by uncertainties in the response of cloudiness to climate change. This highlights the need to produce accurate, long-term cloud records to assist in determining cloud feedback sensitivities. The longest global remotely sensed cloud imagery record is from the Advanced Very High Resolution Radiometer (AVHRR), located on the National Oceanic and Atmospheric Administration (NOAA) polar-orbiting satellites. The 5-channel AVHRR record begins in 1981, while the 4-channel record goes back to 1978. Though the approaches developed here have been applied to the entire record, only the data generated since 1981 with the 5-channel observations have been thoroughly analyzed.
The length of the AVHRR record makes it uniquely suited to address questions of multidecadal cloud variability. However, the relative age of the sensor makes for a source of uncertainty. For instance, compared to newer satellite sensors like the Moderate Resolution Imaging Spectroradiometer (MODIS) the AVHRR has fewer channels and relatively coarse spatial resolution. It also lacks an onboard calibration device for its visible to near-infrared channels. Integrating the longer AVHRR record with its shorter but more accurate and data-dense counterparts has been used to reduce this uncertainty (Heidinger et al. 2010).
This paper describes the application of probabilistic techniques for cloud detection using AVHRR data. Specifically, the naive Bayesian approach is used. The data used to derive the Bayesian classifiers come from periods of coincidence between the NOAA-18/AVHRR and the Cloud–Aerosol Lidar and Infrared Pathfinder Satellite Observations (CALIPSO)/Cloud-Aerosol Lidar with Orthogonal Polarization (CALIOP) lidar data. NOAA-18 flies in an orbit that is similar to the Earth Observing System (EOS)/Aqua A-train orbits. As will be shown later, the frequency and global occurrence of these collocations are sufficient to derive the Bayesian cloud-detection classifiers.
2. Data used in this study
CALIOP is a two-wavelength lidar with polarization sensitivity (Winker et al. 2009, 2010). CALIOP measures two orthogonally polarized components of the 532-nm backscatter and the total intensity of the 1064-nm backscatter. CALIPSO flies in the EOS A-train at an altitude of 550 km and a nominal equator crossing time of 1330 local solar time (LST). The diameter of the lidar footprint is approximately 100 m. The physical separation between each CALIOP footprint (center to center) is 335 m. Three lidar footprints are averaged to obtain the 1-km cloud product. The level-2 products from CALIOP provide data with resolutions of 333 m, 1 km, and 5 km. The coarser the horizontal resolution of the dataset, the more averaging of individual lidar profiles is performed to increase the sensitivity to optically thin cirrus and aerosol layers. For this application, we chose the 1-km cloud-layer product. This product provides the geometric boundary and the midlayer temperature of up to 10 layers of cloud in every 1-km profile. The CALIPSO data used were version 3.01 and were provided by the National Aeronautics and Space Administration (NASA) Langley Research Center Atmospheric Science Data Center.
The afternoon Polar-orbiting Operational Environmental Satellite (POES) NOAA-18 flies in an orbit with a similar equator crossing time but at a higher altitude (720 km). Therefore, NOAA-18 flies slower relative to the surface of the earth than does CALIPSO. Because of these differences in orbit, the ground tracks of NOAA-18 and CALIPSO periodically fall in and out of alignment. Therefore, multiple days of data are required to obtain global coverage. The AVHRR data used here are the Global Area Coverage (GAC) data in the level-1b 10-bit format provided by NOAA/National Environmental Satellite, Data, and Information Service (NESDIS). The AVHRR data were processed using the NOAA Pathfinder Atmospheres–Extended (PATMOS-x) processing system using the reflectance calibration described in Heidinger et al. (2010) and the thermal calibration described in Rao et al. (1993).
3. Collocation of AVHRR and CALIPSO
A key component of this analysis is the ability to collocate the AVHRR with CALIPSO. To accomplish this, a routine was developed to find the AVHRR pixel that was closest in distance to each 1-km CALIPSO cloud-layer pixel. This routine employed a nearest-neighbor approach coupled with a polynomial fit to provide initial estimates of collocated pixels. The resulting computations require only seconds to generate the collocations for an entire GAC orbit. Figure 1 illustrates the data from one case where the orbits of NOAA-18 and CALIPSO align. The image in the upper left shows the cross section of the 532-nm CALIOP backscatter. The image in the lower left shows the 1-km cloud-layer (CLay) product for this data. The image on the right shows the 11-μm brightness temperature of the AVHRR. The solid white line represents the CALIPSO path overlaid onto the AVHRR image.
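The nearest-neighbor step of the collocation can be illustrated with a minimal sketch. It is a brute-force search over great-circle distance and omits the polynomial fit that the routine described above uses for its initial estimates; the function names, the spherical-earth radius, and the input layout (lists of latitude and longitude pairs in degrees) are assumptions made for illustration only.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean spherical-earth radius


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_avhrr_pixel(caliop_latlon, avhrr_latlons):
    """Return (index, distance_km) of the AVHRR pixel closest to one CALIOP profile.

    Brute-force search; a production routine would first narrow the
    candidates (the paper mentions a polynomial fit for that purpose).
    """
    lat, lon = caliop_latlon
    best_index, best_dist = None, float("inf")
    for i, (alat, alon) in enumerate(avhrr_latlons):
        d = great_circle_km(lat, lon, alat, alon)
        if d < best_dist:
            best_index, best_dist = i, d
    return best_index, best_dist
```

For example, `nearest_avhrr_pixel((0.1, 0.9), [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0)])` selects the second pixel, which lies roughly 16 km from the CALIOP location.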
As stated above, the CALIPSO/CALIOP cloud-layer product provides up to 10 layers of cloudiness. For the purposes of this study, a pixel was considered cloudy if one or more layers of cloud were detected. The cloud fraction for each 1-km CALIOP pixel is assumed to be either 0 or 1.0. Since the AVHRR GAC pixel size is roughly 5 km, a 5-km cloud fraction was computed from the 5 CALIOP pixels that were closest to the center of the AVHRR GAC pixel. Therefore, the cloud fractions from CALIOP were constrained to have the values of 0, 0.2, 0.4, 0.6, 0.8, or 1.0.
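A short sketch of this 5-km cloud fraction follows. The function name and the input (the number of cloud layers reported for each of the five nearest 1-km profiles) are hypothetical and chosen only to mirror the description above.

```python
def caliop_cloud_fraction(layer_counts):
    """Return the 5-km cloud fraction from five 1-km CALIOP profiles.

    layer_counts holds the number of cloud layers detected in each of the
    five profiles closest to the AVHRR GAC pixel center (hypothetical input).
    A profile counts as cloudy if one or more layers were detected.
    """
    if len(layer_counts) != 5:
        raise ValueError("expected exactly five 1-km profiles")
    cloudy = [1 if n >= 1 else 0 for n in layer_counts]
    return sum(cloudy) / 5
```

By construction the result can only take the values 0, 0.2, 0.4, 0.6, 0.8, or 1.0.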
The collocation described above was applied to all NOAA-18/AVHRR GAC and CALIPSO 1-km cloud-layer data in 2007. Figure 2 shows the global and viewing geometry distributions of the collocated AVHRR–CALIPSO data during 2007. As Fig. 2 shows, the data cover all latitudes and longitudes, though the distributions of latitude show distinct peaks near the North and South Poles. Most of the data occur with zenith angles less than 20°. Data are available at all solar zenith angles, though solar angles away from the terminator are more prevalent. Ideally, the distributions of the training data would cover all viewing conditions. However, given the orbits of CALIPSO, terminator conditions cannot be viewed outside of the polar regions. To account for this, the cloud-detection metrics, described later, are chosen to be insensitive to the viewing angles wherever possible.
4. Naive Bayesian formulation
In the full or classical Bayesian approach (Uddstrom et al. 1999; Merchant et al. 2005), the probability of a given passive satellite pixel being cloudy for a set of features F is given by P(Cyes|F) defined as
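Given the terms defined in the following paragraph, Eq. (1) is presumably the standard statement of Bayes' theorem:

```latex
P(C_{\mathrm{yes}} \mid F) = \frac{P(C_{\mathrm{yes}})\, P(F \mid C_{\mathrm{yes}})}{P(F)} \qquad (1)
```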
where P(Cyes) is the prior probability of any pixel being cloudy without any knowledge of F and P(F) is the probability of existence of the pixel’s set of features F. The term P(F|Cyes) is the probability of the existence of the pixel’s set of features for the cloudy pixels. The components in the feature set F are referred to as the cloud mask classifiers, and the particular features employed in this approach are described in section 6. In the Bayesian context, P(Cyes|F) is referred to as the posterior probability.
One issue with the classical Bayesian approach is that the use of N classifiers requires the computation of N × N dimensioned arrays holding the class conditional probabilities. In most cloud-detection approaches, several tests are required to fully detect all types of cloudiness in visible–infrared imagery. To put this in perspective, while the CALIPSO–AVHRR collocation process described above resulted in over 5 million pixels for the 12-month period studied, that number is not sufficient to fully populate the N × N space required for the cloud mask classifiers used in this algorithm especially given the need to compute these classifier distributions for several surface types. To overcome this we have employed a naive Bayesian approach. In the naive Bayesian assumption, each of the feature probabilities can be treated as independent, and the value of P(Cyes|F) can be rewritten as follows:
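Under the independence assumption just stated, and writing F_i for the i-th of the N classifiers (a notation assumed here), Eq. (2) presumably takes the form:

```latex
P(C_{\mathrm{yes}} \mid F) = \frac{P(C_{\mathrm{yes}}) \prod_{i=1}^{N} P(F_i \mid C_{\mathrm{yes}})}{P(F)} \qquad (2)
```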
The denominator in the above equation P(F) is computed as
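With the same assumed notation, the normalization that makes the clear and cloudy posterior probabilities sum to unity is presumably:

```latex
P(F) = P(C_{\mathrm{yes}}) \prod_{i=1}^{N} P(F_i \mid C_{\mathrm{yes}}) + P(C_{\mathrm{no}}) \prod_{i=1}^{N} P(F_i \mid C_{\mathrm{no}}) \qquad (3)
```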
where P(Cno) is the prior probability of any pixel being clear [P(Cno) = 1 − P(Cyes)].
The obvious advantage of this method is the use of N classifiers (or cloud mask tests) requires generation of N, not N × N, dimensioned arrays. While this approximation may seem severe, naive Bayesian approaches have been applied successfully to many complex detection problems (Kossin and Sitkowski 2009).
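The naive Bayesian posterior described in this section can be sketched in a few lines. This is only an illustration of the formulation, not the PATMOS-x code; the function name and the list-based inputs are assumptions, and the class conditional probabilities are taken as already looked up for one pixel.

```python
def naive_bayes_posterior(prior_yes, cond_yes, cond_no):
    """Posterior cloud probability for one pixel under the naive assumption.

    prior_yes : P(Cyes) for the pixel's surface type
    cond_yes  : list of P(F_i | Cyes), one per classifier
    cond_no   : list of P(F_i | Cno), one per classifier
    """
    if len(cond_yes) != len(cond_no):
        raise ValueError("need one clear and one cloudy value per classifier")
    like_yes = prior_yes
    for p in cond_yes:
        like_yes *= p
    like_no = 1.0 - prior_yes  # P(Cno) = 1 - P(Cyes)
    for p in cond_no:
        like_no *= p
    evidence = like_yes + like_no  # the denominator P(F)
    if evidence == 0.0:
        raise ValueError("features have zero probability in both classes")
    return like_yes / evidence
```

As an illustrative check with made-up numbers, a prior of 0.5 and two classifiers with cloudy probabilities (0.8, 0.6) and clear probabilities (0.2, 0.3) give 0.24 / (0.24 + 0.03), or about 0.89. Each classifier contributes a one-dimensional lookup, which is what keeps the storage requirement small.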
To generate the clear and cloudy classifier distributions, three filters were applied to the data. First, memory limitations required a thinning of the data by a factor of 2. Second, only collocations that occurred with a time difference of less than 10 min were used, and this reduced the total number to 6 678 802. Third, to avoid the increased uncertainty in the collocation process for subpixel cloudiness, only pixels where the 5-km CALIPSO cloud fraction was either 0 or 1.0 were included, and this filter reduced the number of collocations to 5 708 524. Next, the classifiers were computed separately for different regions or surface types. The selection of these surface types is discussed below. Figure 3 shows an example set of clear and cloudy classifier distributions computed for one classifier (Tmax − T) over the deep-ocean surface type. The Tmax − T classifier is described in the next section. The clear and cloudy distributions are normalized to unity for clarity of presentation. Also shown in Fig. 3 is the posterior probability as a function of the classifier. For illustrative purposes, Fig. 3 assumes the use of only one classifier, while in the full approach all six classifiers are used in Eq. (2). Figure 3 does illustrate one of the key strengths of the Bayesian approach in that the probability of cloud varies smoothly over the range of the classifier. In threshold-based techniques, the probability distributions are assumed to jump from 0 to 1 when passing over the chosen threshold.
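The construction of the clear and cloudy distributions and of a single-classifier posterior curve, as in Fig. 3, can be sketched as follows. The binning scheme, the function names, and the choice to take the prior from the sample counts are assumptions made for illustration; the text does not specify how the distributions were discretized.

```python
def bin_index(value, lo, hi, nbins):
    """Map a classifier value to a histogram bin, clamping out-of-range values."""
    k = int((value - lo) / (hi - lo) * nbins)
    return min(max(k, 0), nbins - 1)


def class_conditional_table(values, lo, hi, nbins):
    """Normalized histogram: P(F in bin | class) for one class of training pixels."""
    if not values:
        raise ValueError("no training values for this class")
    counts = [0] * nbins
    for v in values:
        counts[bin_index(v, lo, hi, nbins)] += 1
    return [c / len(values) for c in counts]


def posterior_curve(clear_values, cloudy_values, lo, hi, nbins):
    """Posterior cloud probability per bin when only one classifier is used."""
    prior_yes = len(cloudy_values) / (len(cloudy_values) + len(clear_values))
    p_yes = class_conditional_table(cloudy_values, lo, hi, nbins)
    p_no = class_conditional_table(clear_values, lo, hi, nbins)
    curve = []
    for py, pn in zip(p_yes, p_no):
        num = prior_yes * py
        den = num + (1.0 - prior_yes) * pn
        # Bins never seen in training carry no information: fall back to the prior.
        curve.append(num / den if den > 0.0 else prior_yes)
    return curve
```

With toy Tmax − T samples (clear: 0.5, 1.0, 1.5, 2.5 K; cloudy: 1.5, 3.0, 5.0, 7.0 K) and four bins spanning 0 to 8 K, the curve rises from 0.25 through 0.5 to 1.0, a coarse version of the smooth variation seen in Fig. 3.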
Table 1 provides sample values for the estimation of the posterior probability given in Eq. (2). The results are given for a particular surface type (shallow ocean) and for each of the six classifiers. The determination of the surface types and classifiers is discussed in the next sections. The class conditional no [P(F|Cno)] and class conditional yes [P(F|Cyes)] values, coupled with the knowledge of the Prior No and Prior Yes values, are all that are required to generate the final posterior probability value (0.87) for this pixel. The posterior probabilities computed for each classifier individually are shown for reference only. The chosen pixel was observed during daytime conditions. Therefore, the night 4-μm classifier was turned off using the procedure described in section 6.
5. Selection of surface types
As stated above, the selection of different surface types to generate the classifiers is critical. We have chosen to classify the globe into seven surface types. The goal of classifying different surface types is to capture the systematic biases in our knowledge of the clear-sky conditions that vary greatly from one surface type to another. In the current algorithm, we classify the globe into the following surface types: (1) deep ocean, (2) shallow water, (3) land, (4) snow, (5) Arctic, (6) Antarctic, and (7) desert. These surface types were chosen after a series of trial-and-error experiments. Each surface type represents a region where the distribution in the contrast between clear and cloudy skies and the accuracy of the performance of the clear-sky model are similar. The inputs to the surface type are the land cover data from the land cover database used in the MODIS geolocation file (“MOD/MYD03”), the snow field within the National Centers for Environmental Prediction (NCEP) reanalysis (Kalnay et al. 1996), the NOAA Optimum Interpolation Sea Surface Temperature Version-2 (OISST) daily 25-km SST analysis (Reynolds et al. 2002), and 3.75-μm surface emissivity from the Seemann–Borbas (SEEBOR) surface emissivity database (Seemann et al. 2008). Figure 4 shows the global distribution of these surface types for 1 January and 1 July 2009. A brief description of these types follows. The surface types will vary with the frequency of the ancillary data. While the land cover data are temporally invariant, the surface emissivity values vary every 16 days. The largest driver of the surface type variation is the snow and ice cover information.
The sea ice information is taken from the OISST data and varies daily. The snow information is taken from the NCEP reanalysis, which is updated every 6 h.
The deep-ocean surface type consists of pixels where the MOD03 land mask was set to “deep ocean” and the sea ice information from the OISST data indicated ice-free conditions. Highly accurate clear-sky radiative transfer modeling and spatially uniform surfaces characterize the deep-ocean surface type.
The shallow-water surface type is defined by ice-free pixels that the MOD03 land mask classified as moderate ocean, deep inland water, or shallow inland water. In addition, any pixels where the 3 × 3 standard deviation of the background SST from the OISST exceeded 1.0 K were also included in the shallow-water surface type. In general, this surface type includes water bodies where our knowledge of the surface temperature is much less accurate than that of the deep-ocean surface type.
The land surface type includes all land surfaces that are not covered by snow and not classified as desert. The snow surface type includes all land surfaces covered by snow excluding Antarctica and Greenland.
The Arctic surface type includes all pixels labeled as sea ice in the Northern Hemisphere. The Antarctica surface type includes all sea ice in the Southern Hemisphere and all snow-covered surfaces south of 60°S. On the basis of guidance from the MODIS cloud mask team located at the University of Wisconsin, Greenland was also included in the Antarctica surface type.
The desert surface type includes all pixels with a 3.75-μm surface emissivity less than 0.90 that occurred within 60 latitudinal degrees of the equator. The 3.75-μm emissivity was used to ensure optimal performance for the 3.75-μm classifiers.
a. Global distribution of surface types
Figure 4 shows the global distribution of the surface types for 1 January 2009 (top) and 1 July 2009 (bottom). As Fig. 4 shows, the spatial coverage of these surface types varies with season, with snow-covered land showing the most dramatic variation. The appearance of shallow ocean away from the coasts is due to the inclusion of heterogeneous SST regions (i.e., oceanic fronts) in this surface type.
b. Impact of surface types on cloud-detection performance
Figure 5 shows the incremental improvement in globally averaged probability of correct detection (POD) that comes with the addition of new surface types. Beyond the seven surface types used here, we felt the benefit of adding more types was too small to be worth the additional complexity and use of resources. For example, when a mountain surface type was added, it increased the POD by only 0.0014. Each of the classifiers described next is generated separately for the above surface types. The resulting distribution of pixel counts for each surface type in the training data was the following: deep ocean: 59%; shallow water: 4%; land: 16%; snow: 5%; Arctic: 3%; Antarctic: 8%; desert: 5%. Note that the numbers in Fig. 5 differ from the numbers in Table 2. If a global mean were computed in Table 2, it would match the asymptotic value of the curve in Fig. 5.
6. Cloud mask classifiers
The naive Bayesian formulation allows for multiple cloud classifiers to be used without the need for large arrays. This section briefly describes each of the six classifiers used in the naive Bayesian cloud mask. A more detailed description can be found in Heidinger (2011), as the same classifiers are used in the Geostationary Operational Environmental Satellite-R (GOES-R) Algorithm Working Group (AWG) cloud mask algorithm. It is important to note that even with the limited spectral information offered by the AVHRR, the number of cloud mask classifiers or tests can be large (≈10), and the specific number used here is 6. In the AWG cloud masks, we have decided to prioritize the infrared information to help ensure day–night consistency. In addition, we rely on radiative transfer calculations to reduce artificial sensitivities to variability in viewing geometry and the atmospheric and/or surface state.
a. Emissivity referenced to the tropopause (ETROP)
The first classifier is the 11-μm emissivity computed assuming the cloud resided at the tropopause. This classifier has also found use in cloud-typing routines (Pavolonis 2010). The ETROP test assumes that clouds are colder at 11-μm brightness temperatures than clear sky. Traditionally, window brightness temperatures are used in tests looking for cold pixels. The ETROP, however, operates on the 11-μm emissivity computed assuming the cloud resides at the tropopause. The emissivity is computed as etropo = (I − Iclear)/(Ibb,tropo − Iclear) where I is observed radiance, Iclear is the computed clear-sky radiance and Ibb,tropo is the radiance from a blackbody cloud emitting at the temperature of the tropopause.
The variation of etropo with the true cloud emissivity is shown in Fig. 6. In Fig. 6, the cloud is simulated using an ice cloud located between 300 and 400 hPa in a standard midlatitude summer atmospheric profile. The slope is constant and the ratio between the true and the tropopause emissivity is simply the ratio of (Ibb,tropo − Iclear)/(Ibb − Iclear), where Ibb is the radiance from a blackbody cloud emitting at the actual cloud temperature. For clouds within the troposphere, Ibb,tropo is always less than Ibb, and values of etropo are less than the actual emissivity. For the simulation in Fig. 6 where the cloud was placed roughly 200 hPa below the tropopause, the values of etropo are roughly 20% less than the true emissivity. Even though the values of etropo are much lower for low-level clouds, the accuracy of the clear-sky radiative transfer (especially over oceans) makes the ETROP classifier robust and effective. In clear conditions, the tropopause emissivity should approach zero. Because the denominator (Ibb,tropo − Iclear) is negative, negative values of etropo are possible when the computed clear-sky radiances are less than the observed clear-sky radiances.
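The ETROP definition can be exercised with a small sketch. The function name is hypothetical, the radiances are in arbitrary but consistent units, and the toy forward model in the usage note (a nonscattering cloud that mixes clear-sky and blackbody radiance in proportion to its emissivity) is an assumption made only to generate an example observation.

```python
def tropopause_emissivity(i_obs, i_clear, i_bb_tropo):
    """11-um emissivity referenced to the tropopause:
    etropo = (I - Iclear) / (Ibb,tropo - Iclear)."""
    denom = i_bb_tropo - i_clear
    if denom == 0.0:
        raise ValueError("tropopause and clear-sky radiances are identical")
    return (i_obs - i_clear) / denom
```

Suppose Iclear = 100, Ibb,tropo = 20, and a cloud with true emissivity 0.5 has Ibb = 40. The assumed forward model gives I = 0.5 × 100 + 0.5 × 40 = 70, so etropo = (70 − 100)/(20 − 100) = 0.375. The ratio of true to tropopause emissivity, 0.5/0.375, equals (Ibb,tropo − Iclear)/(Ibb − Iclear) = (−80)/(−60), as stated above.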
b. Relative thermal contrast (Tmax − T)
While the ETROP metric involves the absolute deviations of the 11-μm observations from the clear-sky estimates, the relative thermal contrast classifier (Tmax − T) works on the relative variation of the 11-μm observations. The underlying assumption is that a pixel significantly colder than its warmest neighbor is likely cloudy. The Tmax − T operates on the difference in 11-μm brightness temperature of a pixel and its warmest neighbor within a 5 × 5 pixel array. This test is designed to detect cloud edges and other small-scale cloud features. A benefit of this metric is that it does not rely on knowledge of the surface temperature. In some areas such as polar regions in the winter, clouds are warmer than the surface. The CALIPSO-derived classifier distributions in those regions should account for these conditions and automatically downplay the impact of the Tmax − T.
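A minimal sketch of the Tmax − T computation is given below. The function name and the nested-list image layout are assumptions, as is the treatment of image edges (the window is simply truncated), which the text does not specify.

```python
def tmax_minus_t(bt11, row, col, half_width=2):
    """Difference between the warmest 11-um brightness temperature in the
    surrounding window and the pixel's own value.

    bt11 is a 2D list of brightness temperatures in K; half_width=2 gives
    the 5 x 5 window described in the text. Windows are truncated at the
    image edges (an assumption of this sketch).
    """
    nrows, ncols = len(bt11), len(bt11[0])
    r0, r1 = max(row - half_width, 0), min(row + half_width, nrows - 1)
    c0, c1 = max(col - half_width, 0), min(col + half_width, ncols - 1)
    tmax = max(bt11[r][c] for r in range(r0, r1 + 1) for c in range(c0, c1 + 1))
    return tmax - bt11[row][col]
```

A 270-K pixel surrounded by 290-K neighbors yields a value of 20 K, while any pixel that is the warmest in its window yields 0.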
c. Four-minus-five (FMFT)
The FMFT is based on the split-window (11 and 12 μm) observations that are provided by channels 4 and 5 on the AVHRR. The 11–12-μm brightness temperature difference (BTD) increases in the presence of semitransparent cloud. For opaque cloud, this BTD often falls below the clear-sky value. The particular FMFT metric used here incorporates the clear estimate of the 11–12 BTD as shown below:
This classifier represents the difference between the observed 11–12 BTD and an estimate of the clear-sky value that is consistent with the observed 11-μm brightness temperature. When the 11-μm brightness temperature falls below 260 K, the classifier is set to BTD11,12. The goal of this formulation is to bring in information from the clear-sky model to make the classifiers account for variations in surface temperature and atmospheric moisture.
d. Daytime 4-μm pseudoemissivity (day 4-μm)
This test uses the 4-μm observations converted to a pseudoemissivity relative to the emissivity computed for the 11-μm observation. The actual wavelength on the AVHRR is 3.75 μm. Given the highly nonlinear relationship between radiance and brightness temperature at 4 μm, the pseudoemissivity can grow very large in the presence of cloud during the day. This formulation incorporates estimates of the surface emissivity at 4 μm from Seemann et al. (2008).
The 4-μm pseudoemissivity e4 is computed using the following relationship:
The actual classifier used in the naive Bayesian approach (χ) is defined as
where e4,clear is an estimate of e4 under cloud-free conditions and is computed as follows:
where is the clear-sky estimate of the 4-μm radiance that includes the effects of solar reflectance; is computed using the following relationship:
Here e4,sfc is the 4-μm surface emissivity, t4,sfc is the transmission for the solar to surface satellite path, μ0 is the cosine of the solar zenith angle, and F0 is the integrated amount of energy in the 4-μm channel. The daytime metric is scaled to better account for variations in solar zenith angle which impact both e4 and e4,clear.
Figure 6 shows the variation of the daytime 4-μm emissivity metric, given in Eq. (8), as a function of the actual 11-μm cloud emissivity for an ice cloud placed between 300 and 400 hPa in a midlatitude summer atmosphere with a solar zenith angle of 30°. The metric peaks at values much greater than unity for moderately opaque clouds.
e. Nighttime 4-μm pseudoemissivity (night 4-μm)
This classifier also uses 4-μm pseudoemissivity. Without solar illumination, the 4-μm emissivity for low opaque clouds can fall well below unity. For semitransparent and cold cloud, the 4-μm emissivity becomes very large. There is no need for solar zenith angle scaling as in the daytime classifier. The ranges of the day and night e4 values were different enough to warrant separate classifiers to improve performance. The nighttime 4-μm pseudoemissivity classifier is defined as the value of e4 without any scaling. Figure 6 shows the variation of the nighttime 4-μm pseudoemissivity metric given in Eq. (7). For visual convenience, the nighttime values of e4 plotted in Fig. 6 are offset by 1. The variation of nighttime and daytime 4-μm pseudoemissivity metrics are qualitatively similar for ice clouds as illustrated in Fig. 6. For water clouds, the nighttime metric can fall below the clear-sky values, and for this reason the nighttime and daytime classifiers are separated.
f. Reflectance at 0.63 μm (ref 0.63 μm)
The 0.63-μm reflectance is very important in cloud detection owing to the high reflectivity of clouds and relatively low reflectivity of most surface types. This classifier is the difference between the observed 0.63-μm reflectance and the estimated value under cloud-free conditions. The clear-sky estimate is generated using the surface reflectance maps described by Moody et al. (2007) coupled with a Rayleigh and aerosol scattering model. While inclusion of this test is contrary to the prioritization of the IR channels, it is necessary to maintain consistent performance during daytime periods when the 4-μm channel is not available. The 4-μm channel is not available during daytime operation of the AVHRRs on NOAA-17, Meteorological Operational satellite A (MetOp-A), and NOAA-16 (2000–03).
g. Turning off classifiers for specific situations
For some situations, not all classifiers can be run on all data. For example, the nighttime 4-μm emissivity classifier is used only at night and the daytime 4-μm emissivity and 0.63-μm reflectance classifiers are used only during the day. In addition, the daytime 4-μm emissivity and 0.63-μm reflectance classifiers are turned off over oceanic glint. In the naive Bayesian context, a classifier can be turned off if the class conditional probabilities for this specific feature are set to unity:
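Written in the notation assumed for the naive Bayesian product (F_i denoting the classifier being turned off), this condition is presumably:

```latex
P(F_i \mid C_{\mathrm{yes}}) = P(F_i \mid C_{\mathrm{no}}) = 1
```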
In the case of a single classifier, turning the classifier off will result in the posterior probability being equal to the prior probability.
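This behavior is easy to confirm numerically. The sketch below uses a hypothetical function that accepts a list of on/off flags alongside the class conditional probabilities; the flags are an illustrative device, not a description of how PATMOS-x stores its switches.

```python
def posterior_with_switches(prior_yes, cond_yes, cond_no, active):
    """Naive Bayesian posterior in which inactive classifiers have both
    class conditional probabilities replaced by unity."""
    like_yes, like_no = prior_yes, 1.0 - prior_yes
    for py, pn, on in zip(cond_yes, cond_no, active):
        if not on:
            py = pn = 1.0  # a switched-off classifier cancels out of the ratio
        like_yes *= py
        like_no *= pn
    return like_yes / (like_yes + like_no)
```

With a prior of 0.6 and a single classifier switched off, the function returns 0.6. With several classifiers, switching one off gives the same result as leaving it out of the product altogether.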
7. Performance metrics generated from training dataset
Table 2 provides the metrics for the performance of the six-term naive Bayesian cloud mask generated from the 2007 training data. The data are displayed for each of the seven surface types described above. The first row shows the value of P(Cyes) in Eq. (1), which is the probability of a pixel being cloudy as determined in the training dataset. The second row shows the posterior probability of a pixel being cloudy as determined by the naive Bayesian cloud mask. All pixels with a posterior probability of cloud exceeding 0.5 were considered cloudy. In this context, the prior and posterior probabilities can be interpreted as the cloud fractions from CALIPSO/CALIOP and the naive Bayesian mask, respectively. Note that the posterior cloud probabilities are always less than the prior cloud probabilities. This is an expected outcome because of the higher sensitivity of the CALIPSO/CALIOP lidar to the presence of cloud relative to passive AVHRR observations. The fact that these prior and posterior climatological probabilities differ by roughly 5% does not impede the use of CALIPSO for training the AVHRR cloud detection. As shown in the metrics described below, our analysis does not indicate a prevalence of false detection (false) in the AVHRR cloud-detection results relative to CALIPSO.
The third through sixth rows of Table 2 provide the metrics of performance of the Bayesian cloud mask within the training dataset. In these calculations, a posterior probability of 0.5 was used to separate clear from cloudy. The third row shows the POD values, the fourth row shows the Kuiper–Hansen skill score, the fifth row shows the false-alarm rate (false), and the final row shows the rate of missed clouds (missed). In general, the best performance for all metrics is observed for the deep-ocean and shallow-water surface types, with POD values exceeding 0.9 and skill values exceeding 0.8. For these surface types, the false rates are less than 5% and the missed rates are around 5%. The next best performance is seen for the land class, followed by the desert class. For these two types, the POD values hover near 90%. Both generate false rates less than 3% and missed rates less than 10%. Of the three frozen surface classes, the best performance is seen in snow-covered land followed by the Arctic. By far the worst performance is seen in the Antarctica surface type. For both Antarctica and Arctic, the false and missed rates exceed 10%. This variation of performance with surface type is expected and is one of the main reasons for their existence in this algorithm. As described in section 10, one of the strengths of the Bayesian approach is the ability to estimate uncertainties of the cloud-detection results and to allow users of these data to account for the variations in performance over different surface types.
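The metrics in Table 2 can be computed from a 2 × 2 contingency table of mask versus CALIPSO outcomes. The definitions in the sketch below are assumptions, since the text does not state them explicitly: POD is taken as the fraction of all pixels on which the two agree, the false and missed rates as fractions of all pixels, and the skill score as the usual Hanssen–Kuipers form (hit rate minus false-alarm rate). The function and argument names are hypothetical.

```python
def mask_metrics(both_cloudy, mask_only_cloudy, mask_only_clear, both_clear):
    """Performance metrics of a binary cloud mask against CALIPSO.

    both_cloudy      : mask cloudy, CALIPSO cloudy
    mask_only_cloudy : mask cloudy, CALIPSO clear  (false detection)
    mask_only_clear  : mask clear,  CALIPSO cloudy (missed cloud)
    both_clear       : mask clear,  CALIPSO clear
    """
    a, b, c, d = both_cloudy, mask_only_cloudy, mask_only_clear, both_clear
    n = a + b + c + d
    if n == 0 or (a + c) == 0 or (b + d) == 0:
        raise ValueError("need at least one clear and one cloudy CALIPSO pixel")
    return {
        "pod": (a + d) / n,    # assumed: probability of correct detection
        "false": b / n,        # assumed: fraction of all pixels
        "missed": c / n,       # assumed: fraction of all pixels
        "skill": (a * d - b * c) / ((a + c) * (b + d)),  # Hanssen-Kuipers form
        "prior_yes": (a + c) / n,   # CALIPSO cloud fraction
        "mask_yes": (a + b) / n,    # cloud fraction reported by the mask
    }
```

For illustrative counts of 50, 3, 7, and 40, this gives a POD of 0.90, a false rate of 0.03, a missed rate of 0.07, and a skill of about 0.81. Under these assumed definitions, the POD, false, and missed values sum to unity.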
Table 3 provides the variation of the POD metric for day and night conditions. During the night, the nighttime 4-μm pseudoemissivity is active and the 0.63-μm reflectance and daytime 4-μm pseudoemissivity classifiers are turned off. Table 3 shows that, except for the desert surface type, the night performance is worse than the daytime performance. The maximum day–night difference in the POD metric is seen in the Arctic and Antarctic surface types, with values of about 0.07. The land surface type shows a difference of about 0.03, and the remaining surface types exhibit day–night POD differences of 0.02 or less. In the Arctic and Antarctic, day and night conditions correspond to different seasons and therefore potentially different cloud regimes.
For comparison, Table 4 shows the same metrics as Table 2 computed for a threshold-based cloud-detection scheme developed by the GOES-R AWG (Heidinger 2011). The AWG mask applied to the AVHRR uses the same classifiers as described here. The thresholds were derived from the same training dataset used here. As described in Heidinger (2011), the thresholds were set so that each classifier (or test in this case) gave a maximum false-alarm rate of 2%. The AWG mask pursued a threshold-based approach to give AWG applications the ability to ignore certain tests found inappropriate for specific applications. As Table 4 shows, the threshold-based POD and skill values are always less than the naive Bayesian results for all surface types, with the largest differences seen in the snow-covered land, Arctic, and Antarctic surface types. In practice, the threshold approach is more apt to generate false alarms because only one threshold needs to be exceeded for a cloudy result to be generated. This is consistent with the results in Table 4, which show that the threshold-based scheme generates more false alarms and generally misses less cloud than the naive Bayesian scheme. Overall, the performance of the two schemes is not drastically different. However, the systematic improvement in the POD and skill metrics, along with the availability of the uncertainty estimates (described in section 10), led to the adoption of the naive Bayesian formulation for PATMOS-x.
8. Generating a four-level cloud mask
Typically, most imager cloud masks provide four classifications of pixel cloudiness. Both the operational AVHRR and MODIS masks generate a cloud mask that can be clear, probably clear, probably cloudy, or cloudy. In the naive Bayesian formulation, an initial threshold on the posterior probability of 0.5 is employed to separate clear and cloudy pixels. A threshold of 0.9 is then applied to separate cloudy from probably cloudy pixels. A threshold of 0.1 is applied to separate clear from probably clear pixels. Figure 7 demonstrates this with the distribution of posterior probabilities generated from a single daytime scene. Note the large relative peaks of observations near zero and unity.
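The mapping from posterior probability to the four-level mask can be sketched as follows. The function name and the string labels are hypothetical, and the handling of values that fall exactly on a threshold is an assumption, since the text does not specify it.

```python
def four_level_mask(posterior):
    """Convert a posterior cloud probability into a four-level cloud mask
    using the 0.1, 0.5, and 0.9 thresholds."""
    if not 0.0 <= posterior <= 1.0:
        raise ValueError("posterior probability must lie in [0, 1]")
    if posterior > 0.9:
        return "cloudy"
    if posterior > 0.5:  # probabilities exceeding 0.5 count as cloudy
        return "probably cloudy"
    if posterior >= 0.1:  # boundary treatment assumed
        return "probably clear"
    return "clear"
```

A pixel with a posterior probability of 0.87, for instance, would be labeled probably cloudy.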
9. Example application to one AVHRR scene
To illustrate the performance of the naive Bayesian cloud mask at the pixel level, Fig. 8 was created. The upper-left panel in Fig. 8 is a false color image created with the 0.63-μm reflectance on the red gun, the 0.86-μm reflectance on the green gun, and the 11-μm brightness temperature (reversed) on the blue gun. The upper-right panel shows the cloud posterior probability for this scene. The lower left shows the derived four-level cloud mask and the lower right provides an estimate of the uncertainty in the cloud detection. The computation of the uncertainty is described in the next section.
Figure 8 illustrates some of the strengths of this approach. A strong band of glint is easily discernible in the upper-left false color image. As stated above, a glint mask is used to detect the presence of glint. When glint is detected, the tests that are sensitive to glint are turned off using the procedure described above. In these conditions, the approach relies on fewer classifiers and the likelihood of a confident decision decreases. This is seen as the region of nonzero cloud probabilities in the clear-glint region and the associated higher levels of uncertainty in the lower-right image. The other regions that provide elevated uncertainties are regions of cloud edges and other small-scale features. While these regions are not represented in the training data because of filtering, the Bayesian approach is still able to correctly generate less certain results in these problematic conditions. In addition, even if the glint mask failed and glint-sensitive tests were applied to the glint-filled pixels, the non-glint-sensitive classifiers would act to mitigate the potential false detection of cloud. This robustness due to the interworking of the different classifiers is another key strength of the Bayesian approach over its threshold-based counterparts.
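The effect of turning off glint-sensitive classifiers can be illustrated with the standard two-class naive Bayes posterior. The sketch below is not the PATMOS-x implementation: the function name, the prior of 0.5, and all likelihood values are hypothetical, and "turning off" a classifier is assumed here to mean simply omitting its likelihood terms from the product.

```python
def cloud_posterior(prior_cloud, likelihoods, active):
    """Two-class naive Bayes posterior probability of cloud.

    likelihoods: list of (P(feature | cloud), P(feature | clear)) pairs,
                 one pair per classifier.
    active:      list of booleans; False omits that classifier
                 (an assumed model of a test being turned off).
    """
    cloud_term = prior_cloud
    clear_term = 1.0 - prior_cloud
    for (p_cloud, p_clear), use in zip(likelihoods, active):
        if use:
            cloud_term *= p_cloud
            clear_term *= p_clear
    total = cloud_term + clear_term
    if total == 0.0:
        raise ValueError("all evidence has zero likelihood under both classes")
    return cloud_term / total


# Hypothetical clear pixel inside sun glint: the first (glint-sensitive)
# classifier sees a bright surface and favors cloud; the other two mildly
# favor clear.
likelihoods = [(0.9, 0.1), (0.3, 0.6), (0.3, 0.6)]

glint_test_off = cloud_posterior(0.5, likelihoods, [False, True, True])
glint_test_on = cloud_posterior(0.5, likelihoods, [True, True, True])
print(round(glint_test_off, 2))  # about 0.20: clear, but not confidently so
print(round(glint_test_on, 2))   # about 0.69: pulled below a confident "cloudy"
```

With these illustrative numbers, dropping the glint-sensitive test leaves a nonzero cloud probability, and leaving it on still yields a less-than-confident result because the remaining classifiers pull the posterior toward clear.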
10. Estimating uncertainty
One of the strengths of the Bayesian approach to cloud detection is the ability to generate estimates of the uncertainty. As stated above, pixels with cloud probability exceeding 0.5 are assumed to be cloudy. For these pixels, the uncertainty estimate is defined as 1 − (posterior probability). For clear pixels where the cloud probability falls below 0.5, the uncertainty is simply defined as the posterior probability. Using this definition, the uncertainty cannot exceed 0.5. As shown in Fig. 8, the cloud-detection uncertainty is elevated in the presence of sun glint and cloud edges. The cloud-detection uncertainty is also elevated when the classifiers offer little skill in cloud detection such as for conditions that occur in the polar regions. This behavior varies with solar viewing conditions. Figure 9 shows the global variation in the cloud-detection uncertainty computed for the entire year of 2007 for NOAA-18 and NOAA-15. The upper-left panel shows the NOAA-18 descending node (0130 LST), the upper-right panel shows the NOAA-15 descending node (0700 LST), the lower-left panel shows the NOAA-18 ascending node (1330 LST), and the lower-right panel shows the NOAA-15 ascending node (1900 LST). All four images indicate that the uncertainties are highest in Antarctica with values approaching 0.3–0.4 over most of that region. Values of 0.2 are seen in the Arctic and in the high-latitude land regions. In addition, there is a distinct variation with observation time. Uncertainties of 0.1–0.2 are observed over most arid land regions during the terminator orbit conditions (0700/1900 LST). In general, the 1330 LST uncertainty field is the smallest of all four. In PATMOS-x, these uncertainty values are included in the dataset and are available for use in placing confidence estimates on all cloud fraction time series.
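The uncertainty definition given at the start of this section reduces to a short function. This is a minimal sketch with a hypothetical function name; a probability of exactly 0.5, which the text does not assign to either class, yields 0.5 under both branches.

```python
def detection_uncertainty(cloud_probability):
    """Cloud-detection uncertainty as defined in the text.

    Cloudy pixels (probability above 0.5): 1 - posterior probability.
    Clear pixels (probability below 0.5):  the posterior probability itself.
    The result can never exceed 0.5.
    """
    if not 0.0 <= cloud_probability <= 1.0:
        raise ValueError("cloud_probability must lie in [0, 1]")
    if cloud_probability > 0.5:
        return 1.0 - cloud_probability
    return cloud_probability


# Confident pixels give small uncertainties; ambiguous ones approach 0.5.
for p in (0.02, 0.45, 0.55, 0.98):
    print(p, round(detection_uncertainty(p), 2))
```

The two branches are equivalent to taking the smaller of the posterior probability and its complement, which is why the value is bounded by 0.5.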
11. Comparison of the PATMOS-x Bayesian cloud-detection performance with other methods
The PATMOS-x approach for cloud detection derived here has been applied to the entire record of AVHRR GAC data (1978–2009). The cloud fractions and other PATMOS-x cloud properties were submitted recently to the Global Energy and Water Cycle Experiment (GEWEX) cloud climatology assessment project led by C. Stubenrauch. The GEWEX effort required satellite cloud climate datasets to be submitted in network common data form (netCDF) with a spatial resolution of 1° × 1°. The providers included over 10 groups. In this section, we compare the PATMOS-x results with the results from other satellite imagers that were flying at roughly the same time and that employ visible, near-infrared, and infrared channels for cloud detection. We therefore compared with the results from the MODIS Science Team (MODIS-ST) (Frey et al. 2008; Ackerman et al. 2008), the MODIS Clouds and the Earth’s Radiant Energy System (CERES) Team (MODIS-CE) (Trepte et al. 2002; Minnis et al. 2008) and the International Satellite Cloud Climatology Project (ISCCP) D1 results (Rossow and Garder 1993). In addition, we included the CALIPSO Science Team (CALIPSO-ST) (Winker et al. 2009) results. All results except ISCCP were averages of the 0130 and 1330 LST datasets. The ISCCP data were the averages of the 0300 and 1500 LST data because there are no ISCCP products at 0130/1330 LST. The data used here are the monthly averages for 2007. These GEWEX datasets will be available to the public after the GEWEX assessment report is submitted in 2011.
The goal of this analysis is to demonstrate that the naive Bayesian cloud detection applied in PATMOS-x generates global cloud amount values consistent with those from other existing satellite-based climatologies that are generated using very different techniques applied to different sensors. The MODIS-ST and MODIS-CE datasets use cloud-detection schemes that are based on thresholds and utilize many more channels than those provided by the AVHRR and/or used in the PATMOS-x Bayesian mask. The native spatial resolution of the MODIS datasets is 1 km and is therefore much finer than that of the AVHRR GAC data. In contrast, the ISCCP cloud detection is based largely on comparison of the 0.63- and 11-μm observations to internally generated clear-sky estimates. The native spatial resolution of the ISCCP observations varies but is generally closer to that of the AVHRR GAC than to that of the MODIS observations. One overall difference between PATMOS-x and the others is that PATMOS-x relies more heavily on fast radiative transfer calculations and high-spatial-resolution ancillary datasets. As stated before, PATMOS-x does this to obtain globally consistent results with the limited information from the AVHRR. The comparison to these other datasets is a measure of the success of PATMOS-x in achieving this goal.
Figure 10 shows the mean global cloud fraction map derived for 2007 using the GEWEX submitted data for PATMOS-x, CALIPSO-ST, MODIS-ST, and MODIS-CE. For visual clarity, the CALIPSO-ST data were smoothed using a five-point kernel. In addition, the CALIPSO-ST results submitted to GEWEX use the 5-km product, resulting in higher cloud sensitivity and therefore slightly higher cloud coverage than that used in the derivation of the Bayesian classifiers. As Table 2 shows, the cloud amounts in the training set between the 1-km CALIPSO and the naive Bayesian results are more similar. The overriding feature of Fig. 10 is the high degree of similarity in the results for each dataset. Other features are apparent though. For example, the MODIS-ST results show more cloud over northeast Asia and the CALIPSO-ST results show an increase in tropical cloudiness.
To further quantify these differences, Fig. 11 shows the global maps of the mean of the magnitude of the monthly differences relative to the PATMOS-x values. By computing the mean magnitude, we prevent positive and negative differences from cancelling each other out in the annual mean. Figure 11 shows the pervasive difference between CALIPSO-ST and PATMOS-x (and the others) in the tropics. The results show that CALIPSO-ST and PATMOS-x differ by up to 0.2 over some regions of the tropics and by roughly 0.1 over most other regions. Also, the MODIS-ST differences in northeast Asia are more apparent in Fig. 11. Included in the upper-left panel are the ISCCP results. ISCCP and PATMOS-x differences are largest over land with differences approaching 0.2 over central Asia. The MODIS-CE results show small mean differences with PATMOS-x over most regions. An analysis of the anomaly correlations of the monthly time series for 2007 indicated that all datasets were highly correlated in most regions, which indicates that differences shown in Fig. 11 are systematic biases. The exceptions are the differences with MODIS-ST in northeastern Asia and the differences between all datasets in Antarctica and Greenland.
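The reason for averaging the magnitude of the monthly differences, rather than the signed differences, can be seen in a small example for a single grid cell. The function names and the cloud-fraction values below are hypothetical and chosen only to show the cancellation effect; they are not taken from the GEWEX datasets.

```python
def mean_difference(other, reference):
    """Mean of the signed monthly differences (other - reference)."""
    if len(other) != len(reference) or not other:
        raise ValueError("series must be non-empty and of equal length")
    return sum(o - r for o, r in zip(other, reference)) / len(other)


def mean_magnitude_of_difference(other, reference):
    """Mean of the magnitude of the monthly differences, as used for Fig. 11."""
    if len(other) != len(reference) or not other:
        raise ValueError("series must be non-empty and of equal length")
    return sum(abs(o - r) for o, r in zip(other, reference)) / len(other)


# Hypothetical monthly cloud fractions for one grid cell.
reference = [0.60, 0.60, 0.60, 0.60]   # stand-in for the PATMOS-x values
other = [0.70, 0.50, 0.70, 0.50]       # stand-in for another dataset

print(round(mean_difference(other, reference), 3))               # signed differences cancel
print(round(mean_magnitude_of_difference(other, reference), 3))  # magnitudes do not
```

In this illustration the signed mean is essentially zero even though the two series disagree by 0.1 in every month, whereas the mean magnitude retains that disagreement.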
Figure 12 shows a zonal cloud fraction distribution computed over 2007. The CALIPSO-ST data are not available within 5° of the poles. The zonal comparisons illustrate the relatively larger cloud amounts provided by CALIPSO-ST over the tropics. Excluding CALIPSO-ST, the results from the other datasets do not vary by more than 5% over any zones outside of the polar regions. The PATMOS-x and MODIS-CE results appear to be within 2% for the nonpolar regions. Figure 13 shows the monthly averaged global cloud fraction time series for 2007 for each dataset. The shape of the results indicates that the general high correlation seen in the regional analysis also appears globally. CALIPSO-ST returns the highest global cloud fractions with values exceeding 70% for most months. If one ignores CALIPSO-ST, the other datasets including PATMOS-x all agree within 2% for each month and give global cloud fractions ranging from 64% to 68%. Again, the PATMOS-x and MODIS-CE results agree to within 1% for each month of 2007.
In summary, the comparisons made possible by the GEWEX datasets indicate that the naive Bayesian cloud detection applied in PATMOS-x generates global cloud fractions that are similar to those of other accepted satellite cloud datasets, including those from advanced sensors such as MODIS. Given that the radiometric performance of the AVHRR across the record is similar to that of the NOAA-18 sensor used in this comparison, we can have confidence that PATMOS-x performance will remain stable throughout the 1980s and 1990s.
The naive Bayesian approach has been shown to offer an effective method to translate the unprecedented skill in cloud detection offered by CALIPSO into the AVHRR, a sensor that provides a uniquely long data record. The approach derived for the PATMOS-x algorithm used six classifiers developed over seven surface types. The resulting Bayesian cloud-detection scheme was shown to provide probability of correct detection (POD) metrics of roughly 90% over ocean, desert, and snow-free land; 80% over snow-covered land and Arctic sea ice; and roughly 70% over Antarctica and Greenland. Comparisons with existing global cloud-detection rates from other passive satellite sensor datasets indicate that PATMOS-x is in close agreement with them. Our analysis using data from 2007 indicates that PATMOS-x agrees to within 5% for global and zonal means with the monthly averaged data from ISCCP and two MODIS-based datasets. This algorithm has been applied successfully to all AVHRR data through PATMOS-x, and a nearly continuous dataset from 1981 to 2009 is now available from the NOAA National Climatic Data Center.
In the future, we intend to enhance the ability to model clear-sky observations using improved ancillary data. We also plan to apply these techniques to the Visible and Infrared Imaging Radiometer Suite (VIIRS), which is the successor to the AVHRR, with the goal of extending PATMOS-x beyond the POES era.
We appreciate the contributions of the GEWEX cloud climatology assessment led by Claudia Stubenrauch, and the GEWEX cloud data provided by William Rossow, David Winker, Steve Ackerman, Brent Maddux, and Patrick Minnis. The views, opinions, and findings contained in this report are those of the author(s) and should not be construed as an official National Oceanic and Atmospheric Administration or U.S. Government position, policy, or decision.