New methods and software for cloud detection and classification at high and midlatitudes using Advanced Very High Resolution Radiometer (AVHRR) data are developed for use in a wide range of meteorological, climatological, land surface, and oceanic applications within the Satellite Application Facilities (SAFs) of the European Organisation for the Exploitation of Meteorological Satellites (EUMETSAT), including the SAF for Nowcasting and Very Short Range Forecasting Applications (NWCSAF) project. The cloud mask employs smoothly varying (dynamic) thresholds that separate fully cloudy or cloud-contaminated fields of view from cloud-free conditions. Thresholds are adapted to the actual state of the atmosphere and surface and the sun–satellite viewing geometry using cloud-free radiative transfer model simulations. Both the cloud masking and the cloud-type classification are done using sequences of grouped threshold tests that employ both spectral and textural features. The cloud-type classification divides the cloudy pixels into 10 different categories: 5 opaque cloud types, 4 semitransparent clouds, and 1 subpixel cloud category. The threshold method is fuzzy in the sense that the distances in feature space to the thresholds are stored and are used to determine whether to stop or to continue testing. They are also used as a quality indicator of the final output. The atmospheric state should preferably be taken from a short-range NWP model, but the algorithms can also run with climatological fields as input.
Accurate objective and automatic identification of clouds and aerosols in satellite imagery is important for a large number of geophysical applications. The detection of clouds is not only an essential first step in a cloud-type classification or in the retrieval of other cloud parameters (e.g., the cloud-top temperature and height), it is also essential first to mask out any cloud- or aerosol-contaminated field of view (FOV) in order to retrieve both surface and atmospheric parameters. Retrievals of the sea surface temperature (SST), sea ice extent, snow cover, land surface albedo, and so on, all require and will assume absolutely cloud free conditions.
Furthermore, clouds and aerosols significantly affect the incoming shortwave and outgoing longwave radiation of the earth's atmosphere and therefore play a key role in the climate system and in attempts to predict or simulate the future climate. Studies using GCMs have shown that small changes in the earth's cloud cover could have the same effect on the planetary temperature as doubling or halving the atmospheric carbon dioxide (CO2) concentration (Slingo 1990; Houghton et al. 2001).
Much of the present uncertainty and inaccuracy of GCM predictions of the future climate stems from difficulties in simulating cloud and aerosol processes realistically. The recent availability of longer time series of satellite-observed cloud parameters, globally (Rossow and Schiffer 1991, 1999) and regionally (Karlsson 1997, 2003), provides important tools for independent verification of cloud-climate simulations.
Just as in climate prediction or long-range forecasting, clouds and precipitation also remain some of the most inadequately simulated parameters in short- and medium-range numerical weather forecasting. As a consequence, it is natural that much research has been devoted over the past two decades to the problem of objective cloud detection and classification from satellite data.
The experience at the Swedish Meteorological and Hydrological Institute (SMHI) of automatic cloud detection and analysis from satellite data stems mainly from the development and operational use of the SMHI Cloud Analysis Model Using Digital Advanced Very High Resolution Radiometer (AVHRR) Data (SCANDIA), which is a multispectral cloud analysis scheme based on the processing of National Oceanic and Atmospheric Administration (NOAA) AVHRR data. Principles of SCANDIA are described by Karlsson (1989), and a more comprehensive description is provided by Karlsson and Liljas (1990) and Karlsson (1996). SCANDIA is one of several operational AVHRR cloud algorithms that have been developed since the late 1980s. Some other related schemes of the same era are the AVHRR Processing Scheme over Clouds, Land, and Ocean (APOLLO; Saunders and Kriebel 1988; Kriebel et al. 2003), Luminances en Exploitation (LUX; Derrien et al. 1993), and Clouds from AVHRR (CLAVR; Stowe et al. 1991, 1999).
The above-mentioned schemes have several similarities. The use of the typical differences in cloud appearances in all five spectral channels of the AVHRR instrument by applying sequences of threshold tests is common; however, threshold tests are coupled in SCANDIA, meaning that the identification of a cloud-filled or cloud-contaminated pixel requires that several threshold tests (one for each applicable image feature) must be passed. Most other comparable operational schemes use several separate and independent tests for detecting cloud-free pixels, meaning that if one of the tests fails, the pixel is denoted as cloudy (or cloud contaminated). The approach in which tests are grouped together is sometimes referred to as the grouped threshold method (Baum and Trepte 1999) and seems to be the standard of today’s threshold schemes, for example, the Moderate Resolution Imaging Spectroradiometer (MODIS) cloud mask (Ackerman et al. 1998) and the updated APOLLO (Kriebel et al. 2003).
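The contrast between independent tests and the grouped threshold method can be sketched as follows. The feature names and threshold values below are invented for illustration and are not the operational ones.

```python
# Illustrative contrast between independent threshold tests and the
# grouped threshold method; feature names and thresholds are invented.

def independent_scheme(tests, pixel):
    # Cloud conservative: the pixel is labeled cloudy as soon as any
    # single test fires.
    return any(test(pixel) for test in tests)

def grouped_scheme(groups, pixel):
    # Grouped threshold method: the pixel is labeled cloudy only if
    # every test within at least one group of coupled tests passes.
    return any(all(test(pixel) for test in group) for group in groups)

# Hypothetical feature tests with invented threshold values.
cold = lambda p: p["t11"] < 260.0    # cold in the 11-um channel
bright = lambda p: p["r06"] > 0.25   # bright at 0.6 um

pixel = {"t11": 255.0, "r06": 0.10}  # cold but dark, e.g., cold land

# The independent design flags this pixel on temperature alone; the
# grouped design, requiring cold AND bright together, does not.
print(independent_scheme([cold, bright], pixel))  # True
print(grouped_scheme([[cold, bright]], pixel))    # False
```

The grouped design trades some detection sensitivity for a lower false-alarm rate over difficult (e.g., cold, snow-covered) surfaces.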
A common weakness, or potential weakness, of the earlier schemes mentioned is that they employ static or partially static thresholds that are regionally specific. In some cases, thresholds depend on the season or the sun elevation according to a few classes, as in SCANDIA, but no dynamic adaptation is made to the actual state of the atmosphere and surface at each individual FOV, and no consistent correction for the sun–satellite viewing geometry is made.
In addition to the traditional threshold method, there are many other well-established approaches to automatic cloud detection and classification using satellite data. Spatial coherence methods (Coakley and Bretherton 1982; Coakley and Baldwin 1984) derive overpass-dependent thresholds from two-dimensional radiance histograms and need a homogeneous and dark (low reflectance and warm) background, and thus they may be applied only over the sea. The International Satellite Cloud Climatology Project cloud detection (Rossow and Schiffer 1991, 1999), applied globally over both land and sea, makes extensive use of clear-sky radiance maps and derives thresholds from radiance histograms gathered over large segments. The thresholds are overpass dependent and dynamic but are constant over the individual segments. Dynamic clustering (e.g., Desbois et al. 1982) is also very dependent on the actual overpass and does not easily take onboard ancillary information. Furthermore, assigning a label to the found clusters is far from trivial and may require a large amount of training data. Bayesian methods (e.g., Uddstrom et al. 1999) require knowledge about the probability distribution of the data. For simplicity, a Gaussian distribution is often assumed; satellite data, however, are generally not Gaussian distributed. Methods like neural networks (e.g., Yhann and Simpson 1995) require large training sets, which may be regionally specific. A neural network algorithm will require new training data if it is to be adapted to a new satellite sensor or to another kind of ancillary information, for instance, when the NWP model being used is improved. An overview of some pattern-recognition techniques for the identification of clouds and cloud systems is given by Pankiewicz (1995).
Algorithm development has mainly been carried out in the framework of the European Organisation for the Exploitation of Meteorological Satellites (EUMETSAT) Satellite Application Facility (SAF) projects, starting in 1997 (see, e.g., Schmetz et al. 2002). The AVHRR cloud-mask and cloud-type algorithms (cloud mask and cloud type, respectively, hereinafter) and products described here are essential SAF retrievals for high- and midlatitude applications. The cloud mask is the basis for the retrieval of a number of other cloud and clear-air parameters developed in the SAFs. It will be used within the SAF for Nowcasting and Very Short Range Forecasting Applications (NWCSAF) project as input to the cloud type, which, in turn, is input to the Cloud-Top Temperature and Height (CTTH) and Precipitating Clouds products, which were also developed by SMHI. Equally important is that the cloud mask will be used extensively both within the SAF for Ocean and Sea Ice Applications (OSISAF) and the SAF for Land Analysis Applications (LandSAF) to retrieve surface parameters such as SST and land surface albedo. In addition, the cloud-mask and cloud-type (and the CTTH) algorithms are used in the SAF for Climate Monitoring (CMSAF) project to build regional cloud climatological descriptions and as input to retrievals of other cloud parameters (e.g., cloud optical thickness and cloud water path).
Corresponding algorithms for the cloud and precipitation parameters mentioned above will be retrieved from the Spinning Enhanced Visible and Infrared Imager (SEVIRI) on board the Meteosat Second Generation platform satellites, the first of which was launched in August of 2002. The SEVIRI cloud-mask, cloud-type, and CTTH algorithms have been developed by the Centre de Météorologie Spatiale (CMS) of Météo-France. The development of the geostationary and polar algorithms has been done in close coordination between SMHI and CMS, which has resulted in identical output categories and, where applicable, in algorithms with similar design.
The algorithms have been developed specifically for the AVHRR on board the current and future polar-orbiting NOAA and EUMETSAT Meteorological and Operational Weather Satellites (METOPs). However, with the extensive use of radiative transfer model (RTM) simulations for which the exact spectral response of the satellite sensor is accounted, it is straightforward to adapt the algorithms to other satellite sensors (e.g., MODIS on board the Terra and Aqua platforms) in a consistent way, provided that spectral channel differences are not too large. In the current implementation, the algorithms automatically handle both the AVHRR data stream without channel 3A, as provided by all NOAA satellites prior to NOAA-17, and the AVHRR data stream, as on NOAA-17 (with channel 3B during night and 3A during day). See Table 1 for an overview of the AVHRR/3 channels and their approximate wavelengths.
During algorithm development, we used high-resolution picture transmission data received at Norrköping, Sweden, from the AVHRR sensor on board NOAA-12, -14, -15, and -16.
The final algorithms, which have been run operationally at SMHI since June of 2004, require calibrated and map-projected AVHRR data as input. The EUMETSAT AVHRR Advanced Television and Infrared Observation Satellite (TIROS) Operational Vertical Sounder (ATOVS) Processing Package (AAPP; Whyte et al. 2001) is used to generate AAPP level-1b data for which calibration and navigation information is appended. In a second preprocessing step, the AVHRR data are calibrated and remapped using a nearest-neighbor resampling. In normal processing, a polar stereographic map projection and a pixel size of 1000 m × 1000 m is used. In this way, there is only a weak oversampling, and the loss of data is kept to a minimum. Using a nearest-neighbor resampling, no radiance values are altered. All further processing is performed in this (user defined) map projection and on one or several (user defined) areas.
Sun and satellite angles
The instantaneous sun–satellite viewing geometry is stored for every satellite FOV. The angles are the solar zenith angle θsun, the satellite zenith angle θsat, and the sun–satellite azimuth difference angle δϕ; θsat is always positive and increases from 0° at the subsatellite point to about 68° at the edges of the swath. The azimuth difference is defined as

δϕ = δϕ′ if δϕ′ ≤ 180°, and δϕ = 360° − δϕ′ otherwise,

where δϕ′ = |ϕsun − ϕsat|. Here, ϕsun is the sun azimuth (counted clockwise from the local north) and ϕsat is the satellite azimuth. Surface specular reflection on a perfect mirror will be observed by the satellite when δϕ = 180°, provided that θsun is equal to θsat.
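The folding of the raw azimuth difference into the interval 0°–180° and the resulting specular-reflection condition can be sketched as follows (a minimal scalar illustration; the operational code evaluates these angles for every FOV):

```python
def azimuth_difference(phi_sun, phi_sat):
    # Fold the raw azimuth difference into the interval [0, 180] deg.
    d = abs(phi_sun - phi_sat) % 360.0
    return 360.0 - d if d > 180.0 else d

def is_specular(theta_sun, theta_sat, phi_sun, phi_sat, tol=1e-6):
    # Specular reflection off a perfect mirror is observed when the
    # azimuth difference is 180 deg and the zenith angles are equal.
    return (abs(azimuth_difference(phi_sun, phi_sat) - 180.0) < tol
            and abs(theta_sun - theta_sat) < tol)
```

For example, azimuths of 10° and 350° give δϕ = 20°, not 340°, and azimuths differing by exactly 180° with equal zenith angles satisfy the mirror condition.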
Angle θsun is used to define what we refer to as day, night, and twilight when structuring the algorithm in different test sequences using different grouped threshold tests and different threshold offsets (see section 3 in general and section 3g in particular). According to the definition used here, it is day when θsun ≤ 80°, night when θsun ≥ 95°, and twilight for values of θsun that are in between.
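The day/night/twilight classification by θsun reads, as a minimal sketch using the thresholds defined in the text:

```python
def illumination(theta_sun):
    # Day: solar zenith angle <= 80 deg; night: >= 95 deg;
    # twilight for values in between (definitions from the text).
    if theta_sun <= 80.0:
        return "day"
    if theta_sun >= 95.0:
        return "night"
    return "twilight"
```

The returned class then selects which grouped test sequence and which threshold offsets are applied to the pixel.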
Land cover characterization and elevation
We use the 1-km global land cover characterization database (land use hereinafter), which is available from the U.S. Geological Survey (USGS) (see Anderson et al. 1976; Eidenshink and Faundeen 1994), mainly for separating land and water surfaces but also to identify barren and desert areas where shortwave IR emissivities are significantly below 1 (Salisbury and d'Aria 1994). Digital elevation model data are derived from the global 30 arc s digital elevation model topography database (GTOPO30; available at the time of writing online at http://edcdaac.usgs.gov/gtopo30/gtopo30.asp) and are used to separate low and gentle terrain from high and rough mountainous terrain.
The cloud-mask and cloud-type algorithms need a reliable estimate of the column-integrated water vapor (wv), the surface skin temperature Ts, and the temperature at the 950-hPa level T950, in full (map projected) pixel resolution. In addition, the cloud type also needs the atmospheric temperature at the 850-, 700-, and 500-hPa levels and the tropopause. These variables are used to divide the opaque clouds according to the cloud-top height (see Dybbroe and Thoss 2004).
The wv and Ts are input to the derivation of the dynamic thresholds, and Ts and T950 are used to identify low-level temperature inversions. Temperatures at the other vertical levels are used together with T950 to divide the opaque cloud types into different heights, which is useful for nowcasting.
Because the cloud mask and cloud type are used for nowcasting purposes, processing time becomes critical, and we therefore derive the thresholds prior to the satellite overpass. As a result, the best available atmospheric data are those provided by high-resolution NWP models. The software is able to ingest data from several NWP centers, including the European Centre for Medium-Range Weather Forecasts (ECMWF). During the algorithm development and validation, we have used data from the High-Resolution Limited-Area Model (HIRLAM; Undén et al. 2002). HIRLAM is a complete NWP system, including data assimilation (remote sensing data included) and a limited-area forecasting model, using a boundary relaxation scheme, with a comprehensive set of physical parameterizations. The system is developed jointly by the National Meteorological Services in Denmark, Finland, Iceland, Ireland, Netherlands, Norway, Spain, and Sweden. The forecast model exists in both a gridpoint version and a spectral version (used for the variational data assimilation). The gridpoint model has a nonhydrostatic kernel and uses a semi-Lagrangian discretization of the multilevel primitive equations using the hybrid coordinate η in the vertical direction. The version used in our development had 31 vertical levels and a horizontal spacing of 44 km. [See Undén et al. (2002) for details about the HIRLAM system.]
We use short-range forecasts with lead times between 6 and 15 h. The lower limit of 6 h avoids possible problems related to model spinup. HIRLAM forecast output is available at 1-h resolution, and we take the forecast field closest in time to the satellite overpass. The coarse-resolution NWP data are remapped to pixel resolution (e.g., 1 km), using bilinear interpolation, before being input to the derivation of thresholds or the cloud-mask and cloud-type test sequences. The interpolation is done separately over land and water, and the results are then merged using the high-resolution land-use data to resolve the sometimes strong gradients, especially in Ts, in the coastal zone.
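A simplified sketch of this remapping step, assuming a regular coarse grid aligned with the output pixel grid (the operational resampling additionally handles map projections and true geolocation):

```python
import numpy as np

def bilinear_resample(field, out_shape):
    # Bilinear interpolation of a 2-D field onto a finer regular grid.
    ny, nx = field.shape
    oy, ox = out_shape
    y = np.linspace(0, ny - 1, oy)
    x = np.linspace(0, nx - 1, ox)
    y0 = np.clip(np.floor(y).astype(int), 0, ny - 2)
    x0 = np.clip(np.floor(x).astype(int), 0, nx - 2)
    wy = (y - y0)[:, None]
    wx = (x - x0)[None, :]
    f00 = field[np.ix_(y0, x0)]
    f01 = field[np.ix_(y0, x0 + 1)]
    f10 = field[np.ix_(y0 + 1, x0)]
    f11 = field[np.ix_(y0 + 1, x0 + 1)]
    return ((1 - wy) * (1 - wx) * f00 + (1 - wy) * wx * f01
            + wy * (1 - wx) * f10 + wy * wx * f11)

def remap_skin_temperature(ts_land, ts_water, land_mask):
    # Interpolate the land and water analyses separately, then merge
    # with the high-resolution land/sea mask so that coastal gradients
    # in Ts are not smeared across the shoreline.
    shape = land_mask.shape
    land = bilinear_resample(ts_land, shape)
    water = bilinear_resample(ts_water, shape)
    return np.where(land_mask, land, water)
```

Here `ts_land` and `ts_water` are assumed to be coarse NWP fields already separated by surface type; how those separated fields are produced is not specified in the text.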
The cloud mask and cloud type are, in practice, two individual retrievals, where the cloud type can be seen as a postprocessing step to the cloud mask.
The cloud mask seeks to identify all absolutely cloud free pixels in a satellite overpass, and, when the sun is above the horizon and cloud-free conditions prevail, the algorithm additionally provides information on the presence of sea ice or snow cover on the ground.
Except when outside the swath or when processing is stopped by erroneous input data, a pixel may take one of the following four values: cloud free, cloud contaminated, cloud filled, or snow/ice contaminated. The cloud mask also produces for each pixel a set of processing flags to describe the prevailing conditions (viewing geometry, atmospheric state, etc.) under which the pixel was classified, and a set of threshold test flags to keep track of the decisive (see section 3g) threshold tests.
Pixels with excessive aerosol contamination from smoke from biomass burning or from heavy industrial pollution events are classified as cloudy by the cloud mask. During the daytime, however, and only when 3.7-μm data are available, an additional aerosol/fire-flags product may be derived, aiming at the detection of these events as well as volcanic plumes and hot spots from forest fires. This product is not yet fully mature because of a lack of training and validation data (see Part II), and the methods are based directly on the literature (e.g., Baum et al. 1997).
The cloud-mask output is input to the cloud type, which performs a classification on the cloud-filled and cloud-contaminated pixels into 10 different cloud categories. Figure 1 shows an example image of the cloud type, together with the corresponding red–green–blue (RGB) color composite, using channels 1, 2, and 4 over northern Scandinavia at midday on 2 May 2003. The 10 cloud categories are grouped into opaque clouds (five different classes according to the height), semitransparent cirrus clouds (four different classes according to the opacity), and fractional clouds (one class). Similar to the cloud mask, a set of processing flags is attached to the cloud-type output. The cloud-type algorithm and output are further detailed in Dybbroe and Thoss (2004).
In the development and design of the new SAF cloud-mask and cloud-type algorithms, three things have been of particular importance as compared with the operational SCANDIA and similar cloud classification schemes. First, we wanted to improve corrections for bidirectional and atmospheric effects. Second, rather than static climatologically determined thresholds, wherever possible, dynamic thresholds that depend on the current atmospheric state, the actual illumination, viewing conditions, and the state of the surface at the satellite footprint were used. Third, it should be possible to adapt the algorithms in a consistent way to future AVHRR and AVHRR-like sensors.
The way in which we have chosen to meet these requirements is to use RTMs to simulate the satellite signal as it would have been observed in cloud-free but otherwise similar conditions.
Table 2 shows the utilized image features for AVHRR/2 (no channel 3A) and AVHRR/3 (channel 3A during the day, and channel 3B during the night). As seen from Table 2, we use both spectral features and local texture (spatial variability) features. The spectral features are both simple linear combinations (differences) of channel reflectances or brightness temperatures and nonlinear features, such as the ratio of the 1.6- and 3.7-μm bidirectional reflectances over the 0.6-μm bidirectional reflectance. Only the feature created from the difference between T11 and the forecast Ts employs nonsatellite data. The texture features are all derived by taking the standard deviation of a spectral feature over a 5 × 5 pixel box centered on the given pixel.
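The texture computation can be sketched as follows; edge handling here uses reflective padding, which is an assumption on our part, since the operational edge treatment is not specified in the text:

```python
import numpy as np

def texture(feature, size=5):
    # Local texture: standard deviation of a spectral feature over a
    # size x size box centered on each pixel. Reflective padding at
    # the image edges is an assumption, not the documented behavior.
    pad = size // 2
    f = np.pad(feature, pad, mode="reflect")
    # Stack all shifted views covering the box, then take the
    # standard deviation across them for every center pixel.
    windows = np.stack([
        f[dy:dy + feature.shape[0], dx:dx + feature.shape[1]]
        for dy in range(size) for dx in range(size)
    ])
    return windows.std(axis=0)
```

For example, `texture(t11_image)` would give the T11text feature: zero over a radiometrically flat scene and large across cloud edges.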
The feature creation and selection has been done, to a large extent, following past experience with using thresholds for AVHRR data for cloud classification, as in, for example, SCANDIA, APOLLO, LUX, and CLAVR. However, there are also a few new features as compared with these known cloud schemes, for example, QR3.7R0.6 and (T3.7 − T12)text [see Table 2 and sections 3c(1) and 3c(3) below].
There is, of course, a significant amount of redundant information in this large set of features used. The dimension of the feature space described by the 14 selected features is not 14 but rather is somewhat smaller, and many of the features are strongly dependent on each other. A principal component analysis, using a hypothetical large amount of training data, might show that the features chosen are not optimal for the description of the feature space. However, for our purpose, it has been important to select features for which it is possible to reason in a consistent physical manner on how they may be affected when clouds with different physical characteristics (in terms of particle size, shape, and phase distribution) fill the FOV.
Below, we give a short summary explanation and motivation for the image features listed in Table 2. More physical explanations and how the features are applied can be found in Tables 3, 4, 5 and 6. Table 7 provides an overview of how often the individual feature tests are successful for all of the cloudy and all of the snow-covered pixels. For a more comprehensive presentation and discussion of the features selected, see Dybbroe and Thoss (2004). For the standard image features introduced previously in, for example, SCANDIA, APOLLO, LUX, or CLAVR, see Karlsson and Liljas (1990), Karlsson (1996), Saunders and Kriebel (1988), Derrien et al. (1993), and Stowe et al. (1991).
Visible (VIS)/near-infrared (NIR) features
We employ six different spectral features using the solar reflectance information from AVHRR channels 1, 3A, and 3B (see Table 2). When the sun is sufficiently above the horizon (sun elevation higher than 4°), the bidirectional reflectance at 0.6 μm (R0.6) is used to detect any cloud- or snow-contaminated pixel over a dark background (snow-free and cloud-free land or ocean surface). Feature R0.6 is tested using dynamic thresholds (see section 3f). However, we use conservative threshold offsets, ensuring that the actual threshold values lie relatively far from the average (or normal) cloud-free values. This is because of known difficulties in realistically simulating the top-of-atmosphere (TOA) bidirectional reflectance. Proper surface bidirectional reflectance distribution functions (BRDFs) and RTM simulations are used over both land and water (see section 3f). Poor knowledge about the actual state of the surface within the satellite FOV, together with its large spatial and temporal variations, is an important limiting factor here.
When the sun is close to the horizon (sun elevation lower than 4°), R0.6 cannot be used. Instead, we utilize the bidirectional reflectance at 0.6 μm computed as if the sun were at zenith (pseudo-R0.6) [i.e., not applying the correction factor 1/cos(θsun)]. This feature is introduced to utilize fundamental reflection differences (e.g., between snow/clouds and snow-free/cloud-free surfaces) even at very low sun elevations. It is tested against an empirically derived threshold that depends linearly on the sun elevation. The feature is especially important in cases in which a pixel is still illuminated but channel 3A has not yet been switched on, because the channel-3B signal becomes ambiguous at very low sun elevations.
The bidirectional reflectance at 1.6 (R1.6) or 3.7 (R3.7) μm, either used alone or in combination with R0.6 as reflectance ratios (QR1.6R0.6 and QR3.7R0.6), is particularly useful in the detection of low water clouds over snow-covered surfaces (over land and sea ice) and over water bodies in sunglint (see Part II, their Fig. 7); R3.7 is derived by subtracting the emission part from the 3.7-μm radiance using AVHRR channel 4 under the assumption of unit emissivity [see Allen et al. (1990) for a derivation].
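A sketch of this R3.7 derivation, using a monochromatic Planck approximation at the 3.7-μm central wavelength rather than integration over the true channel response (the operational code accounts for the exact spectral response), and with the band solar irradiance supplied by the caller as an assumed input:

```python
import math

# Physical constants (SI units).
H = 6.626e-34    # Planck constant [J s]
C = 2.998e8      # speed of light [m s-1]
KB = 1.381e-23   # Boltzmann constant [J K-1]

def planck_radiance(wavelength_m, temp_k):
    # Monochromatic Planck radiance [W m-2 sr-1 m-1].
    a = 2.0 * H * C**2 / wavelength_m**5
    b = H * C / (wavelength_m * KB * temp_k)
    return a / math.expm1(b)

def r37(l37_obs, t11, mu0, solar_irr_37):
    # Pseudo-reflectance at 3.7 um: subtract the thermal emission,
    # estimated from T11 under unit emissivity, from the observed
    # radiance and normalize by the reflected-solar term.
    # solar_irr_37 (spectral solar irradiance at 3.7 um) and the
    # monochromatic treatment are simplifying assumptions here.
    emission = planck_radiance(3.7e-6, t11)
    return (l37_obs - emission) / (mu0 * solar_irr_37 / math.pi - emission)
```

By construction, an observed radiance equal to the estimated emission yields R3.7 = 0, and any reflected-sunlight excess yields a positive reflectance.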
As seen from Table 2, we do not use AVHRR channel 2 at 0.9 μm in the cloud masking despite its proved usefulness in cloud detection (e.g., Saunders and Kriebel 1988). The feature, combining channels 1 and 2 either as a difference (as in SCANDIA) or as a ratio (as in APOLLO), has proven to be valuable, especially at low sun elevations. However, we have found that for cloud-screening purposes the 0.9-μm reflectance does not add significant additional information that is not already available in the VIS and NIR channels 3A or 3B. Also, the bidirectional reflectance at 1.6 and 3.7 μm carries more information on the microphysical state of the cloud top as compared with the signal at 0.9 μm.
Five spectral features utilize the thermal infrared signal of the AVHRR channels 3B, 4, and 5. The difference of T11 minus T3.7 (T11 − T3.7) is used to detect water clouds at night and low clouds in shadow. This is the well-known feature introduced by Eyre et al. (1984), using the findings of Hunt (1973).
The 11-μm brightness temperature (T11) is normally compared with the estimated thermodynamic skin temperature Ts in the difference feature T11 − Ts. Only in the detection of snow-covered areas, where Ts is generally more inaccurate, do we use T11 alone using a static threshold (see Table 3). T11 − Ts is normally used with a high offset to avoid misclassifying cold land surfaces as clouds, and it is only used alone when all other usual low-cloud features fail (in twilight and at nighttime when the cloud droplets are sufficiently large so as to produce a signal resembling that of ice clouds). See Table 4 for the exact values used.
The IR difference features T11 − T12 and T3.7 − T12 are used for the detection of thin cirrus as well as all types of nonhomogeneous (over the FOV) clouds. The thin cirrus detection is based on the fact that the transmissivity of these clouds decreases with increasing wavelength (Inoue 1985). Because of contamination by solar reflectance in channel 3B, thin cirrus detection during day is done using T11 − T12. However, T3.7 − T12 is more sensitive to thin cirrus, except over desert or barren surfaces (where the emissivity at 3.7 μm is much lower than 1), and has proven useful for nighttime cirrus detection on the new NOAA (-15 and later) satellites where the earlier-observed periodic noise (Warren 1989) seems to have been greatly reduced.
Because of the nonlinearity of the Planck function, clouds only partially filling the FOV give rise to an enhanced signal (over the cloud-free case), especially in T3.7 − T12 (Saunders 1986).
Three local texture features using AVHRR channels 1, 3B, 4, and 5 are used to detect cloud edges and subpixel clouds over sea. They complement the IR-difference features discussed above. All three of the texture-based tests utilize the solar reflectance when the sun is above the horizon, as shown in Table 6, to avoid misclassifying clear pixels in sea surface temperature gradient zones as being cloud contaminated. The purely IR-based feature T11text is used in combination with either (T3.7 − T12)text or R0.6text. The (T3.7 − T12)text feature is not used in sunglint situations, in which cloud-edge detection is instead attempted using R0.6text and T11text in combination. The motivation for the local texture in T3.7 − T12 is this feature's sensitivity to a nonhomogeneous FOV, as discussed above.
Low-level temperature inversions
Separation of low clouds and cloud-free land or water surface during night and in twilight sometimes becomes a particularly challenging, if not impossible, task when using satellite data alone.
In the absence of sunlight, the difference T11 − T3.7 is normally large enough to allow detection of low water clouds. However, exceptions do occur when this difference is so small that separation from the cloud-free case becomes ambiguous. Over the open ocean, low clouds may contain large water droplets, and during the mid- to high-latitude winter season, low clouds may sometimes consist of a sufficiently high amount of ice particles. In both cases the shortwave IR emissivity at the cloud top is high enough to erase the normally existing difference T11 − T3.7.
When the sun is just above the horizon, T3.7 is raised by the additional radiance from reflected sunlight. At some stage, this additional radiance completely cancels the difference T11 − T3.7, and water-cloud detection becomes impossible. The new configuration employed, for example, on NOAA-17, in which channels 3A and 3B are switched when passing between day and night, does not eliminate this problem. First, because the switching is done from one line to another, there will inevitably be regions without sunlight and without channel 3B data. Second, even with the sun above the horizon, the 1.6-μm (or 3.7 μm) reflectance signal might not be sufficiently strong to be used unambiguously to separate water clouds from cloud-free land surfaces.
In the above-described situations, the only possibly useful feature is T11 − Ts. However, it requires that the cloud top is sufficiently colder than the surface, and this is not always the case. In particular over land and during the winter season, low-level temperature inversions are normally very frequent and the cloud top is often as warm as, or even warmer than, the surface.
Comparing HIRLAM short-range forecasts and radiosonde profiles during wintertime situations over northern Europe, we defined a low-level inversion indicator as Ts − T950. Even though the strength is often not correctly predicted, HIRLAM proved generally capable of correctly detecting the presence of low-level inversions. The 950-hPa level may seem low, but the cases on which we focus occur in stationary conditions, usually with high surface pressure. No attempt is made to detect inversions in high terrain, because NWP profiles are less reliable at low levels there. For the same reason, the threshold offset in T11 is generally set to be high in mountainous terrain (see Table 4). Also, the low-level temperature-inversion check is not applied over open water. Here, Ts is high and normally well forecast, inversions are seldom very strong, and, most important, the surface is black and fairly smooth, making reflectance and local texture features more useful.
The presence of a low-level inversion as detected with this method is a direct indication of a high risk for significant errors in the estimated Ts (frequently the skin temperature is much lower in reality). If an algorithm were to ignore such expected errors and rely on the T11 only and its threshold (derived assuming a correct Ts), cloud-free areas at night or in twilight would frequently be misclassified as cloudy.
When a low-level inversion is detected, a particular bit in the processing-flags dataset is set, implicitly indicating low quality. If the inversion is strong (defined by Ts − T950 ≤ −5 K), no cloud detection using only the T11 feature is attempted, and thus certain wintertime stratus and areas of water clouds in twilight may remain undetected. However, if the inversion is weak (0 > Ts − T950 > −5 K) or absent according to the NWP model, the T11 feature is utilized to detect cold, low clouds without any clear nighttime water-cloud signal in the T11 − T3.7 feature. The T11 offsets are provided in Table 4.
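The inversion logic described above can be sketched as:

```python
def inversion_strength(ts, t950):
    # Low-level inversion indicator: negative values indicate an
    # inversion (surface colder than the 950-hPa level).
    return ts - t950

def inversion_flag(ts, t950):
    # Processing-flag bit: set whenever any low-level inversion is
    # present according to the NWP model.
    return inversion_strength(ts, t950) < 0.0

def t11_test_allowed(ts, t950):
    # Strong inversion (Ts - T950 <= -5 K): skip the T11-only test.
    # Weak (between -5 and 0 K) or absent inversion: the T11 feature
    # may still be applied, with the offsets given in Table 4.
    return inversion_strength(ts, t950) > -5.0
```

For example, a forecast Ts of 270 K over a 276 K 950-hPa temperature (a 6-K inversion) suppresses the T11-only test, while a 2-K inversion merely sets the quality flag.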
Probability for sunglint
Sunglint is the specular reflection of sunlight over water surfaces. The probability of sunglint (Psunglint) is calculated using the formula derived by Berendes et al. (1999), which depends purely on the sun–satellite viewing geometry. Berendes et al. derived their formula from the distribution found by Cox and Munk (1954), assuming an isotropic average wind field with wind speeds in the range of 0–14 m s−1. We use this Psunglint measure only as an indicator for risk of sunglint. If the probability is larger than a threshold, currently set to 0.1%, we activate the processing flag for the presence of sunglint and invoke specially designed sunglint threshold sequences.
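The flagging step can be sketched as follows; the Berendes et al. (1999) geometry formula itself is not reproduced here, so the probability value is assumed to be precomputed:

```python
SUNGLINT_PROB_THRESHOLD = 0.001  # 0.1%, the threshold given in the text

def sunglint_flag(p_sunglint):
    # Activate the sunglint processing flag, and thereby the
    # dedicated sunglint threshold sequences, when the geometric
    # sunglint probability exceeds 0.1%.
    return p_sunglint > SUNGLINT_PROB_THRESHOLD
```

Note that the flag marks only a risk of sunglint from geometry; it does not assert that glint is actually present in the pixel.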
Derivation of thresholds
For every satellite overpass we derive a unique set of feature thresholds in full pixel resolution and applicable only for that particular overpass. The thresholds applied to the spectral features R0.6, T11, T11 − T3.7, T11 − T12, T3.7 − T12, and T11 − Ts are what we call dynamic and have all been derived by calculating the cloud-free satellite signal using RTM simulations. They depend on the actual atmospheric state, the surface characteristics, and the sun–satellite viewing geometry. The applied threshold actually consists of a theoretically derived value such as the one illustrated in Fig. 4 (described below) and a, usually small, constant offset that may vary with the underlying surface (land, open water, or coast), the sun–satellite viewing geometry and illumination (night, twilight, day, or sunglint), presence of low-level temperature inversion, or topography (low or high terrain). The constant threshold offsets are found from algorithm tuning and are further discussed in Part II and in Dybbroe and Thoss (2004). Offset values for the different grouped tests are provided in Tables 3, 4, 5 and 6 and in Dybbroe and Thoss (2004).
For the rest of the features (see Table 2), that is, the textural features and the VIS and NIR reflectance features (except R0.6), the thresholds are determined empirically and are either static or depend linearly on one or two view-geometry parameters (usually θsun or δϕ).
The lack of both a sufficiently reliable global land-use database and accurate surface BRDFs, together with uncertainties in AVHRR geolocation, the highly anisotropic behavior, and the high pixel-to-pixel variability of the reflectance at 1.6 and 3.7 μm, prevented us from trying to simulate the TOA signals at these wavelengths.
The TOA IR radiance received at the satellite sensor under cloud-free conditions depends on θsat, the surface emissivity (ɛs), Ts, the atmospheric temperature profile, and the amount and distribution of water vapor, ozone, CO2, and other atmospheric gases along the observation path. We use the Radiative Transfer Model for TOVS, version 7 (RTTOV-7; Eyre 1991; Saunders et al. 1999; Matricardi et al. 2001), and vary only the water vapor, assuming that all of the other atmospheric constituents are constant.
Though it is feasible to run the RTM on the full atmospheric profiles of temperature and humidity, as provided by an appropriate (close in time) NWP short-range forecast, we think it would be very difficult to tune such thresholds consistently afterward. The thresholds would be sensitive to the actual shape of the profiles and, thus, sensitive to the NWP model that was used. We have instead chosen to follow another approach—approximating the full atmospheric profile with the surface skin temperature and the total column-integrated water vapor. This allows for the tabulation of IR thresholds as a function of θsat, Ts, and wv. In Fig. 2 we show a real image example of the T11 − T12 feature, together with the three corresponding input fields.
TOA cloud-free radiances of all of the IR channels of each individual AVHRR sensor were simulated, using RTTOV-7 on a wide range of atmospheric profiles and varying θsat and ɛs. Figure 3 shows the dependence of T11 − T12 on wv and Ts. Column water vapor and surface temperature are highly correlated, with high temperatures generally associated with large water vapor amounts. Although there is considerable variability in T11 − T12 for a given value of wv or Ts, especially in warm and humid atmospheres, the correlation between Ts and wv gives rise to a reasonably repeatable relationship between T11 − T12 and Ts for a given θsat and ɛs, as shown in Fig. 4. For each water vapor interval of 0.5 g cm−2, we performed a linear regression and added two standard deviations plus a temperature-dependent noise (NEΔT) offset to arrive at the tabulated thresholds. The noise is mainly due to the digitization error incurred when converting radiances to brightness temperatures through the nonlinear Planck function.
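The per-bin regression recipe can be sketched as follows on synthetic simulation output; the bin bookkeeping and the NEΔT value are illustrative assumptions, and the operational tables are of course built from the RTTOV-7 simulations described above:

```python
import numpy as np

def derive_thresholds(wv, ts, feature, bin_width=0.5, nedt=0.12):
    """Tabulate cloud-free IR feature thresholds per water-vapour bin:
    within each bin of wv (g cm-2), regress the simulated feature
    (e.g. T11 - T12) linearly on Ts, then add two standard deviations
    of the residuals plus a noise (NEdT) offset.  Returns
    {bin_start: (slope, offset)} so the threshold at a given Ts is
    slope * Ts + offset.  The NEdT value here is illustrative only."""
    wv, ts, feature = map(np.asarray, (wv, ts, feature))
    thresholds = {}
    start = np.floor(wv.min() / bin_width) * bin_width
    for lo in np.arange(start, wv.max() + 1e-9, bin_width):
        sel = (wv >= lo) & (wv < lo + bin_width)
        if sel.sum() < 3:                     # too few samples to fit
            continue
        slope, intercept = np.polyfit(ts[sel], feature[sel], 1)
        resid = feature[sel] - (slope * ts[sel] + intercept)
        thresholds[round(float(lo), 2)] = (
            float(slope), float(intercept + 2.0 * resid.std() + nedt))
    return thresholds
```

The +2σ margin keeps roughly 97.5% of simulated cloud-free cases below the threshold, which is the intent of building in a safety margin against natural variability and sensor noise.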
The input atmospheric states have been taken from the Thermodynamic Initial-Guess Retrieval (TIGR) database (see Chedin et al. 1985). Over land, we have assumed ɛs = 1.0. Though IR emissivities over land are far from constant, this assumption proved necessary because the global 1-km USGS land cover characterization database was found to be too inaccurate to allow for variable IR emissivities, such as those provided by Salisbury and d'Aria (1992, 1994). With an inaccurate land-use database and navigation errors on the order of one or more AVHRR footprints, a variable IR emissivity is likely to introduce unwanted errors. Instead, the uncertainty in ɛs over land is taken care of by using constant threshold offsets (see Part II, and the example provided in their Fig. 5). However, in barren and desert areas, feature tests involving the 3.7-μm channel are done with special offsets, thereby avoiding an otherwise likely overdetection of low clouds due to ɛs significantly lower than 1 at shorter IR wavelengths (Salisbury and d'Aria 1994).
Over sea we use the emissivities as tabulated by Masuda et al. (1988), assuming an average wind speed of 7 m s−1. Wu and Smith (1997) have developed an improved emissivity model for the sea surface, with a more rigorous basis in physics than that of Masuda et al. The model of Wu and Smith is more accurate, especially for very large view angles, but of importance here is that they showed that the Masuda et al. model agrees well with observations for moderate wind speeds and for view angles less than about 60°. Whereas the findings of Wu and Smith must be taken into account when attempting to retrieve SSTs, we conclude that the emissivity model of Masuda et al. is sufficient for cloud detection.
Thus, because ɛs over sea depends only on θsat, the IR threshold tables are only three-dimensional (θsat, wv, and Ts) over both land and sea. The actual applied thresholds (including static offsets) in addition depend on the surface characteristics (land or sea), whether a low-level temperature inversion is present, and so on.
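Conceptually, applying such a table reduces to a lookup in a three-dimensional grid. The sketch below uses nearest-node lookup for brevity (whether the operational code interpolates between nodes is not specified here; class and variable names are illustrative):

```python
import numpy as np

class ThresholdTable:
    """Nearest-node lookup in a 3-D threshold table indexed by
    satellite zenith angle, column-integrated water vapour, and
    surface temperature, as described in the text."""

    def __init__(self, sat_zen, wv, ts, table):
        self.axes = (np.asarray(sat_zen), np.asarray(wv), np.asarray(ts))
        self.table = np.asarray(table)     # shape (n_zen, n_wv, n_ts)

    def __call__(self, zen, wv, ts):
        # Index of the closest grid node along each of the three axes
        idx = tuple(int(np.abs(ax - q).argmin())
                    for ax, q in zip(self.axes, (zen, wv, ts)))
        return self.table[idx]
```

The static threshold offsets discussed above (surface type, inversion presence, and so on) would then be added to the value returned from the table.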
In Fig. 5 the T11 − T12 thresholds applied to NOAA-16 are displayed as a function of Ts for three different view angles. The additional third variable wv is fixed using its correlation with Ts as given by the TIGR dataset. (Small differences occur between sensors, but the general picture shown in Fig. 5 is applicable at least from NOAA-12 to -17.) The corresponding thresholds applied in APOLLO (see Saunders and Kriebel 1988) for cloud detection at midlatitudes are shown for reference. Even though the APOLLO thresholds are tabulated as a function of the cloud-free T11 rather than the Ts, the comparison does show that the use of the T11 − T12 feature is somewhat different between the two schemes. The APOLLO thresholds were derived using 117 tropical and midlatitude maritime atmospheres, and no distinction is made between water and land surfaces or the various satellite sensors. It can be seen from Fig. 5 that, in general, the dependence on surface temperature is stronger in APOLLO than for the NWCSAF cloud mask. The thresholds applied in APOLLO are too low over cold surfaces (and dry atmospheres) and are too high over warm surfaces (and humid atmospheres) as compared with the NWCSAF cloud mask. It is important to note here that the NWCSAF cloud mask furthermore employs a quality margin of 0.5 K (see section 3g).
For AVHRR channel 1, we simulated the cloud-free TOA reflectance using the “second simulation of the satellite signal in the solar spectrum” (6S) radiative transfer model developed by Vermote et al. (1997). We used the U.S. Standard Atmosphere, 1976 with variable water vapor content, and we used variable surface characteristics. Because of a lack of reliable and easily accessible sources of the aerosol content, we assumed it to be constant and used the distributions available in the 6S model. We also used the constant total ozone content provided by the U.S. Standard Atmosphere, 1976.
Over land, we used the BRDF derived by Roujean et al. (1992), assuming a mixture of open land and forest that was independent of the actual surface type. Again, the USGS land use was far too inaccurate to allow discrimination into different land surface classes. The sea surface BRDF is given by the statistical model of Cox and Munk (1954) by assuming isotropy and an average wind speed of 7 m s−1.
Thus, TOA cloud-free reflectances are tabulated as a function of viewing geometry (θsat, θsun, and δϕ) and water vapor content.
The cloud-mask test sequence
The algorithm consists of a sequence of grouped threshold tests. In a grouped threshold test, several of the features listed in Table 2 are tested together. For a grouped threshold test to be successful (decisive), all feature tests constituting the particular grouped test have to be successful for the given pixel.
Because the dynamic thresholds have been derived from RTM simulations of cloud-free land and ocean surfaces in the absence of snow cover (or sea ice), an identification and screening of the snow-covered cloud-free surface are attempted prior to any cloud-screening tests. The full set of grouped threshold tests used in the snow screening can be found in Dybbroe and Thoss (2004), and two examples are given in Table 3. All grouped threshold tests used in the cloud screening are provided in Tables 4, 5 and 6.
Some degree of fuzziness is introduced in the threshold method by keeping track of each feature’s distances to its thresholds. The test sequence is continued until a grouped threshold test is successful and all of its feature tests are passed with a safe margin. That is, all features in the grouped test must be sufficiently far away from the thresholds if the test is to be decisive. Table 8 presents the threshold margins used for each feature.
If the test is passed and all of the features have values far away (defined by the safe margins) from their corresponding thresholds, the output is set to 2 (snow/ice contaminated), 3 (cloud contaminated), or 4 (cloud filled), depending on the purpose of the test, and the test sequence is stopped. If the test is passed but one or more of the features have values close to their corresponding thresholds (i.e., for a positive threshold thr, thr − margin < value < thr + margin), the output is set to 2, 3, or 4 depending on the purpose of the test, a dedicated low-quality flag is set, and the test sequence is continued. If the test is not passed and at least one of the features has a value that is far beyond its threshold, the output is set to 1 (cloud free), the low-quality flag is removed (if set), and the test sequence is continued. If the test is not passed but all features have values close to their corresponding thresholds, the output is set to 1, the low-quality flag is set, and the test sequence is continued.
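The four-way decision logic above can be condensed into a small function; the sign convention and names are illustrative (a sketch, not the operational implementation), with per-feature margins corresponding to those of Table 8:

```python
def grouped_test(values, thresholds, margins, signs, label):
    """One grouped threshold test with fuzzy margin bookkeeping.

    For each feature, sign = +1 means the cloudy side lies above the
    threshold and -1 the opposite; margins are the per-feature safe
    margins.  Returns (output, low_quality, stop): output is 'label'
    (e.g. 2, 3, or 4) when the group passes and 1 (cloud free)
    otherwise, and the sequence stops only on a decisive pass."""
    dists = [s * (v - t) for v, t, s in zip(values, thresholds, signs)]
    if all(d > 0 for d in dists):                      # group passes
        if all(d >= m for d, m in zip(dists, margins)):
            return label, False, True                  # decisive: stop sequence
        return label, True, False                      # pass, but near threshold
    if any(d <= -m for d, m in zip(dists, margins)):
        return 1, False, False                         # confidently cloud free
    return 1, True, False                              # cloud free, low quality
```

A calling loop would run such grouped tests in sequence, keeping the latest output and quality flag, and terminate as soon as one test returns stop = True.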
The threshold algorithm is logically divided into a number of subalgorithms, depending on the actual situation, in terms of
illumination (day, twilight, and night),
geography (land, coast, and sea),
elevation (high or low terrain),
and presence of a low-level temperature inversion.
All of the above variables put special requirements on the algorithm. Different environments and illumination conditions require different feature tests and, sometimes, also different thresholds or threshold offsets.
We only consider low-level temperature inversions during night and twilight over land and coast, and only in low terrain. The presence or absence of temperature inversions is relevant only when testing the observed T11 against the forecast Ts. During day, other more effective features are available and, furthermore, temperature inversions are far less frequent.
Sunglint is mainly, but not exclusively, considered over the coast and sea. Because certain land surface types (e.g., wet soil and snow cover) display highly anisotropic scattering, as in sunglint, the Psunglint parameter is considered over land as well.
Potentially this gives 8 land, 13 coast, and 5 sea algorithms, resulting in a total of 26 different algorithms. However, in our implementation we distinguish only between 8 coast algorithms instead of 13, because we implicitly treat the sunglint problem within the 3 twilight and 2 day algorithms.
For example, the sequence of tests for daytime sea, listed in the order that they are applied, is as follows [each test results in either a cloud-contaminated (cc) or a cloud-filled (cf) pixel]: snow/ice sea—cf; cold cloud—cf (T11offset = −13 K); cold bright cloud—cf (T11offset = −7 K, R0.6offset = 0.15); bright cloud—cf; cold water cloud—cf; cold cloud—cc (T11offset = −7 K); thin cirrus secondary—cc; texture VIS—cc; texture IR combined—cc. The grouped threshold tests referred to are described in Tables 3, 4, 5, and 6. For details on the remaining 20 algorithms, see Dybbroe and Thoss (2004).
The cloud-type test sequence
The cloud-type classification employs a sequence of threshold tests in an if–else–if structure. In contrast to the cloud-mask algorithm, in which testing is continued unless the result is cloudy and all of the features tested have values far from the thresholds, the cloud-type algorithm stops if a threshold test is successful. The cloud-type test sequence is detailed in Dybbroe and Thoss (2004).
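Structurally, this is a first-match cascade, in contrast to the cloud mask's continue-on-uncertainty logic. In the sketch below, the predicates are purely illustrative placeholders, not the actual NWCSAF cloud-type tests:

```python
def classify_cloud_type(features, tests):
    """First-match cascade: the first threshold test whose predicate
    fires assigns the cloud type and testing stops, mirroring the
    if-else-if structure of the cloud-type algorithm."""
    for predicate, cloud_type in tests:
        if predicate(features):
            return cloud_type
    return "unclassified"

# Illustrative ordering: an earlier test shadows all later ones.
example_tests = [
    (lambda f: f["t11"] < 230.0, "opaque very high"),
    (lambda f: f["t11_t12"] > 2.5, "semitransparent thin"),
    (lambda f: f["t11"] < 270.0, "opaque medium"),
]
```

Because the first successful test wins, the ordering of the sequence itself encodes the priority among the 10 cloud categories.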
Summary and concluding remarks
In the framework of the EUMETSAT Satellite Application Facility projects, we have developed a set of new methods and software for cloud detection and classification at mid- and high latitudes using AVHRR data. The algorithms are based on a grouped threshold approach, making extensive use of RTM calculations to simulate the cloud-free signal in otherwise similar conditions.
This is dramatically different from the approach applied in the current operational cloud classification SCANDIA and its related cloud schemes developed in the late 1980s, such as APOLLO, LUX, and CLAVR, which are more empirically tuned and regionally dependent. Using RTM simulations, it is not only possible to get more optimized and applicable thresholds that are valid for the particular state of the atmosphere and surface at the pixel level, but the algorithms are also easily adapted to other AVHRR-like sensors (e.g., MODIS and the Visible/Infrared Imager Radiometer Suite on board the future National Polar-orbiting Operational Environmental Satellite System satellites) because the actual spectral response is implicitly taken into account.
Also, in contrast with the threshold schemes mentioned above, the NWCSAF AVHRR cloud mask is fuzzy in the sense that the distances in feature space to the thresholds are stored and used to determine whether to stop or to continue further testing. These uncertainty margins around the thresholds are also used to set a quality indicator of the final output. All decisive threshold tests have to have feature values sufficiently far away from the thresholds.
In addition to the satellite–sun viewing geometry and the land surface characteristics, the actual state of the atmosphere in terms of temperature and humidity that is valid for the individual pixel is used in the derivation of the thresholds. The atmospheric state is taken from a short-range NWP forecast in the case of nowcasting applications (NWCSAF), but for climate applications (CMSAF) this information may be taken from an appropriate numerical analysis such as is provided by the ECMWF 40-yr reanalysis dataset (Uppala 2001). To avoid or minimize any possible negative effects of large errors in the atmospheric profile caused by errors in the NWP-model cloud cover, we have chosen not to feed the RTM with the full NWP model profile in the generation of the dynamic thresholds. Instead, we have reduced the problem, describing the atmospheric state by only the surface temperature and the total column-integrated water vapor.
Much effort has been put into efficient cloud detection under the extreme conditions frequently occurring at high latitudes (such as low sun, snow cover, and sunglint) and in the presence of low-level temperature inversions. For this work, we have introduced some new features as compared with those of SCANDIA: a sunglint probability function and a low-level inversion indicator using the difference in temperature between the surface and 950 hPa.
Over the fairly homogeneous ocean surface (away from sea ice and sunglint), we apply texture features involving more than one channel in order to detect subpixel clouds with high confidence. This approach is not possible to the same extent over land and is currently not attempted, inevitably leaving subpixel clouds more frequently undetected over land. Also, the distinction between subpixel water clouds and thin cirrus is ambiguous with the algorithms described here.
Validation results are presented and discussed in Part II. Real-time cloud products are displayed online at http://www.smhi.se/saf.
Much of the work reported here has been sponsored by EUMETSAT through the Nowcasting and the Ocean and Sea Ice SAFs. Noëlle Scott of the Laboratoire de Météorologie Dynamique, Ecole Polytechnique, France, kindly provided the TIGR database of radiosonde profiles. Thanks are given to Otto Hyvärinen, Finnish Meteorological Institute, and Øystein Godøy, Norwegian Meteorological Institute (met.no), who both contributed to the early development of the algorithms during short-term stays as SAF visiting scientists at SMHI in 1998 and 1999, respectively. The visits were partly financed by EUMETSAT. In addition, the authors thank Colin Jones, SMHI, for comments and suggestions concerning this manuscript. The authors are particularly indebted to Marcel Derrien and Hervé Le Gléau at the Centre de Météorologie Spatiale, Lannion, Météo-France, for the fruitful cooperation during the early algorithm design and product definition.
Corresponding author address: Adam Dybbroe, Swedish Meteorological and Hydrological Institute, SE-60176 Norrköping, Sweden. email@example.com