The Satellite Application Facility on Climate Monitoring (CM-SAF) aims to retrieve satellite-derived geophysical parameters suitable for climate monitoring. CM-SAF started routine operations in early 2007 and provides a climatology of parameters describing the global energy and water cycle on a regional scale and partially on a global scale. Here, the authors focus on the performance of cloud detection methods applied to measurements of the Spinning Enhanced Visible and Infrared Imager (SEVIRI) on the first Meteosat Second Generation geostationary spacecraft. The retrieved cloud mask is the basis for calculating the cloud fractional coverage (CFC) but is also mandatory for retrieving other geophysical parameters. Therefore, the quality of the cloud detection directly influences climate monitoring of many other parameters derived from spaceborne sensors. CM-SAF products and results of an alternative cloud coverage retrieval provided by the Institut für Weltraumwissenschaften of the Freie Universität in Berlin, Germany (FUB), were validated against synoptic measurements. Furthermore, on the basis of case studies, CM-SAF results were initially compared with results derived from the Moderate Resolution Imaging Spectroradiometer (MODIS) and from the Cloud–Aerosol Lidar with Orthogonal Polarization (CALIOP). Results show that the CFC from CM-SAF and FUB agrees well with synoptic data and MODIS data over midlatitudes but is underestimated over the tropics and overestimated toward the edges of the visible Earth disk.
The Satellite Application Facility on Climate Monitoring (CM-SAF) is part of the European Organisation for the Exploitation of Meteorological Satellites’ (EUMETSAT) SAF network, which comprises eight SAFs dealing with different thematic foci in the field of meteorology and climatology. The SAF network is a “network of networks,” dedicated to tackling the tasks and challenges in the field of meteorology and climatology using satellite data as the main input. Within this network, the aim of CM-SAF is to provide data supporting a better understanding of the climate system (Schulz et al. 2008).
Satellite observations of atmospheric parameters usually require a proper detection of clouds first. The success or efficiency of cloud detection depends on the sensor characteristics (temporal and spatial resolution, spectral bands), the viewing geometry, and surface reflectivity and emissivity. Error propagation of the cloud detection affects other geophysical parameters and it is therefore of vital interest to know the quality of such detection algorithms (Stephens and Kummerow 2007). The cloud fractional coverage (CFC) relies on the cloud detection results and can be used for climate monitoring, although further impact factors (long-term stability of sensors, sensor calibration, intercalibration of sensors of the same type) need to be taken into account before homogeneous data series can be generated.
While the CM-SAF has a long-term commitment for providing relevant datasets, especially for climate monitoring, the aim of the Institut für Weltraumwissenschaften at the Freie Universität Berlin (FUB; in Germany) is to support near-real-time applications and to improve the retrieval of cloud properties from existing and future satellite sensors. In this paper, results from a complementary algorithm approach of FUB are shown in addition to CM-SAF results.
Although accuracy and precision of satellite-based time series might be lower than existing and corresponding datasets derived from ground-based measurements, satellite data are of high importance in those areas where ground-based measurements are not available. Dedicated effort is needed to generate homogeneous, stable, and accurate datasets with high spatial resolution from recent, current, and future satellite sensors. Then, such time series of satellite-derived quantities can, for example, be used for trend studies and for the detection of climate changes (Ohring et al. 2005).
CM-SAF concentrates on parameters that describe the global energy and water cycle. These are cloud parameters, radiative fluxes at the ground and at the top of the atmosphere, and humidity parameters. CM-SAF exploits data of the satellite series of the U.S. National Oceanic and Atmospheric Administration (NOAA) and EUMETSAT, with the following sensors: Advanced Microwave Sounding Unit (AMSU), Advanced Television and Infrared Observation Satellite (TIROS) Operational Vertical Sounder (ATOVS), Advanced Very High Resolution Radiometer (AVHRR), and the Geostationary Earth Radiation Budget (GERB) (Harries et al. 2005) and Spinning Enhanced Visible and Infrared Imager (SEVIRI) radiometers (Schmetz et al. 2002), respectively. The focus of CM-SAF is on the generation of climate data records, although previous and ongoing work also deals with the generation of environmental data records, which are less demanding in terms of precision and accuracy [see Colton et al. (2003) for a detailed definition]. Therefore, most of the products are available as daily and monthly averages rather than on an individual orbit or time slot basis.
The first attempts to generate long-term data series, especially of cloud parameters derived from satellite measurements (AVHRR and Meteosat), go back to the early 1980s when the International Satellite Cloud Climatology Project (ISCCP) started its work (Rossow and Garder 1993). Other precursor datasets are, for example, the AVHRR Pathfinder Atmosphere (PATMOS) dataset (Jacobowitz et al. 2003), the Swedish Meteorological and Hydrological Institute (SMHI) Cloud Analysis Model using Digital AVHRR Data (SCANDIA) cloud climatology (Karlsson 2003) over Scandinavia, and the European Cloud Climatology (ECC; Meerkötter et al. 2004), which were all derived from AVHRR observations. SCANDIA has already been used to elucidate possible weaknesses of cloud parameterizations in regional climate models (Karlsson et al. 2008). Our study provides the validation of two SEVIRI cloud detection algorithms with ground-based synoptic data. Additionally, we performed an initial comparison of results from other satellite observations.
3. SEVIRI instrument and data
SEVIRI is a spin-stabilized radiometer in geostationary orbit with 12 narrow-band spectral channels in the visible, near-infrared, and thermal-infrared part of the spectrum [see, e.g., Schmetz et al. (2002) for a list of spectral characteristics of channels]. The ground pixel size of SEVIRI channels 1–11 is about 3 km × 3 km at the subsatellite point and increases to more than 3 km × 5 km over Europe. The nominal temporal resolution of measurements is 15 min. The visible Earth disk as seen by SEVIRI is contained in 3712 × 3712 single pixels. We analyzed SEVIRI level-1.5 data (High Rate Information Transmission formatted) from Meteosat-8. The data stream contains actual calibration coefficients for thermal channels and is calibrated, radiometrically corrected, and georeferenced (EUMETSAT 2006).
The CM-SAF cloud fractional cover and other cloud parameters are provided at a spatial resolution of 15 × 15 km² on a sinusoidal projection. Cloud coverage is derived from cloud detection results on a pixel-by-pixel basis. The pixel-based cloud detection is also used for generating a cloud mask, which is required for other cloud parameter retrieval algorithms. Therefore, we focused on both the validation of CFC and the validation of the cloud detection scheme on a pixel basis. We analyzed only hourly SEVIRI data (of the year 2006) because of storage capacity limitations. It was decided prior to routine operations to store slots at 45 min past the hour to provide those satellite observations closest to standard synoptic observation times in Europe. Most synoptic observations are either made hourly or in 3–6-h intervals. SEVIRI starts each 15-min scanning sequence at the South Pole so that local observation times over Europe are close to the hour.
A typical set of cloud parameters that can be derived from imager data such as AVHRR and SEVIRI comprises cloud fractional cover, cloud-top height, cloud-top temperature and pressure, cloud type, cloud-top albedo, cloud optical thickness, cloud liquid water path, and cloud phase. A number of well-established operational retrieval methods of cloud parameters from various satellite sensors are discussed in the literature. Kriebel et al. (2003) describe the latest improvements of the AVHRR processing scheme over clouds, land, and ocean (APOLLO), which was introduced by Saunders and Kriebel (1988). A more recent article by Dybbroe et al. (2005) presents the retrieval scheme that is used to process NOAA/AVHRR data in both the Satellite Application Facilities on support to Nowcasting and Very Short-Range Forecasting (NWC-SAF) and Climate Monitoring (Schulz et al. 2008). An overview of cloud parameter retrievals from MODIS data is given in Platnick et al. (2003). The principles of the algorithms applied to SEVIRI can be found in Derrien et al. (1993) and more recently in Derrien and LeGléau (2005). Typically, these methods allow the retrieval of cloud parameters during both day and night using the visible, near-infrared, and infrared part of the spectrum. Furthermore, the first attempts to derive cloud parameters from polarization-sensitive sensors were made in the late 1990s (Buriez et al. 1997).
a. CM-SAF method
We used the algorithm package MSGv1.2 developed by the French Meteorological Service Météo-France within the framework of the NWC-SAF. The cloud detection algorithm is based on a multispectral threshold technique where thresholds are scene-dependent and dynamically adjusted. The method exploits SEVIRI observations of visible and infrared channels during daylight while only infrared channels can be used under nighttime conditions. Thresholds are based on precalculated radiative transfer simulations stored in lookup tables. The series of tests to be passed allows for separation of clear-sky (no snow/ice), clear-sky (with snow/ice), cloudy, partially cloudy (including also semitransparent clouds), and unclassified pixels (where all tests failed). Subsequently, the final cloud mask is generated [see Derrien and LeGléau (2005) for more details]. Essential further input parameters are geographical data (land use, topography, etc.) and numerical weather prediction (NWP) analyses at 3-h temporal resolution. We take the latter from the “GME” model (Majewski et al. 2002) at a horizontal resolution of about 40 km and with a vertical representation using 40 atmospheric layers between the ground and the topmost layer at 10 hPa.
b. FUB cloud detection method
The application of neural networks (NNs) is an established approach to derive information from satellite observations in an efficient and precise manner (Loyola 2006). The FUB cloud detection scheme is based on the analysis of spectral and temporal information from SEVIRI observations by such artificial neural networks. In particular, the assumed clear-sky brightness temperature (ACSBTE) of the 10.8-μm channel, as estimated from analyses of its temporal evolution, is a central input parameter to the established neural networks. The ACSBTE estimation method is self-organizing and does not depend on any auxiliary data; in particular, it is independent of NWP model data. It is based on assumptions regarding the smoothness of the surface temperature diurnal cycles, their possibility to change with time, and the fact that clouds generally appear colder than the underlying surface in the 10.8-μm channel. The accuracy of ACSBTE values has been found to be ±3.3 K (Reuter 2005). Other NN input parameters are viewing and solar geometry information and measurements of the SEVIRI channels at 13.4, 12.0, 10.8, 8.7, 3.9, 1.6, 0.8, and 0.6 μm. Several networks depending on the complexity of input parameters (e.g., for daytime and nighttime schemes) were established, all of which are based on a multilayer perceptron architecture with one hidden layer of 25 or 20 neurons. The necessary NN training and test datasets were created by manual classifications of clear-sky and cloudy pixels. Simulations with a radiative transfer model (Rathke and Fischer 2000) were further used to ensure the physical relevance of the NN input parameters and to determine the sensitivity of the neural network with respect to different cloud types.
The output of the network represents the probability of cloud coverage (between 0 and 1) at pixel level according to the test and training dataset. This means that 90% of all elements of the training and test datasets resulting in an NN output of 0.9 have in fact been (manually) classified as cloudy. Assuming a representative and error-free training and test dataset, the NN output can then be interpreted as a mathematical probability that a satellite pixel is cloud covered.
A binary cloud mask can then be created by applying a threshold method. Here, we applied a threshold of 0.5, classifying each pixel as either being cloud free or cloud covered. All FUB cloud fraction results shown in the following were derived from the initial cloud detection output instead of the binary cloud mask. The binary cloud mask was solely used for the calculation of the skill scores (see further explanation in section 6a). More details about the FUB cloud detection scheme can be found in Reuter (2005).
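The difference between the two FUB quantities described above can be illustrated with a minimal sketch. The function names and the sample probabilities are hypothetical, and treating an output of exactly 0.5 as cloudy is an assumption, since the text only states the threshold value.

```python
import numpy as np

def binary_cloud_mask(prob, threshold=0.5):
    """Return True where the NN cloud probability reaches the threshold.

    Assumption: a value exactly equal to the threshold counts as cloudy.
    """
    prob = np.asarray(prob, dtype=float)
    return prob >= threshold

def mean_cloud_probability(prob):
    """Cloud fraction taken directly from the probabilities (no thresholding)."""
    prob = np.asarray(prob, dtype=float)
    return float(prob.mean())

# Hypothetical NN outputs for four pixels.
probs = np.array([0.10, 0.40, 0.45, 0.90])
mask = binary_cloud_mask(probs)              # used only for skill scores
cfc_from_probs = mean_cloud_probability(probs)
cfc_from_mask = float(mask.mean())
```

With these sample values the two cloud fractions differ noticeably, which shows why the choice between the probability output and the binary mask matters for the reported bias.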
5. Validation concept and datasets
The goal of the study is to provide validation results for 1 yr (here 2006), which allows one to investigate cloud coverage results under all geophysical scenarios and for all seasons. A major limiting factor in validating satellite-based cloud parameters is the availability of independent measurements of the same parameter with a suitable temporal and spatial distribution. To partly compensate for the lack of ground-based validation data over sea surfaces and over land surfaces outside Europe, we included results of two satellite–satellite intercomparison studies, involving data from the Moderate Resolution Imaging Spectroradiometer (MODIS) on board the Terra and Aqua spacecraft and Cloud–Aerosol Lidar with Orthogonal Polarization (CALIOP) on board the Cloud–Aerosol Lidar and Infrared Pathfinder Satellite Observation (CALIPSO) platform.
a. Regional distribution of synoptic stations
We used only manually operated land stations and ship observations but excluded all automated ceilometer measurements (e.g., buoy measurements). Ceilometers have a very narrow opening angle and therefore allow only the observation of a small part of the visible half-sphere above their position. Even when averaging over one hour of measurements, results are only representative for a narrow band depending on the prevailing wind direction. In contrast to this, synoptic observations are only limited by the range of vision, usually between 30 and 50 km (Henderson-Sellers et al. 1987). This becomes important when comparing cloud coverage measurements to satellite pixels with diameters of at least 3 km, where homogeneous cloudiness within a pixel cannot be presumed.
The geographical distribution of synoptic stations on the Meteosat disk is depicted in Fig. 1 for squares covering 232 × 232 SEVIRI pixels (∼700 km on a side at nadir). We use these boxes only for purposes of clarity of figures. The entire MSG disk can then be represented by 16 × 16 such squares. As expected, the majority of stations are located in northern midlatitudes over land, while there is a lack of ground-based measurements over water surfaces, in large parts of Africa, and in the visible parts of eastern South America.
b. Preparation of validation datasets
We compared instantaneous results (i.e., results from individual observations or slots) and the CM-SAF daily and monthly mean CFC products with synoptic data. Note that instantaneous results are derived on a pixel basis. Therefore, their spatial resolution and the associated geographic coverage are determined by the SEVIRI viewing geometry. Daily and monthly mean results are, however, provided on the above-mentioned CM-SAF grid of 15 × 15 km².
While instantaneous results can be compared relatively easily by spatiotemporal collocation, more effort is required for averaged quantities. The CM-SAF standard products cannot be used because corresponding synoptic measurements are not available. We therefore first needed to calculate daily and monthly means for those satellite pixels and ground-based measurements that are close in time and space. The daily collocation datasets were subsequently used to generate monthly mean products. All validation datasets were generated at the same spatial resolution as the CM-SAF standard product, which is 15 × 15 km² in sinusoidal projection. Our comparison of average products is based on synoptic records from those stations reporting at least six measurements per day on at least 20 days per month. As a consequence, ship measurements were mostly filtered out. A further restriction was the availability of cloud fractional cover results from both satellite retrievals (as described in section 4).
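The data density requirement stated above (at least six measurements per day on at least 20 days per month) can be sketched as a simple filter. The record layout, a tuple of station identifier and date components per observation, is a hypothetical choice made for illustration only.

```python
from collections import defaultdict

def qualifying_station_months(records, min_obs_per_day=6, min_days_per_month=20):
    """Return the set of (station, year, month) keys that meet the density rule.

    records: iterable of (station_id, year, month, day) tuples, one per
    synoptic observation (hypothetical layout).
    """
    obs_per_day = defaultdict(int)
    for station, year, month, day in records:
        obs_per_day[(station, year, month, day)] += 1

    # Count the days on which a station delivered enough observations.
    good_days = defaultdict(int)
    for (station, year, month, _day), count in obs_per_day.items():
        if count >= min_obs_per_day:
            good_days[(station, year, month)] += 1

    return {key for key, days in good_days.items() if days >= min_days_per_month}
```

A station reporting only a few times per day, as is typical for ships, never accumulates qualifying days and is therefore dropped, consistent with the remark that ship measurements were mostly filtered out.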
When synoptic observations are compared to satellite measurements, the strongly differing observation techniques have to be considered properly, as noted, for example, by Henderson-Sellers et al. (1987). While the SEVIRI satellite instrument is downward looking with a footprint size between 3 and 15 km (edge length), depending on the observation zenith angle, the ground-based observer is upward looking, reporting one observation representative of the visible part of the sky at the observer's position. Synoptic observations provide the cloud fraction in octa (1 octa = 1/8). Here, 0 octa stands for clear sky, while 8 octa stands for completely overcast sky. Observation rules are, however, such that as soon as a cloud is visible, 1 octa has to be reported, while as soon as the sky is visible between clouds, at most 7 octa are reported. Between 2 and 6 octa, the observation should represent the actual cloud fraction above the synoptic station. Synoptic observations are made at fixed times of the day at the stations of the national synoptic observation networks. Early investigations by Mohr (1971) showed that synoptic cloud cover estimates deviate from linearity, and their quality differs between daytime and nighttime (Hahn et al. 1995).
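The reporting rules described above can be illustrated with a short sketch. The function names are hypothetical, and rounding to the nearest octa for intermediate values is an assumption; the text only fixes the behavior near the clear and overcast ends of the scale.

```python
def reported_octa(true_fraction):
    """Map a true sky cover fraction (0..1) to the octa value an observer reports.

    Rule from the text: any visible cloud gives at least 1 octa, and any
    visible gap gives at most 7 octa. Nearest-octa rounding in between is
    an assumption made for this sketch.
    """
    if not 0.0 <= true_fraction <= 1.0:
        raise ValueError("fraction must lie between 0 and 1")
    octa = round(true_fraction * 8)
    if true_fraction > 0.0:
        octa = max(octa, 1)   # a single cloud already forces 1 octa
    if true_fraction < 1.0:
        octa = min(octa, 7)   # a single gap caps the report at 7 octa
    return octa

def octa_to_fraction(octa):
    """Convert a reported octa value (0..8) to a cloud fraction."""
    if octa not in range(9):
        raise ValueError("octa must be an integer between 0 and 8")
    return octa / 8.0
```

For example, a sky that is 3% cloud covered is reported as 1 octa (a fraction of 0.125), which is one reason why synoptic reports are not a linear measure of cloud cover.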
Furthermore, CM-SAF and FUB provide different quantities: the CM-SAF cloud coverage results (at the 15-km resolution) are calculated from individual cloud mask images and the individual cloud mask value for each pixel can be clear sky, cloudy, fractional cloudy, snow/ice (on the ground), and unprocessed. The binary cloud mask then classifies all clear-sky and snow/ice-covered pixels as “clear” and all others as “cloudy.” The output of the FUB cloud detection algorithm at pixel level, however, represents the probability that a SEVIRI pixel is cloud covered. As stated in section 4b, the final cloud mask is then generated from this probability by applying a threshold of 0.5 that separates clear and cloudy pixels.
c. Statistical quantities
The Kuiper skill score (KSS; Hanssen and Kuipers 1965) measures the probability that a predicted event occurs, relative to its occurrence by chance. Here, we apply it to satellite measurements (the predicted value) and ground-based observations (the synoptic data). Note especially that the Kuiper skill score refers to single pixel-based observations. The aim is to determine to what extent
each cloud-free case (synop) is detected as cloud free (sat),
each satellite pixel classified as cloud free is cloud free (synop),
each cloud-covered case (synop) is detected as cloudy (sat), and
each satellite pixel classified as cloudy is cloud covered (synop).
We use a contingency matrix (Table 1) that contains the number of observations derived from synop–sat being cloud free–cloud free, cloud free–cloudy, cloudy–cloud free, and cloudy–cloudy.
Note that the contingency matrix contains only results from unambiguous synoptic observations, that is, reports of 0, 1, 7, and 8 octa. Then, using a, b, c, and d from Table 1, the Kuiper skill score is calculated as follows:
$$\mathrm{KSS} = \frac{ad - bc}{(a + b)(c + d)}, \tag{1}$$
while the hit rate, which measures the proportion of correct measurements (predictions), reads
$$\text{hit rate} = \frac{a + d}{a + b + c + d}. \tag{2}$$
In contrast to the hit rate, the KSS is well suited to estimate the skill of quantities that are not symmetrically distributed, as in, for example, the cloud fractional cover (cloudy scenes occur more frequently than cloud-free scenes). We further define quantities based on the entries of the contingency matrix as follows:
Ncf,cc = number of all synoptic reports with 0, 1, 7, or 8 octas,
P(cfsa|cfsy) = a/(a + b); the conditional probability of the satellite cloud detection classifying a scene as cloud free, given a cloud-free synop report,
P(ccsa|ccsy) = d/(c + d); the conditional probability of the satellite cloud detection classifying a scene as cloud covered, given a cloud-covered synop report,
P(cfsy|cfsa) = a/(a + c); the conditional probability of a cloud-free synop report, given a cloud-free satellite classification,
P(ccsy|ccsa) = d/(b + d); the conditional probability of a cloud-covered synop report, given a cloud-covered satellite classification,
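$\overline{\mathrm{CFC}}_{sy}$ = average CFC of all synoptic records (including 2, 3, 4, 5, and 6 octas), and analogously $\overline{\mathrm{CFC}}_{sa}$ for the satellite results, and
bias = $\overline{\mathrm{CFC}}_{sa} - \overline{\mathrm{CFC}}_{sy}$.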
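The quantities defined in the list above can be computed from the four contingency entries as in the following sketch. It assumes the usual form of the Hanssen–Kuipers score, KSS = (ad − bc)/[(a + b)(c + d)], which equals P(cfsa|cfsy) + P(ccsa|ccsy) − 1; the class and function names and the sample counts are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Contingency:
    a: int  # synop cloud free, satellite cloud free
    b: int  # synop cloud free, satellite cloudy
    c: int  # synop cloudy,     satellite cloud free
    d: int  # synop cloudy,     satellite cloudy

def kuiper_skill_score(t):
    """KSS in its usual form (assumed here): (ad - bc) / ((a + b)(c + d))."""
    return (t.a * t.d - t.b * t.c) / ((t.a + t.b) * (t.c + t.d))

def hit_rate(t):
    """Proportion of cases in which satellite and synop agree."""
    return (t.a + t.d) / (t.a + t.b + t.c + t.d)

def conditional_probabilities(t):
    """The four conditional probabilities defined in the text."""
    return {
        "P(cfsa|cfsy)": t.a / (t.a + t.b),
        "P(ccsa|ccsy)": t.d / (t.c + t.d),
        "P(cfsy|cfsa)": t.a / (t.a + t.c),
        "P(ccsy|ccsa)": t.d / (t.b + t.d),
    }

# Hypothetical counts, for illustration only.
table = Contingency(a=40, b=10, c=5, d=45)
kss = kuiper_skill_score(table)
hr = hit_rate(table)
probs = conditional_probabilities(table)
```

In this example the hit rate exceeds the KSS, because the hit rate also rewards agreement that would occur by chance.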
The standard deviation σ (here, the bias-corrected root-mean-square error) is calculated using
$$\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\mathrm{CFC}_{sa,i} - \mathrm{CFC}_{sy,i} - \mathrm{bias}\right)^{2}}, \tag{3}$$
where N is the number of observations. Thus, the bias and the standard deviation are determined using all records, while the Kuiper skill score and all other statistical parameters defined above rely only on cloudy and cloud-free pixels. Instantaneous results are mainly assessed by the KSS and the bias, while averaged products are described by the standard deviation and the bias.
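A minimal sketch of the bias and the bias-corrected root-mean-square error for collocated pairs follows. The bias is taken as satellite minus synop, consistent with a positive bias denoting satellite overestimation; the 1/N normalization and the function name are assumptions, and the sample values are invented.

```python
import math

def bias_and_sigma(cfc_sat, cfc_syn):
    """Return (bias, sigma) for collocated satellite and synoptic cloud fractions.

    bias  = mean(sat - syn)
    sigma = sqrt(mean((sat - syn - bias)^2)); 1/N normalization is assumed.
    """
    if len(cfc_sat) != len(cfc_syn) or not cfc_sat:
        raise ValueError("inputs must be non-empty and of equal length")
    diffs = [s - g for s, g in zip(cfc_sat, cfc_syn)]
    n = len(diffs)
    bias = sum(diffs) / n
    sigma = math.sqrt(sum((d - bias) ** 2 for d in diffs) / n)
    return bias, sigma

# Hypothetical collocated cloud fractions (octa-based values for the synop side).
sat = [0.50, 1.00, 0.25, 0.75]
syn = [0.50, 0.75, 0.25, 0.50]
bias, sigma = bias_and_sigma(sat, syn)
```

Unlike the skill scores, this calculation uses every collocated record, including synoptic reports between 2 and 6 octa.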
d. Calculation of the cloud fraction
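The CM-SAF average cloud fraction (daily and monthly mean) at the target resolution (15 km) above a synoptic station was calculated from the results of the cloud detection at SEVIRI pixel level using
$$\mathrm{CFC} = \frac{N_{cloudy} + f_{frac}\,N_{frac}}{N_{clear} + N_{frac} + N_{cloudy}}, \tag{4}$$
where $N_{cloudy}$, $N_{frac}$, and $N_{clear}$ are the numbers of cloudy, partially cloudy, and clear (including snow/ice covered) SEVIRI pixels, respectively.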
The factor ffrac in Eq. (4) controls the quantitative contribution of partially cloudy pixels to the total SEVIRI cloud coverage. It acts like a tuning or bias correction factor. A recent analysis of SEVIRI cloud fractional cover results and synoptic records over Europe showed that the bias is minimized if the cloudiness of partially cloudy SEVIRI pixels ( ffrac) is assumed to be 0.75 (Karlsson et al. 2007). We conclude that these ambiguously classified pixels seem to be dominantly cloud covered. On the other hand, the tuning factor depends also on the synoptic measurements involved. Therefore, we did not apply such tuning to the results discussed in the following section 6a. Instead, partially cloudy pixels were counted as being fully cloud covered ( ffrac = 1), although this is equivalent to a systematic overestimation of CM-SAF CFC results. It is, however, motivated by the fact that the SEVIRI class of partially cloudy pixels also contains semitransparent, optically thin ice clouds, which is due to a corresponding BT12.0–BT10.8-μm threshold test that is especially sensitive to such clouds. If, however, a climate data record (Colton et al. 2003) of cloud parameters shall be derived from the satellite observations, such inherent bias must be considered for further applications (e.g., trend studies; Ohring et al. 2005).
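The effect of the factor ffrac can be illustrated with a short sketch. It assumes that the cloud fraction is the number of cloudy pixels plus ffrac times the number of partially cloudy pixels, divided by the number of all classified pixels; the function name and the pixel counts are hypothetical.

```python
def cloud_fraction(n_clear, n_frac, n_cloudy, f_frac=1.0):
    """Cloud fraction of a grid cell from per-class SEVIRI pixel counts.

    Assumed form: (n_cloudy + f_frac * n_frac) / (n_clear + n_frac + n_cloudy),
    where n_clear includes snow/ice-covered clear pixels.
    """
    total = n_clear + n_frac + n_cloudy
    if total == 0:
        raise ValueError("no classified pixels in the grid cell")
    return (n_cloudy + f_frac * n_frac) / total

# Hypothetical 5 x 5 pixel cell (roughly 15 km on a side at nadir).
cfc_untuned = cloud_fraction(n_clear=10, n_frac=4, n_cloudy=11, f_frac=1.0)
cfc_tuned = cloud_fraction(n_clear=10, n_frac=4, n_cloudy=11, f_frac=0.75)
```

With these counts, counting partially cloudy pixels as fully cloudy raises the cloud fraction from 0.56 to 0.60, which illustrates the systematic overestimation mentioned above.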
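The calculation of FUB skill scores relies on the binary cloud mask, while the bias (relative to synoptic measurements) is derived using the initial cloud detection results (probabilities):
$$\mathrm{CFC}_{FUB} = \frac{1}{N}\sum_{i=1}^{N} p_{i}, \tag{5}$$
where $p_{i}$ is the NN output for pixel i and N is the number of pixels considered.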
We provide the analysis of CFC results for all synoptic stations. Furthermore, we give detailed results for land and sea stations, stations along the coastlines and in mountainous terrain, daytime and nighttime measurements, and observations under twilight conditions. This allows a better view of the performance of the satellite measurements for different geophysical scenarios. The categories “day,” “twilight,” and “night” are separated by means of ranges of the sun zenith angle (sza): sza(day) < 85°, 85° < sza(twilight) < 90°, 90° < sza(night). “Mountain” stations are higher than 2000 m MSL (to group stations that are frequently surrounded by snow-covered surfaces). The class “coast” is defined for synoptic stations and pixels closer than 3 km to the coastline (∼1 SEVIRI pixel).
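The illumination categories can be expressed as a small classifier. Because the stated ranges use strict inequalities, the assignment of exactly 85° to twilight and exactly 90° to night is an assumption of this sketch, as is the function name.

```python
def illumination_class(sza_deg):
    """Classify an observation by sun zenith angle (degrees).

    day: sza < 85, twilight: 85 <= sza < 90, night: sza >= 90.
    The handling of the exact boundary values is an assumption.
    """
    if not 0.0 <= sza_deg <= 180.0:
        raise ValueError("sun zenith angle must lie between 0 and 180 degrees")
    if sza_deg < 85.0:
        return "day"
    if sza_deg < 90.0:
        return "twilight"
    return "night"
```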
a. Validation against synoptic data
1) Validation of instantaneous SEVIRI results
First, we compare results of single SEVIRI pixels with corresponding synoptic measurements (i.e., results of the cloud detection algorithms). The comparison involves those SEVIRI pixels being closest in space and time to synoptic observations.
Detailed results for different underlying surfaces and for daytime and nighttime measurements can be found in Table 2. Particular attention should be paid to the row labeled “weighted,” which contains results of an analysis that attempts to compensate for the temporally and spatially unbalanced distribution of synoptic stations. There are many more stations over land than over sea, station density differs strongly across the visible disk (many stations in Europe, but few elsewhere), and more measurements are made during daytime. Simply put, the method assigns a higher weight to underrepresented stations when calculating the statistics. On the one hand, these results are then more representative of the full disk, as they would be if stations were equally distributed in the spatial domain. On the other hand, only a few stations over, for example, water surfaces then represent large areas, which may cause large uncertainties in the CFC results.
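The text does not specify the weighting scheme, so the following is only one plausible illustration: each observation is weighted by the inverse of the number of observations in its group, so that every group contributes equally to the mean. The grouping key (for example, surface type or grid box), the function name, and the sample values are all hypothetical.

```python
from collections import Counter

def group_balanced_mean(values, groups):
    """Mean in which every group carries the same total weight.

    Each value is weighted by 1 / (number of values in its group). This
    inverse-frequency scheme is an assumption, not the documented method.
    """
    if len(values) != len(groups) or not values:
        raise ValueError("inputs must be non-empty and of equal length")
    counts = Counter(groups)
    weights = [1.0 / counts[g] for g in groups]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

# Hypothetical per-observation differences (satellite minus synop).
diffs = [0.0, 0.0, 0.0, 0.0, 0.2]
surface = ["land", "land", "land", "land", "sea"]
plain_mean = sum(diffs) / len(diffs)
balanced_mean = group_balanced_mean(diffs, surface)
```

Here a single sea observation determines half of the balanced mean, which mirrors the caveat that a few stations may end up representing large areas.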
The agreement of CM-SAF CFC and synoptic observations is reasonably good. The bias is around 0.05 (scenario “overall”) and the Kuiper skill score is about 0.78. False clouds are indicated if entries of P(ccsy|ccsa) are lower than entries of column P(ccsa|ccsy), whereas false clear pixels are apparent if entries of P(cfsy|cfsa) are lower than entries of column P(cfsa|cfsy).
Results in column P(cfsa|cfsy) show that satellite observations over land agree more frequently (0.86) with cloud-free synoptic observations than over sea (0.63). Correspondingly, we observe an overestimation of the SEVIRI cloud fractional cover over sea, where the bias is more than 3 times higher than for observations over land surfaces (0.13 versus 0.04).
As expected, daytime measurements agree better than nighttime measurements because of the additional information from the SEVIRI solar channels and the reduced occurrence of temperature inversions in the boundary layer. Such inversions hamper the detection of clouds during the night (e.g., the probability that the satellite hits a cloud-free synop report is 0.88 for day but 0.81 for night). Note, however, that under certain conditions, nighttime synoptic measurements are also of lower quality: semitransparent cirrus clouds then often remain undetected by ground-based observers (Hahn et al. 1995). The performance under twilight conditions is good, although the small negative bias of −0.02 indicates some increase in the number of undetected clouds. Records of stations in mountainous terrain are of acceptable quality (bias = 0.05). Here, false clouds are detected and the satellite measurements overestimate the “true” cloudiness. This overestimation may partly be a result of inadequate NWP surface temperature data, which typically do not resolve the strong orographic structure of mountainous terrain. Another possible reason is the misclassification of pixels in snow-covered scenes, although this is less critical for SEVIRI because of its 3.9-μm channel. Validation results for coastal stations and over mountainous terrain, although satisfactory, may be biased by the geographical distribution of stations, and thus by the viewing geometry. The European area, where the majority of such stations are located, is always seen under moderately high observation zenith angles.
Weighted results are not as good as overall results (lowest KSS of only 0.66), which is a hint that the cluster of reliable stations over land in the northern midlatitudes is the driver of the good performance of CFC results. This can be seen also in Fig. 2 (left) where the spatial distribution of the bias of CFC retrievals is shown. The best performance is found for Europe and the Mediterranean and North and South African regions (bias low, Kuiper skill scores high; Fig. 2, right), while the positive bias increases with the satellite observation angle, thus toward the edges. A negative bias is found for the tropical regions and over adjacent sea surfaces. This does not necessarily mean that the SEVIRI retrieval is less accurate here. Also, the quality of synoptic measurements might be lower in the tropics. Surface observations in cases of dominantly convective cloud cover may tend to overestimate the cloudiness because cloud-free scenes in between vertically extending clouds are not seen by the observer at a slanted view while the small satellite viewing angles increase the chance to find cloud-free spots (Mohr 1971). On the other hand, the SEVIRI retrieval initially was not designed for the moisture-loaded tropical atmospheres and additional tuning of threshold settings may partly resolve the observed bias. Further investigation is required to fully explain the large differences in this region.
The corresponding results of the FUB retrieval are different (Table 3): here, the bias is negative for most scenarios (satellite observations underestimate the synoptic measurements) and the bias only moderately increases toward the edges of the full disk where a slight overestimation of CFC is observed (Fig. 3). The overall bias is low, because of the large number of land stations without a significant bias. The weighted scenario, however, shows a distinct negative bias. Skill scores are reasonably high but the performance over semiarid areas in the northern tropical belt is similarly poor (Sahel zone and bordering Saharan Desert). Skill scores are lower than corresponding CM-SAF results under land, night, and twilight conditions, but are higher over sea surfaces. The overall bias is close to zero while the weighted scenario shows a negative bias of about −0.07 (+0.05 for CM-SAF).
Further analysis of CM-SAF CFC results as a function of the satellite zenith angle and the mean local time of observation is presented in Fig. 4. The bias is negative for near-nadir viewing but increases almost continuously to positive values for high observation zenith angles (Fig. 4, left). The Kuiper skill score reaches a maximum between satellite zenith angles of 35° and 70°. Note also the parallel increase of the KSS and the number of synoptic observations, which is due to the large number of reliable synoptic observations over European land surfaces. Furthermore, we observe a weakly pronounced diurnal cycle of the bias, but values are not symmetrically distributed around noon (Fig. 4, right). There is a minimum during the morning hours but the bias always remains slightly positive. The corresponding Kuiper skill score reaches a maximum during the morning hours and remains high during daylight hours but decreases for nighttime observations. The number of synoptic observations is twice as high during daylight hours as during the night.
Corresponding results of the FUB retrieval are also shown in Fig. 4. The course of the FUB bias and KSS as functions of the observation zenith angle is similar to the CM-SAF results, although the KSS is slightly lower and the bias larger for near-nadir viewing conditions. The variation of the bias is, however, less pronounced and is positive but low for high observation zenith angles (Fig. 4, left). The daily variations of FUB bias and skill scores (Fig. 4, right) are larger than for CM-SAF. A positive bias during nighttime hours is followed by a negative bias around noon while the Kuiper skill score peaks around noon. The time period with large skill scores is generally smaller than for CM-SAF results.
We also looked into the yearly dependency of bias and KSS (Fig. 5, left). The bias of CM-SAF results shows no distinct yearly variation and remains slightly positive throughout the analyzed period. The Kuiper skill score is slightly higher during the summer half-year from April to October. The FUB results on the other hand are different: both the bias and the Kuiper skill score show a more pronounced seasonal dependency with a slightly negative bias during the summer and a positive bias at the beginning of 2006. The skill score is low in the first quarter of 2006 but then increases to the same level as observed for CM-SAF. The seasonal behavior of KSS and bias seems to be strongly influenced by the fact that nighttime observations are dominating the winter season while daytime observations dominate the summer.
We also analyzed the performance of the satellite-based cloud coverage retrievals as a function of the cloud-base height, which is available as part of the synoptic records. Here, the CM-SAF performance is slightly better than the FUB performance. In general, the bias is negative for both very low and higher cloud-base heights (Fig. 5, right). The detection of very low clouds is limited because of the low contrast between modeled clear-sky brightness temperature and measured cloud-top brightness temperature. The decreased probability for detecting clouds with a cloud-base height ≥2.5 km may be caused by the parallax effect, which gets increasingly important for high clouds observed under large viewing zenith angles. This effect increases even though the problem of collocating satellite and ground station data is strongly mitigated, as only unambiguous synoptic observations of 0, 1, 7, and 8 octa (representative of a region that is generally much larger than an average SEVIRI pixel) were taken into account for this analysis. Furthermore, cirrus clouds typically fall into this category of clouds with cloud-base height ≥2.5 km, which is a hint that it is more difficult to detect these clouds.
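To give a feeling for the size of the parallax effect, the following sketch estimates the horizontal displacement of a cloud as seen from the satellite. It uses a flat-Earth approximation (displacement = height × tan of the viewing zenith angle), which is an assumption made here for illustration and ignores Earth curvature; the function name is hypothetical.

```python
import math


def parallax_shift_km(cloud_height_km, view_zenith_deg):
    """Approximate horizontal displacement of a cloud at the given height
    when viewed under the given zenith angle (flat-Earth approximation)."""
    return cloud_height_km * math.tan(math.radians(view_zenith_deg))


if __name__ == "__main__":
    # Displacement grows with both cloud height and viewing zenith angle.
    for height in (2.5, 10.0):
        for zenith in (20.0, 50.0, 70.0):
            shift = parallax_shift_km(height, zenith)
            print(f"h = {height:4.1f} km, zenith = {zenith:4.1f} deg -> {shift:5.1f} km")
```

Under this approximation the displacement equals the cloud height at a zenith angle of 45° and grows rapidly beyond that, which is consistent with the effect being most relevant for high clouds near the edge of the disk.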
2) Validation of daily and monthly mean values
Since CM-SAF provides daily and monthly averages of the cloud coverage as standard products for public use, we also show validation results involving these averaged products. We prepared daily and monthly cloud coverage validation results for the study as described in section 5b. These products are similar to standard CM-SAF products but they are calculated using only a small fraction of the full hourly resolution SEVIRI dataset (i.e., restricted to synoptic observation times). Nevertheless, only those datasets fulfilling the minimum data density requirements for building CM-SAF monthly and daily products (6 obs/day, 20 days/month) were taken into account. We calculated the bias and standard deviation of daily and monthly mean products. Most of the results shown are over land surfaces. Detailed results are summarized in Table 4.
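The data density requirements can be sketched as follows. The thresholds (6 observations per day, 20 days per month) are taken from the text; everything else is an assumption for illustration: the function names are hypothetical, the monthly mean is assumed to be the average of the valid daily means, and the population standard deviation is used for the differences.

```python
import statistics


def daily_mean(observations, min_obs=6):
    """Mean cloud fraction of one day, or None if fewer than min_obs values exist."""
    if len(observations) < min_obs:
        return None
    return sum(observations) / len(observations)


def monthly_mean(daily_observations, min_obs=6, min_days=20):
    """Mean of the valid daily means, or None if fewer than min_days days qualify.

    daily_observations maps a day identifier to that day's list of cloud fractions.
    """
    means = [daily_mean(obs, min_obs) for obs in daily_observations.values()]
    valid = [m for m in means if m is not None]
    if len(valid) < min_days:
        return None
    return sum(valid) / len(valid)


def bias_and_std(satellite, reference):
    """Bias (mean difference) and standard deviation of satellite minus reference."""
    diffs = [s - r for s, r in zip(satellite, reference)]
    return statistics.mean(diffs), statistics.pstdev(diffs)
```

Applying `bias_and_std` to collocated daily (or monthly) means from the satellite and from the synoptic records yields the two quantities of the kind reported in Table 4.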
Figure 6 shows the histograms of the daily mean cloud coverage results of the CM-SAF approach (black) and the FUB method (gray) relative to synoptic measurements in octa. Again, it becomes obvious that the CM-SAF cloud coverage results tend to be higher than corresponding synoptic measurements while the distribution of FUB results is almost symmetric around zero. Consequently, the FUB bias is close to zero. The standard deviation of FUB results over all surface types and for all categories is in the same range as for corresponding CM-SAF results.
A similar analysis of monthly mean cloud coverage results derived from SEVIRI observations and synoptic measurements gave similar results. The bias remains of the same order of magnitude while the standard deviation of monthly mean results is on average only half that of the daily mean results (see Table 4 for numbers).
b. Comparison of CM-SAF results against MODIS data
Satellite-to-satellite comparison may help to identify systematic errors as well as strengths and weaknesses of retrieval algorithms. It also avoids problems with multilayered cloudy scenes when a comparison of ground-based and satellite-based cloud observations might not provide meaningful results.
MODIS is a scanning, nadir-looking radiometer with narrow spectral channels in the visible, near-infrared, and infrared spectral domains. A general overview of MODIS cloud products is given in Platnick et al. (2003). We used cloud fraction data from Collection 5 MOD06 and MYD06 data (Menzel et al. 2006), which are presented as the percentage of cloudy 1-km pixels found in 5 × 5 1-km pixel groups. The basic MODIS cloud mask retrieval of MODIS footprints can be found in Ackerman et al. (1998). The cloud mask is then taken to calculate the cloud amount that was finally used in this study. We confined our study to observations on 1 August 2006 including 24 SEVIRI slots and about 12 MODIS passes over the Meteosat disk.
Typically, there is a time difference between MODIS overpass times (every 100 min) and hourly SEVIRI observations, and we try to compensate for the possible movement of cloud structures by further sampling the MODIS data into 25 × 25 km² boxes applying a boxcar average filter. The computation of skill scores then included only MODIS pixels that were either almost cloud free (≤1 octa) or almost fully cloud covered (≥7 octa) in such a grid box. Since we restrict the comparison to almost homogeneous scenes (either cloudy or cloud free), this running average will not bias the comparison in our favor. We then compared these MODIS gridbox results with instantaneous nearest-neighbor SEVIRI measurements, which were not further sampled. Note that the calculation of the bias was done for all MODIS pixels (i.e., there was no restriction with respect to the cloud coverage).
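The smoothing and scene selection can be sketched as a running boxcar mean followed by a classification into almost clear and almost overcast boxes. The octa thresholds are those given in the text. The window of 5 × 5 cells is an assumption derived from the 5-km MODIS cloud fraction cells and the 25-km box size; the truncated window at the grid edges and the function names are likewise choices made for this illustration.

```python
def boxcar_mean(grid, window=5):
    """Running boxcar average over a 2D list of cloud fractions (0..1).

    The window is truncated at the grid edges, so edge values are averages
    over fewer cells.
    """
    half = window // 2
    nrows, ncols = len(grid), len(grid[0])
    smoothed = []
    for i in range(nrows):
        row = []
        for j in range(ncols):
            values = [
                grid[ii][jj]
                for ii in range(max(0, i - half), min(nrows, i + half + 1))
                for jj in range(max(0, j - half), min(ncols, j + half + 1))
            ]
            row.append(sum(values) / len(values))
        smoothed.append(row)
    return smoothed


def classify_homogeneous(cloud_fraction):
    """Return 'clear' for <= 1 octa, 'cloudy' for >= 7 octa, otherwise None."""
    octa = cloud_fraction * 8.0
    if octa <= 1.0:
        return "clear"
    if octa >= 7.0:
        return "cloudy"
    return None  # inhomogeneous box, excluded from the skill scores
```

Boxes for which `classify_homogeneous` returns `None` would be excluded from the skill score computation but retained for the bias, in line with the note above.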
We observe a negative bias in the tropical belt, especially over land surfaces, while SEVIRI results tend to be higher over water surfaces and generally at larger observation zenith angles (Fig. 7, left). There is a small negative bias over land surfaces in the midlatitudes while the bias is positive over water surfaces in the same latitudinal belts. The skill scores are generally high over water surfaces but, as expected, values decrease toward the edges of the Earth disk where the bias becomes large (Fig. 7, right). Here, the MODIS and the SEVIRI observation geometries differ strongly and SEVIRI tends to overestimate the cloudiness. The viewing pathlength is longer for large viewing zenith angles, resulting in a greater probability that clouds are present, which improves the detection of thin clouds. Also, cloud structures in the foreground and closer to the sensor in the line-of-sight may overlap other cloud structures (or the cloud-free sky) in the background. In the following we will call this combined effect “scenery effect.”
The low skill scores over the eastern Mediterranean are due to the fact that the area remained almost cloud free throughout the entire day (Fig. 8), leading to a statistical distribution of clear and cloudy pixels that is very different from the mean statistical behavior of the squares defined. More data would be required to reduce the scatter and to provide conclusive results for this area. In such cases it would be more appropriate to describe the coincidence of results by the hit rate. However, we kept the KSS to provide comparable results in Figs. 2, 3, and 7. Since our comparison is based on several million collocated SEVIRI–MODIS pixels covering nearly all climatic zones during summer (Northern Hemisphere) and winter (Southern Hemisphere), we believe that our results presented in Table 5 are still representative even though several meteorological extreme situations like Saharan dust outbreaks over the tropical Atlantic Ocean were not observed on this specific day.
There is some evidence that convectively formed cirrus cloud shields (mainly caused by convective outflow/divergence over convective cells) in the tropical region are more frequently interpreted as being fractional clouds (only cloud-contaminated pixels) by SEVIRI while the MODIS retrieval gives opaque clouds. Partly cloudy pixels in the intertropical convergence zone are typically classified using the brightness temperature difference of channels 12.0 and 10.8 μm. This difference is very sensitive to thin cirrus, which occurs frequently in this region. However, as mentioned in section 5b, we treat fractional cloudy pixels as being fully cloud covered and therefore we cannot explain observed differences by such a misclassification of pixels. Results (bias, skill scores) are detailed in Table 5.
Bias (0.04) and KSS (0.85) are comparable to corresponding results involving synoptic data. A relatively high positive bias is found both over sea surfaces and during the day, while nighttime measurements are bias free at a KSS of 0.83.
c. Comparison of CM-SAF results against CALIOP data
The CALIPSO satellite mission was launched in April 2006 and the first data became available in August 2006. CALIPSO mission objectives are to provide profile information about cloud and aerosol particles and corresponding physical parameters. CALIPSO’s payload comprises a polarization-sensitive active lidar (CALIOP), a passive Infrared Imaging Radiometer (IIR), and a visible Wide Field Camera (WFC). CALIPSO is part of the A-train, a series of satellites flying in formation in sun-synchronous, low polar orbits 700 km above the ground. CALIOP measures the backscatter intensity at 1064 nm while two other channels measure the orthogonally polarized components of the backscattered signal at 532 nm. The WFC is a nadir-viewing imager with a single spectral channel (620–670 nm), selected to match band 1 of the MODIS/Aqua instrument. The IIR is a nadir-viewing, nonscanning imager having a 64 km × 64 km swath with a pixel size of 1 km. The CALIOP beam is nominally aligned with the center of the IIR image. The infrared imager provides measurements at three channels in the thermal infrared window region at 8.7, 10.6, and 12.0 μm. See Winker et al. (2007) for more information about the CALIOP instrument and McGill et al. (2007) and references therein for details about the cloud parameter retrieval.
It should be emphasized that SEVIRI and CALIOP basically measure different quantities: SEVIRI passively retrieves emitted and reflected radiation from the Earth–atmosphere system in narrow wavelength bands while CALIOP results are based on the analysis of backscattered (polarized and unpolarized) radiation at single wavelengths from an active remote sensing system in the VIS and NIR spectral bands. One has to take into account that the active CALIOP instrument is much more sensitive to cloud contamination than the passive imager instruments. CALIOP footprints (enhanced for visibility) and the local mean time of observations are shown in Fig. 9 (left). It becomes obvious that only midday and midnight observations can be compared with SEVIRI measurements, which is a consequence of the sun-synchronous polar orbit of CALIPSO.
We compared SEVIRI and CALIOP results from 1 August 2006 over the visible SEVIRI disk. CALIOP footprints are small stripes of around 100-m width. These footprints are overlaid by the IR imager data. To minimize the impact of a possible time difference of about 30 min between CALIOP and SEVIRI observations, we used the IR imager data at 10.6 μm to first select homogeneous scenes that were then compared with SEVIRI results. To this end, we defined a stripe of 21 pixels around the lidar track (10 at each side of the lidar track plus a center pixel matching the track) and calculated the standard deviation of the IIR radiances. A low standard deviation then indicates a homogeneous scene. We defined a threshold value of 0.5 W m⁻² μm⁻¹ sr⁻¹ (for the standard deviation) to remove stripes with large scatter from the collocation study. The threshold value is a compromise between the number of remaining pixels and observed inhomogeneities. The CALIOP level-2 product also provides information about the number of atmospheric layers that can be distinguished with respect to their physical properties. To avoid ambiguities in interpreting multilayered cloud scenes, only those pixels were considered where just one cloud layer was found. As an additional criterion for homogeneity, we took only those CALIOP measurements with either no clouds or just one cloud layer within 21 km along the line of measurements. Note that this strict filtering was implemented to obtain a dataset containing fully cloud-free or fully cloud-covered scenes only. Since we filtered inhomogeneous results in the previous step, we could apply a simple nearest-neighbor approach in space and time collocating SEVIRI and CALIOP pixels. This means that for each remaining CALIOP pixel fulfilling the homogeneity criteria we search for the closest SEVIRI pixel with the smallest temporal difference between observation times.
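A minimal sketch of the scene selection and collocation is given here. The stripe width of 21 pixels, the threshold of 0.5 for the standard deviation, and the single-layer criterion come from the text. The function names, the tuple layouts, the use of the population standard deviation, the great-circle distance on a spherical Earth, and the ordering of the nearest-neighbor search (closest in space first, ties broken by the smallest time difference) are assumptions made for illustration only.

```python
import math
import statistics


def is_homogeneous(stripe_radiances, n_cloud_layers, threshold=0.5, stripe_width=21):
    """True if the cross-track IIR stripe has low scatter and the lidar
    reports at most one cloud layer."""
    if len(stripe_radiances) != stripe_width:
        return False
    if n_cloud_layers > 1:
        return False
    return statistics.pstdev(stripe_radiances) <= threshold


def great_circle_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Haversine distance between two points on a spherical Earth."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * radius_km * math.asin(min(1.0, math.sqrt(h)))


def nearest_seviri(caliop_pixel, seviri_pixels):
    """Return the SEVIRI pixel closest to the CALIOP pixel.

    caliop_pixel: (lat, lon, time_min)
    seviri_pixels: list of (lat, lon, time_min, pixel_id)
    """
    lat, lon, t = caliop_pixel
    return min(
        seviri_pixels,
        key=lambda s: (great_circle_km(lat, lon, s[0], s[1]), abs(t - s[2])),
    )
```

Only CALIOP pixels for which `is_homogeneous` returns `True` would be passed on to `nearest_seviri`.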
As CALIOP pixels are much smaller than SEVIRI pixels, several different CALIOP pixels may fall into one SEVIRI pixel. Because of the sensors’ different viewing geometries, this effect increases for large viewing zenith angles of SEVIRI. To avoid overrepresentation of these areas, we therefore rejected such duplicated pixels.
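The rejection of duplicates can be sketched as follows. The text does not state which of several CALIOP pixels falling into one SEVIRI pixel is retained; keeping the first match encountered along the track is an assumption made here, and the function name and input layout are hypothetical.

```python
def drop_duplicate_matches(matches):
    """Keep only the first CALIOP match per SEVIRI pixel.

    matches: iterable of (caliop_id, seviri_id) pairs in along-track order.
    """
    seen = set()
    kept = []
    for caliop_id, seviri_id in matches:
        if seviri_id in seen:
            continue  # this SEVIRI pixel is already represented
        seen.add(seviri_id)
        kept.append((caliop_id, seviri_id))
    return kept
```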
Then we compared SEVIRI and CALIOP results in the same way as described in section 6b. Note, however, that CALIOP observations of a single day are not homogeneously distributed over land and sea surfaces. As a first result, we see that SEVIRI cloud fractional cover results are lower than CALIOP results over land surfaces but are higher over water surfaces (Fig. 9, right), similar to the MODIS–SEVIRI results. There are low probabilities that cloud-free SEVIRI pixels are confirmed by corresponding CALIOP observations [column P(cfca|cfse), 0.64, in Table 6] and that CALIOP cloud-free scenes are classified as cloud-free SEVIRI observations [column P(cfse|cfca), 0.60, in Table 6]. This may partly be explained by the characteristics of the CALIOP instrument: on the one hand, CALIOP is much more sensitive to clouds than SEVIRI. One of CALIOP’s strengths is its potential to detect extremely thin clouds. Therefore, it can be expected that a large number of cloud-free SEVIRI classifications cannot be confirmed by CALIOP. On the other hand, CALIOP’s small footprint size allows for detection of small-scale broken clouds but also of small-scale cloud-free holes in a cloud deck. These cases should have been filtered out by the homogeneity criteria mentioned above. However, the screening for homogeneous cases using the IIR measurements can only partly compensate for this.
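The two conditional probabilities quoted from Table 6 can be computed from collocated clear-sky flags as in the following sketch. The function name and the input layout are hypothetical; only the definition of the probabilities follows the column labels P(cfca|cfse) and P(cfse|cfca).

```python
def clear_sky_agreement(pairs):
    """Return (P(cfca|cfse), P(cfse|cfca)) from (seviri_clear, caliop_clear) pairs.

    P(cfca|cfse): fraction of cloud-free SEVIRI pixels confirmed by CALIOP.
    P(cfse|cfca): fraction of cloud-free CALIOP scenes classified cloud free by SEVIRI.
    A probability is None when its conditioning class is empty.
    """
    pairs = list(pairs)
    both_clear = sum(1 for se, ca in pairs if se and ca)
    n_seviri_clear = sum(1 for se, _ in pairs if se)
    n_caliop_clear = sum(1 for _, ca in pairs if ca)
    p_ca_given_se = both_clear / n_seviri_clear if n_seviri_clear else None
    p_se_given_ca = both_clear / n_caliop_clear if n_caliop_clear else None
    return p_ca_given_se, p_se_given_ca
```

The two values differ whenever the sensors flag a different number of scenes as cloud free, which is why both columns are reported separately.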
There are also different results for daytime and nighttime observations: while SEVIRI overestimates the cloudiness during the day, the opposite is true for nighttime observations. A negative bias is found over tropical zones while for areas outside the tropics the bias is mostly neutral (over land) or positive and increasing toward the edges of the visible Meteosat disk, which is in line with previous results. Skill scores are generally low without a pronounced regional dependency, except over high latitudes where values decrease rapidly. Here, the increasing pixel size of SEVIRI and the scenery effect (for SEVIRI) make it difficult to reasonably compare the soundings of the nadir-viewing CALIOP instrument and the geostationary SEVIRI sensor.
We performed a comparison of SEVIRI-based cloud coverage results with synoptic measurements and the corresponding results of two other satellite instruments, MODIS and CALIOP. We analyzed instantaneous results at pixel resolution as well as daily and monthly mean products at reduced spatial resolution.
SEVIRI-based cloud fractional cover results over the entire Meteosat disk typically agree within 1-octa cloudiness with ground-based synoptic observations. The CM-SAF results tend to overestimate the cloud coverage over sea where contrasts between clouds and the ground in the solar spectral range are high. In the thermal spectral range the spatially homogeneous and smoothly changing sea surface allows the use of well-defined thresholds, providing a better cloud detection compared to over land surfaces. However, this also leads to the identification of a higher number of pixels with only partial cloud coverage, which complicates the calculation of the total cloud cover. Furthermore, the CM-SAF retrieval overestimates the cloudiness at large observation angles of more than 75° while the opposite effect is observed over the tropical belt where observations are made in near-nadir viewing mode. Differences are in both cases up to 20%. The alternative FUB algorithm provides similar results over land surfaces with a less pronounced increase of the bias toward the edges of the SEVIRI disk but provides an underestimation of the cloud coverage over water surfaces. The overall performance of instantaneous CM-SAF CFC results is expressed by a Kuiper skill score of 0.78 and a positive bias of 0.05. The corresponding FUB results are 0.75 and −0.01.
Daily and monthly mean values were further calculated for both ground-based and space-based records. The results confirm that CM-SAF satellite measurements overestimate the cloud coverage over sea surfaces while some underestimation is found over land. We also found a negative bias of CFC in the tropical belt that is compensated (for the “overall” scenario) by a positive bias toward the edges of the Meteosat disk. The best performance of CFC is found over the northern midlatitudes, mainly over European land surfaces. We could not find a seasonal trend of the bias but the standard deviation is higher during wintertime, which was expected since observation conditions for SEVIRI are then less favorable at high latitudes. There is a seasonal dependency of both the bias and the standard deviation of FUB results. We found a positive bias for the winter period and a negative bias for the summer months while the increase of the standard deviation during the winter is more pronounced than for CM-SAF results. It seems reasonable that the seasonal dependency is caused by a varying fraction of the number of daytime to nighttime synoptic observations during the year. In our opinion, it remains difficult to favor one of the algorithms analyzed in this study for climate applications. The CM-SAF approach certainly has the advantage of a seasonally constant bias but suffers from different performances over land and sea surfaces. In contrast, these are the weaknesses and strengths of the alternative FUB approach, which in addition does not depend on NWP data.
A comparison of SEVIRI CFC from CM-SAF with MODIS and CALIOP CFC values gave similar results. We observe a trend toward higher SEVIRI CFC values over water surfaces relative to MODIS and CALIOP while the opposite is true over land surfaces mainly in the tropics. It seems that MODIS captures more and especially optically thin cirrus clouds, which is due to optimized narrow sounding channels in the infrared spectral region and the better horizontal resolution. Also CALIOP measurements are more sensitive to the occurrence of clouds and coincident measurements of cloudy scenes are more likely than for cloud-free scenes.
There is some evidence that the spatial and temporal matching of SEVIRI and CALIOP observations is more problematic, which causes lower skill scores in comparison to the MODIS–SEVIRI case. CALIOP footprints are much smaller, especially over high latitudes where SEVIRI pixels are large because of the viewing geometry. There the increasing scenery effect for SEVIRI hampers the retrieval, which finally leads to an overestimation of the SEVIRI cloud coverage. It is planned to enhance the comparison against CALIOP data, which are seen as a reference remote sensing system concerning the retrieval of cloud properties.
The pronounced increase of the bias toward the edges of the Earth disk that is seen for all comparisons is not necessarily caused by an erroneous cloud detection. It is at least partly an expected feature caused by the different viewing geometries of a geostationary satellite and a surface observer or a polar-orbiting satellite, respectively. Clouds may be identified correctly by SEVIRI but are projected wrongly so that the horizontal cloud coverage is larger than in reality just by geometrical viewing effects.
Future work will be dedicated to ongoing quality control using synoptic observations, more comprehensive satellite/satellite intercomparison studies, and time series analysis of cloud parameters derived from the series of SEVIRI sensors that will be launched in the following years. Here, we face the problem of calibrating the radiances of slightly different sensors, although identical in construction. Nonetheless it is the mandate of CM-SAF to prepare such corrected time series of, for example, the fractional cloud coverage as the basis of further analysis of climatic changes.
This work was funded by the Satellite Application Facility on Climate Monitoring within the framework of EUMETSAT’s SAF network. The work of the EUMETSAT secretary is gratefully acknowledged. We thank the NWC-SAF consortium for providing the MSG/SEVIRI retrieval package. We further thank our reviewers for their helpful and valuable comments to improve this work. The colleagues at Deutscher Wetterdienst kindly provided NWP analysis data. Many thanks are also given to Julia Reuter for proofreading the manuscript.
* Current affiliation: Institute of Environmental Physics, University of Bremen, Bremen, Germany.
+ Current affiliation: EUMETSAT, Darmstadt, Germany.
Corresponding author address: Werner Thomas, Deutscher Wetterdienst, P.O. Box 10 04 65, D-63004 Offenbach, Germany. Email: email@example.com