Latent heat fluxes (LHF) play an essential role in the global energy budget and are thus important for understanding the climate system. Satellite-based remote sensing permits a large-scale determination of LHF, which, among others, are based on near-surface specific humidity $q_a$. However, the $q_a$ random retrieval error ($E_{tot}$) remains unknown. Here, a novel approach is presented to quantify the error contributions to pixel-level $q_a$ of the Hamburg Ocean Atmosphere Parameters and Fluxes from Satellite Data, version 3.2 (HOAPS, version 3.2), dataset. The methodology makes use of multiple triple collocation (MTC) analysis between 1995 and 2008 over the global ice-free oceans. Apart from satellite records, the collocated datasets include selected ship records extracted from the Seewetteramt Hamburg (SWA) archive and the International Comprehensive Ocean–Atmosphere Data Set (ICOADS), serving as the in situ ground reference. The MTC approach permits the derivation of $E_{tot}$ as the sum of model uncertainty $E_M$ and sensor noise $E_N$, while random uncertainties due to in situ measurement errors ($E_{ins}$) and collocation ($E_C$) are isolated concurrently. Results show an $E_{tot}$ average of 1.1 ± 0.3 g kg−1, whereas the mean $E_C$ ($E_{ins}$) is in the order of 0.5 ± 0.1 g kg−1 (0.5 ± 0.3 g kg−1). Regional analyses indicate a maximum of $E_{tot}$ exceeding 1.5 g kg−1 within humidity regimes of 12–17 g kg−1, associated with the single-parameter, multilinear $q_a$ retrieval applied in HOAPS. Multidimensional bias analysis reveals that global maxima are located off the Arabian Peninsula.
Besides shortwave and longwave radiative fluxes, the heat transfer between ocean and atmosphere is composed of turbulent sensible and latent heat fluxes (SHF and LHF, respectively). On a global average, LHF represents the primary contributor for compensation of the ocean’s energy gain by radiation fluxes over the ocean (Schulz et al. 1997) and hence for the closure of the surface energy budget. LHF considerably influences the oceanic heat balance and represents a vital source in terms of altering the atmospheric circulation and the overall hydrological cycle on seasonal to multidecadal time scales (Chou et al. 2004). The understanding of the underlying physical processes crucially depends on the ability to accurately measure the ocean surface heat fluxes. The latest assessment report of the Intergovernmental Panel on Climate Change (IPCC), for example, underpins the role of heat transfer between ocean and atmosphere in driving the oceanic circulation. It stresses that flux anomalies can impact water mass formation rates and alter oceanic and atmospheric circulation (IPCC 2013).
Thus, reliable long-term global LHF climate data records are needed to overcome this issue, serving as a verification source for coupled atmosphere–ocean general circulation models and climate analysis (Schulz et al. 1997). Similarly, LHF datasets represent a substantial input component to assimilation experiments, such as the oceanic synthesis performed by the German contribution to Estimating the Circulation and Climate of the Ocean (GECCO; e.g., Köhl and Stammer 2008).
Owing to a large spatial and interannual variability, as well as spatial and temporal undersampling, Andersson et al. (2011) elucidate that in situ LHF measurements remain troublesome over the global ocean. Conclusions within the Fifth Assessment Report (AR5; IPCC 2013) also mention the insufficient quality of in situ observations when it comes to an assessment of turbulent heat flux changes. Although voluntary observing ships (VOS) provide the longest available in situ record, Gulev et al. (2007) stress that VOS-based surface fluxes suffer from uncertainties associated with the ship observations, applied bulk aerodynamic algorithms, and the approach used to produce surface flux fields. Owing to this, random sampling uncertainties in LHF amount to several tens of watts per square meter (W m−2) in poorly sampled high latitudes (Gulev et al. 2007).
Despite global coverage and high temporal resolutions, global atmospheric reanalyses have weaknesses, such as those associated with a lack of spatial detail (Winterfeldt et al. 2010). Reanalysis products are known to exhibit shortcomings in remote regions due to little in situ ground reference data. In consequence, they are dominated by the atmospheric model (Gulev et al. 2007). In well-sampled regions, by contrast, the reanalysis fields are strongly constrained by observations.
To overcome the addressed issues, high-quality remote sensing datasets are of supplementary need. Several of these are currently available, incorporating LHF-related parameters. They comprise, for example, data of the climate Goddard Satellite-based Surface Turbulent Fluxes, version 3 (GSSTF3; Shie et al. 2012); the French Research Institute for Exploitation of the Sea [L’Institut Français de Recherche pour l’Exploitation de la Mer (IFREMER; Bentamy et al. 2003)]; the Japanese Ocean Flux Data Sets with Use of Remote Sensing Observations, version 2 (J-OFURO2; Kubota et al. 2002); the SeaFlux, version 1, dataset (Clayson et al. 2015); and the Hamburg Ocean Atmosphere Parameters and Fluxes from Satellite Data (HOAPS) dataset (Andersson et al. 2010; Fennig et al. 2012). Their retrievals include a bulk aerodynamic algorithm to parameterize LHF in terms of observed mean quantities, that is, bulk variables (e.g., Fairall et al. 2003).
HOAPS is a completely satellite-based climatology of precipitation, evaporation, related turbulent heat fluxes, and atmospheric state variables over the global ice-free oceans. The usefulness of the HOAPS climatology has been tested among numerous intercomparison studies and promising results have been published within Kubota et al. (2003), Bourras (2006), Klepp et al. (2008), Winterfeldt et al. (2010), and Andersson et al. (2011).
Bulk aerodynamic algorithms have a primary dependency on specific humidity $q_a$. Its accuracy directly impacts the uncertainty of the derived LHF. The Global Climate Observing System (GCOS 2010) has declared the near-surface specific humidity as an essential climate variable (ECV), indicating its prominent role in the context of climate analysis (Prytherch et al. 2014). However, the remote sensing of $q_a$ remains challenging. The retrieval process is complicated, as the measured signal originates from relatively thick atmospheric layers (e.g., Schulz et al. 1997). Several studies have highlighted the importance of the uncertainties in $q_a$ when investigating satellite-based LHF discrepancies (e.g., Andersson et al. 2011; Bentamy et al. 2013; Bourras 2006; Smith et al. 2011), implying a high potential for improvement. Furthermore, satellite validation analysis is per se difficult due to the lack of knowledge of the “truth” (e.g., Zwieback et al. 2012) and the introduction of representativeness and collocation errors, owing to poor spatial coverage of in situ measurements (Scipal et al. 2010).
To improve our understanding of uncertainties in satellite products, the triple collocation (TC) technique (e.g., O’Carroll et al. 2008) has been developed and applied. TC is based on three individual datasets and allows for isolating uncertainties of the underlying datasets. The set of equations resulting from such a single TC analysis permits solving for a maximum of three unknown errors. However, the number of random uncertainties inherent in the SSM/I instruments (model error $E_M$ and noise error $E_N$) and the collocation procedure (random in situ error $E_{ins}$ and collocation error $E_C$) equals four.
Within the framework of a random error characterization of HOAPS $q_a$, it will be demonstrated how to overcome this issue by extending the traditional TC analysis of O’Carroll et al. (2008) to a multiple TC (MTC), based on two triplets of SSM/I and in situ records. This allows for the decomposition of the overall random uncertainty in $q_a$ into estimates of $E_M$ and $E_N$. Their sum represents the random retrieval error $E_{tot}$. Terms $E_C$ and $E_{ins}$ are quantified analogously. The results constitute a fundamental basis for a full error characterization of HOAPS LHF-related parameters, which will enhance the analysis potential of HOAPS in future scientific studies.
Section 2 presents the applied data sources in more detail and introduces the MTC method. Section 3 shows the results of the analyses, which include investigations of latitudinal and seasonal error dependencies, as well as their hot spots. Findings are related to recent publications within section 4, which also includes a qualitative comparison of the advantages and drawbacks of the applied data and the MTC approach.
2. Data and methodology
1) HOAPS-S data records
Apart from the sea surface temperature (SST), all HOAPS parameters are derived from intercalibrated Special Sensor Microwave Imager (SSM/I) passive microwave radiometers, which are installed aboard the satellites of the U.S. Air Force Defense Meteorological Satellite Program (DMSP). Therefore, HOAPS provides consistently derived global fields of freshwater flux–related parameters, avoiding cross-calibration uncertainties between different types of instruments. The current HOAPS version includes SSM/I records between 1987 and 2008, during which a total number of six instruments were in operational mode.
The SSM/I measurements are characterized by a conical scan pattern, where the antenna beam intersects the earth’s surface at an incidence angle of 53.1° and the swath width spans roughly 1400 km. The radiometers measure emitted and reflected thermal radiation from the earth’s surface and the atmosphere in the form of upwelling microwave brightness temperatures ($T_B$) at four different frequencies, namely, 19.35, 22.2, 37.0, and 85.5 GHz. Whereas the 22.2-GHz channel considers only the vertically polarized signal, the remaining three channels measure both horizontally and vertically polarized signals (Hollinger et al. 1990). The channel footprints vary with frequency, ranging from elliptic 43 × 69 km2 (cross track/along track) at 19.35 GHz to rather circular 13 × 15 km2 at 85.5 GHz. Each instrument completes one orbit within 102 min, implying that approximately 14 orbits per day are performed, allowing for 82% of global coverage between 87.5°S and 87.5°N within 24 h. Because of the inclined orbit of the satellites, a spatial coverage of 100% is reached after 3 days.
Here, the focus lies on the scan-based HOAPS (HOAPS-S), version 3.2, data record (Andersson et al. 2010; Fennig et al. 2012), which contains the HOAPS geophysical parameters in the SSM/I sensor resolution. HOAPS-S is based on a prerelease of the Satellite Application Facility on Climate Monitoring (CM SAF) SSM/I Fundamental Climate Data Record (FCDR). Its extensive documentation, including product user manual, validation report, and algorithm theoretical basis document, is available online (Fennig et al. 2013). Compared to HOAPS-3, HOAPS-3.2 has been temporally extended until 2008 and is based on a reprocessed SSM/I FCDR. This reprocessing included a homogenization of the radiance time series by means of an improved intersensor calibration with respect to the DMSP F11 instrument. Earth incidence angle normalization corrections were applied, following a method described by Fuhrhop and Simmer (1996). Starting with the most recent release (HOAPS-3.2), the HOAPS freshwater flux climatology is now hosted by the EUMETSAT CM SAF, whereupon its further development is shared with the University of Hamburg and the Max Planck Institute for Meteorology (MPI-M), Hamburg, Germany.
The HOAPS near-surface $q_a$ relies on a direct four-channel retrieval algorithm by Bentamy et al. (2003), which is based on a modified version of the two-step multichannel regression model by Schulz et al. (1993) and its refinement by Schlüssel (1996). The underlying inverse model is based on linear regression between ship-based $q_a$ and $T_B$, the former being linearly related to the integrated water vapor content. In comparison to earlier model versions, considerable regional and seasonal biases were removed due to revised regression coefficients. Compared to Schulz et al. (1993, 1997), Bentamy et al. (2003) achieved a bias reduction of 15% and registered an overall root-mean-square error (RMSE) of 1.4 g kg−1 (originally 1.70 g kg−1).
From 1995 onward, records of up to three simultaneously operating SSM/I instruments are available (see Fig. 2 in Andersson et al. 2010). As the MTC method relies on multiple SSM/I being in operational mode concurrently, the analysis is restricted to the time period from 1995 to 2008, excluding data prior to 1995 due to a comparatively poor in situ data coverage.
2) SWA-ICOADS ship data records
Hourly in situ data originate from the marine meteorological data archive of the German Meteorological Service [Deutscher Wetterdienst (DWD)], supervised by the Seewetteramt Hamburg (SWA, part of DWD). It comprises global high-quality shipborne measurements, as well as data provided by drifting and moored buoys. In the case of data gaps within the SWA archive, the in situ data basis was extended at SWA by available International Comprehensive Ocean–Atmosphere Data Set (ICOADS) measurements (version 2.5; Woodruff et al. 2011). These records contain hourly global measurements obtained from ships, moored and drifting buoys, and near-surface measurements of oceanographic profiles.
ICOADS estimates of $q_a$ are based on wet-bulb temperature measurements, typically using mercury thermometers, which are often exposed in either (ventilated) screens or sling psychrometers (Kent et al. 2007). Depending on the period, the thermometers are also placed in aspirated and whirling psychrometers. Term $q_a$ is eventually derived by applying the psychrometric formula. More information on VOS metadata and sensor types is given in Kent et al. (2007).
Several quality checks were performed at SWA prior to the merged SWA-ICOADS data usage, which permitted a quality index assignment to each observation. The procedure is briefly described in the following.
To ensure the maximum degree of reliance, the SWA-ICOADS dataset underwent a flagging procedure based on a verification scheme. Investigated and possibly corrected features included a verification of the geographical position and, if given, the direction of travel. A subsequent calculation of the ship speed allowed for a consistency check of the spatial distances between subsequent measurements. Distances exceeding individually defined tolerance levels were discarded from further analysis. Next, climatological threshold checks were performed for the parameters air temperature, dewpoint temperature, sea surface pressure, SST, and wind speed. These thresholds were defined on the basis of the ERA-Interim dataset (Dee et al. 2011). Temporal outliers and repetitive values were identified and removed. Subsequently, inner consistency checks were carried out, which also involved the identification of unphysical relations between different parameters. In the final step, spatial checks were applied to the aforementioned parameters to reject values that exceeded a maximum distance (individually defined for each parameter) to neighboring ship reports. The final outcome of all consistency checks was converted to internationally recognized quality flags [see standards defined by the World Meteorological Organization (WMO)].
Only ship records from the merged SWA-ICOADS database are selected for the subsequent analysis, in order to have a consistent, globally distributed dataset as the ground reference. This decision is legitimate due to the vast amount of available in situ measurements and prevents blending data originating from different kinds of platforms. The approach of ship measurements (in situ, as of now) as a ground comparison has been widely accepted and forms the basis of numerous other collocation analyses performed to date (e.g., Iwasaki and Kubota 2012; Jackson et al. 2006). To minimize their underlying error, only so-called special (e.g., research vessels) and merchant vessels are extracted. Compare WMO (2013) for more information on the ship categorizations. In addition, only elements that appear to be correct (WMO quality flag 1) are considered during further analysis.
For comparison, MTC analysis using only buoy records was performed, which did not significantly change the magnitudes of the decomposed random errors (not shown). This conclusion may not apply to systematic uncertainties, suggesting the inclusion of buoy records when it comes to HOAPS bias analysis.
A height correction of the in situ humidities to the HOAPS reference (10 m MSL, assuming neutral stability) is not performed, although this could be done by means of VOS metadata (WMO 2013). The correction is not performed, as the introduced uncertainty, owing to the intermittent violation of the equivalent neutral stability assumption, may mask or even exceed the expected improvement associated with the bias correction. To qualitatively assess the impact of height adjustments of different complexity on $q_a$, an investigation of collocated ship-based $q_a$ values originating from matchups of a subset of SWA-ICOADS and ERA-Interim data between 1995 and 2004 was carried out. An average ship-based measurement height of 18 m was chosen (Kent et al. 2014). Over the Baltic Sea, which is representative of an extratropical ocean basin, the absolute correction to 10 m results in an increase of only 0.1 ± 0.2 g kg−1 [full stability correction; 0.1 ± 0.1 g kg−1 (neutral stability correction)], performed on the basis of a turbulence algorithm without SST correction (Bumke et al. 2014). This correction-induced increase lies within the uncertainty range suggested by Kent et al. (2014).
Indeed, Jackson et al. (2009) found an increase of $q_a$ by more than 0.2 g kg−1 when comparing inversion-corrected AMSU-A and SSM/I (AMMIc) retrievals to original and subsequently to height-corrected ICOADS ship-based $q_a$. However, it led to an even larger bias of −0.29 g kg−1 (0.47 g kg−1) and slightly larger RMSE in comparison to uncorrected in situ measurements. This supports the argument that random variability is introduced by the height correction itself due to its dependency on the correction algorithm and associated (estimated) input bulk variables. Similar findings are published in Berry and Kent (2011), who argue that the height adjustment may be masked by the natural variability of $q_a$ (their Fig. 6). A respective noise increase is also presented in Prytherch et al. (2014). Kent and Berry (2005) show that the random error estimates are on average reduced by 8% (or 7%), if the full stability-dependent height correction is carried out (or assuming neutral stability). However, in comparison to the calculated total random error of 1.1 ± 0.1 g kg−1 published in Kent et al. (1999), this corresponds to an error reduction of just 0.1 g kg−1. This finding, combined with those presented in Jackson et al. (2009) and Berry and Kent (2011), justifies the conservation of the original in situ $q_a$ within this work.
b. Previous publications involving TC
The need for TC-based error estimates related to different geophysical datasets was first realized by Stoffelen (1998), who suggested its application for the calibration of the European Remote-Sensing Satellite-1 (ERS-1) scatterometer winds using wind speeds originating from the National Oceanic and Atmospheric Administration (NOAA) buoys and forecast model winds from the National Centers for Environmental Prediction (NCEP). Similarly, Caires and Sterl (2003) carried out TC analysis to validate significant wave height and wind speed fields from ERA-40 against altimeter measurements of buoys, ERS-1, and the Ocean Topography Experiment (TOPEX/Poseidon, NASA). Janssen et al. (2007) applied the TC method for wave height analyses. The introduction of the TC method into the field of satellite-based soil moisture research (Scipal et al. 2010) demonstrates the approach’s potential for a wide range of applications.
The strategy of this study to apply MTC analysis to HOAPS follows that of O’Carroll et al. (2008), who collocated data from the Advanced Along-Track Scanning Radiometer (AATSR), Advanced Microwave Scanning Radiometer for Earth Observing System (AMSR-E), and buoy SST to successively derive the standard deviation of error on each observation type.
c. MTC methodology
The satellite error decomposition based on MTC analysis relies on matchups of triplets involving both SSM/I and in situ records. These triplets are created on the basis of conventional double collocation in a first step, resulting in paired matchups of HOAPS and ship records between 60°S and 60°N. The collocated pairs are based on the so-called nearest neighbor approach; that is, HOAPS pixels are assigned to respective ship observations closest in time and space.
Ship records and up to three simultaneously available SSM/I instruments eventually allow for performing MTC analysis. A setup sketch of the triplets contributing to the MTC is shown in Fig. 1 (left panel). Triplets incorporating two independent ship measurements and one HOAPS pixel represent the first TC setup (left-hand side, V1 as of now), whereas a single ship record and two HOAPS pixels of independent SSM/I instruments form the second triplet structure (right-hand side, V2 as of now). In the case of V1, matchups incorporating two separate measurements obtained from the same vessel are excluded from further analysis. Although representing a major constraint in terms of amounts of available data, this approach ensures a complete independence of both in situ records. Figure 1 (right panel) shows the distribution of the overall V1 triplet amounts. Clearly, the in situ data density is highest in midlatitudinal, coastal regions.
Temporal and spatial collocation thresholds are set to 180 min and 50 km, respectively, following a statistical investigation by Kinzel (2013). For this, the author analyzed temporal decorrelation lengths of hourly ship $q_a$ between 1995 and 1997, exemplarily for R/V Polarstern. The analysis was confined to the midlatitudes, as these regions cover the tracks of extratropical storms, which are associated with the largest fluctuations of LHF-related parameters in time (e.g., Romanou et al. 2006). Specifically for $q_a$, Kinzel (2013) obtained a temporal decorrelation scale of approximately 6 h. Assuming an average ship speed of 15–20 km h−1, this resulted in a spatial decorrelation scale of 90–120 km. These numbers are well above the chosen collocation thresholds.
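The nearest-neighbor matching under these thresholds can be sketched as follows. This is a minimal illustration rather than the HOAPS processing code: the record layout (dictionaries with `lat`, `lon`, and `t` in minutes), the function names, and the choice to rank candidates by distance once they fall inside the time window are assumptions made here. Only the 180-min and 50-km thresholds come from the text.

```python
import math

EARTH_RADIUS_KM = 6371.0
MAX_DT_MIN = 180.0    # temporal collocation threshold from the text
MAX_DIST_KM = 50.0    # spatial collocation threshold from the text


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_pixel(ship, pixels):
    """Return the closest pixel that satisfies both thresholds, or None.

    Hypothetical record layout: dicts with 'lat', 'lon' (degrees) and
    't' (minutes since an arbitrary epoch). Ranking by distance alone
    is an assumption of this sketch.
    """
    best, best_dist = None, None
    for px in pixels:
        if abs(px["t"] - ship["t"]) > MAX_DT_MIN:
            continue  # outside the temporal window
        dist = haversine_km(ship["lat"], ship["lon"], px["lat"], px["lon"])
        if dist > MAX_DIST_KM:
            continue  # outside the spatial window
        if best is None or dist < best_dist:
            best, best_dist = px, dist
    return best
```

A ship report with no pixel inside both windows simply yields no matchup and drops out of the collocated dataset.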
As the representation of various atmospheric states should be the same for both V1 and V2, TC V2 triplets are considered only if their ship record and either one of the participating HOAPS pixels contribute to V1 as well.
Triplets including outliers are rejected from further analysis on the basis of 3σ standard deviation tests. Ship measurements within V1 and V2 represent the in situ ground reference during this filtering process.
Subsequently, a bias correction with respect to the in situ source is performed. Its importance for TC analysis is highlighted in, for example, O’Carroll et al. (2008). It implies that the results of the error decomposition exclusively contain random uncertainties, as the systematic error is removed.
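The outlier rejection and the bias correction can be sketched in a few lines. The sketch assumes that the 3σ test is applied to the satellite-minus-ship differences and that the bias is the mean of the remaining differences; the text only states that the ship data act as the reference, so both choices, along with the function name, are illustrative.

```python
import statistics


def filter_and_debias(sat, ship, n_sigma=3.0):
    """Reject n-sigma outliers, then remove the mean satellite bias.

    sat and ship are equally long lists of collocated humidity values.
    Testing the differences (rather than each series) is an assumption.
    """
    diffs = [s - r for s, r in zip(sat, ship)]
    mean_d = statistics.fmean(diffs)
    sd_d = statistics.pstdev(diffs)
    # Keep matchups whose difference lies within n_sigma of the mean difference.
    keep = [i for i, d in enumerate(diffs) if abs(d - mean_d) <= n_sigma * sd_d]
    # Systematic error relative to the in situ reference after filtering.
    bias = statistics.fmean([diffs[i] for i in keep])
    sat_corrected = [sat[i] - bias for i in keep]
    ship_kept = [ship[i] for i in keep]
    return sat_corrected, ship_kept, bias
```

After this step the mean difference between the two series is zero by construction, so the variances used later contain random contributions only.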
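In preparation for the satellite error decomposition, the variances of differences between two data sources x and y, $V_{xy}$, are quantified, following O’Carroll et al. (2008):

$$V_{xy} = \sigma_x^2 + \sigma_y^2 - 2 r_{xy} \sigma_x \sigma_y, \tag{1}$$

where $\sigma_x$ and $\sigma_y$ denote the error standard deviations of x and y and $r_{xy}$ is the correlation of their errors. That is, $V_{xy}$ is given by the sum of the individual variances, corrected by the error covariance. In case the errors of x and y are not totally independent, the respective covariance terms differ from zero and hence impact the satellite error decomposition.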
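The variance-of-differences relation can be checked numerically. The sketch below assumes two synthetic error series observing the same truth, so that the truth cancels in the difference; the function names and the chosen error magnitudes are illustrative and not taken from the paper.

```python
import random
import statistics


def variance_of_differences(x, y):
    """Population variance of the pairwise differences x - y."""
    return statistics.pvariance([a - b for a, b in zip(x, y)])


def variance_from_components(x, y):
    """Sum of the individual variances, corrected by the covariance term."""
    sx, sy = statistics.pstdev(x), statistics.pstdev(y)
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = statistics.fmean([(a - mx) * (b - my) for a, b in zip(x, y)])
    r = cov / (sx * sy)
    return sx ** 2 + sy ** 2 - 2.0 * r * sx * sy


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical independent error series (g kg-1) of two data sources.
    err_x = [rng.gauss(0.0, 1.0) for _ in range(5000)]
    err_y = [rng.gauss(0.0, 0.5) for _ in range(5000)]
    print(variance_of_differences(err_x, err_y))
    print(variance_from_components(err_x, err_y))
```

Both functions return the same number up to rounding, and for independent errors the covariance term is close to zero, so the result approaches the plain sum of the two variances.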
At this stage, the MTC approach requires the assumption of an error model underlying every data source, which allows for expressing each term shown in Eq. (1) as a sum of supposedly contributing random errors. The following error model setup for ships (s) and satellites (sat) is formulated as

$$\sigma_s^2 = E_{ins}^2, \tag{2a}$$

$$\sigma_{sat}^2 = E_M^2 + E_N^2. \tag{2b}$$

Recall that $E_{ins}$, $E_M$, and $E_N$ denote the random errors associated with the in situ measurement, the satellite retrieval model, and the sensor noise, respectively.
Given three independent data sources per TC version, Eq. (1) can be applied six times, requiring contributions of $E_C$. For this, the relative contribution of each data source to $E_C$ does not need to be specified for the MTC application and is thus arbitrarily assigned to either Eq. (2a) or Eq. (2b) before utilizing Eq. (1).
Terms $E_M$, $E_N$, and $E_C$ are assumed to be satellite independent. Regarding $E_M$, this is straightforward, as the exact same algorithm is applied to all SSM/I measurements to retrieve $q_a$. Concerning $E_N$, the SSM/I sensor sensitivities are shown in the aforementioned validation report (Fig. 2 in Fennig et al. 2013). The referenced figure does not indicate a dependency on the instruments. As to $E_C$, the double and triple collocations rely on constant collocation criteria and the channel-dependent footprint sizes do not differ among the instruments.
Given the magnitude of $V_{xy}$ on the left-hand side of Eqs. (3a)–(4c), the individual random errors can be quantified successively. To solve Eq. (4c) for $E_C$, it is a prerequisite to calculate $E_N$ synthetically by means of an arbitrary daily HOAPS-S record of $T_B$. For this, a random Gaussian noise with zero mean and a variance equal to the channel noise is simulated and subsequently added to the daily record. The assumption of Gaussian-distributed sensor sensitivities is widely accepted in literature and, for example, applied in Carsey (1992). Term $E_N$ represents the standard deviation of the difference between the original and the synthetically derived $q_a$ with a value of 0.3 g kg−1. As $E_N$ is a feature of the radiometer itself, it is independent of both platform and $q_a$ regime. Given $E_C$, $E_{ins}$ is derived via Eq. (3a). Subsequently, both $E_C$ and $E_{ins}$ suffice as input to solve Eqs. (3b)–(4b) for $E_M$. The resulting arithmetic mean of all four solutions is assumed to be the best estimate of $E_M$. This is reasonable, as a separate analysis revealed that the standard deviations among the four solutions are in the order of 0.02–0.18 g kg−1, corresponding to only 1%–16% of $E_M$ (not shown).
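The synthetic estimation of the sensor noise contribution can be illustrated with a toy linear four-channel retrieval. All numbers in the sketch (regression coefficients, offset, channel noise, and brightness temperatures) are hypothetical placeholders and not the HOAPS or Bentamy et al. (2003) values; only the procedure of perturbing the brightness temperatures with zero-mean Gaussian noise and taking the standard deviation of the resulting humidity differences follows the text.

```python
import random
import statistics

# Hypothetical regression coefficients (g kg-1 per K) and offset (g kg-1).
COEFFS = (0.05, -0.03, 0.02, 0.04)
OFFSET = 1.0
# Hypothetical channel noise standard deviations (K), one per channel.
CHANNEL_NOISE_K = (0.5, 0.5, 0.5, 0.5)


def retrieve_qa(tb):
    """Toy linear retrieval of humidity from four brightness temperatures."""
    return OFFSET + sum(c * t for c, t in zip(COEFFS, tb))


def synthetic_noise_error(tb_records, rng):
    """Standard deviation of (noisy minus original) retrieved humidity."""
    diffs = []
    for tb in tb_records:
        # Add zero-mean Gaussian noise with the channel noise as spread.
        noisy = [t + rng.gauss(0.0, s) for t, s in zip(tb, CHANNEL_NOISE_K)]
        diffs.append(retrieve_qa(noisy) - retrieve_qa(tb))
    return statistics.pstdev(diffs)


if __name__ == "__main__":
    rng = random.Random(42)
    # Hypothetical one-day record of brightness temperatures (K).
    records = [[rng.uniform(150.0, 280.0) for _ in range(4)] for _ in range(20000)]
    print(synthetic_noise_error(records, rng))
```

For a strictly linear retrieval the result converges to the square root of the summed squared products of coefficient and channel noise, which explains why the estimate does not depend on the humidity regime.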
Because of the independence of the individual uncertainty components, the retrieval error $E_{tot}$ results from

$$E_{tot} = \sqrt{E_M^2 + E_N^2}, \tag{5}$$

which is dominated by $E_M$ due to the relatively small $E_N$.
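As a rough plausibility check, and assuming that the independent model and noise errors combine in quadrature, the reported sensor noise of 0.3 g kg−1 and the mean model uncertainty of 1.0 g kg−1 given in section 3a yield

```latex
E_{tot} = \sqrt{E_M^2 + E_N^2}
        = \sqrt{(1.0)^2 + (0.3)^2}\ \mathrm{g\,kg^{-1}}
        = \sqrt{1.09}\ \mathrm{g\,kg^{-1}}
        \approx 1.04\ \mathrm{g\,kg^{-1}}
```

which lies within the reported global average of 1.1 ± 0.3 g kg−1 and shows that the noise term adds only about 4% to the model uncertainty.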
As expressed by Eqs. (3a)–(4c), the four random error terms cannot all be isolated using a single TC approach, that is, a system of only three equations. This demonstrates the advantage of the applied MTC analysis regarding a successful decomposition of all random errors inherent to HOAPS $q_a$.
In preparation for applying Eqs. (1)–(5), all triplets contributing to the MTC analysis are sorted in ascending order of $q_a$ (with respect to “sat” in V1 and “sat1” in V2) and divided into 20 bins, respectively. All bins contain an equal number of matchups, whereas the number contributing to V1 differs from that of V2. Consequently, the bin widths are not constant, ranging from 0.37 to 1.86 g kg−1. The uncertainty decomposition using Eqs. (1)–(5), including the bias correction, is carried out separately for each bin. The resulting bin-dependent error magnitudes shown in sections 3a and 3b are arithmetic means of 10 individual error decomposition analyses, for each of which 30% of the bin data are randomly drawn to derive the error estimates. More precisely, the decomposition is based on 18 005 triplets per TC version per bin.
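The equal-count binning and the repeated random subsampling can be sketched as follows. The function names, the handling of leftover triplets (dropped here), and the generic `estimator` callback are assumptions of this sketch; the 20 bins, 10 repetitions, and 30% draw follow the text.

```python
import random


def equal_count_bins(triplets, key, n_bins=20):
    """Sort triplets by key and split them into n_bins equally sized bins.

    Leftover triplets that do not fill a complete bin are dropped, which
    is an assumption of this sketch.
    """
    ordered = sorted(triplets, key=key)
    size = len(ordered) // n_bins
    return [ordered[i * size:(i + 1) * size] for i in range(n_bins)]


def resampled_mean(bin_data, estimator, n_runs=10, fraction=0.3, rng=None):
    """Average an estimator over n_runs random draws of a bin fraction."""
    rng = rng or random.Random()
    k = max(1, round(fraction * len(bin_data)))
    # Each run draws k triplets without replacement and evaluates the estimator.
    results = [estimator(rng.sample(bin_data, k)) for _ in range(n_runs)]
    return sum(results) / n_runs
```

In practice `estimator` would run the full error decomposition on the drawn triplets, and `key` would select the satellite humidity of each triplet (“sat” in V1, “sat1” in V2).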
3. Results of random error decomposition
First, the focus lies on the $q_a$-dependent random uncertainty decomposition. To assess the regional dependency of the decomposed errors, a differentiation between tropics (0°–30°N/S) and extratropics (30°–60°N/S) is presented next. To investigate the temporal impact on the error statistics, winter (DJF), spring (MAM), summer (JJA), and autumn (SON) are considered separately. Furthermore, a multidimensional bias analysis approach helps to localize uncertainty hot spots in space.
a. $q_a$-dependent random error decomposition
Figure 2 shows the result of the HOAPS $q_a$ error decomposition as a function of $q_a$ itself. The retrieval error $E_{tot}$ (in red) converges to a minimum of approximately 0.7 g kg−1 for the smallest $q_a$ (relative uncertainty of 23%) and reaches a global maximum partly exceeding 1.5 g kg−1 (relative uncertainty up to 13%) for $q_a$ between 12 and 17 g kg−1. Its global average value is given by 1.1 ± 0.3 g kg−1 (14% of relative uncertainty).
Because of the minor impact of $E_N$ on $E_{tot}$ [Eq. (5)], the satellite’s retrieval model uncertainty $E_M$ (shown in blue) closely resembles $E_{tot}$ throughout the range of $q_a$ and its mean is given by 1.0 ± 0.3 g kg−1.
The error decomposition further reveals that $E_C$, shown in black, fluctuates around 0.5 g kg−1 for $q_a$ below 10 g kg−1, above which a positive trend causes $E_C$ to maximize locally (0.7 g kg−1) within a $q_a$ regime of 14–17 g kg−1. Its average value is given by 0.5 ± 0.1 g kg−1, representing a relative uncertainty of 7%. In comparison to $E_{tot}$, the overall stability of $E_C$ is noticeable and was to be expected, as the collocation criteria were kept constant. However, its maximum for $q_a$ values of 14–17 g kg−1 indicates the largest uncertainty due to the collocation process and, consequently, the MTC approach. This humidity regime is confined to rather narrow latitudinal bands over the subtropical ocean basins and extratropical fronts. The associated strong $q_a$ gradients point out the limits of the chosen collocation criteria. They become smaller in the vicinity of the equator, as is reflected in declining $E_C$ for the largest $q_a$.
Whereas 0.4 ± 0.1 g kg−1 represents the mean of $E_{ins}$ (shown in yellow) for $q_a$ below 10 g kg−1, its average within (sub)tropical surface humidity regimes is 0.9 ± 0.1 g kg−1. In the inner tropics, it even exceeds . Overall, relative uncertainties range between 4% and 8%, emphasizing a linear relationship between in situ measurement uncertainties and the magnitude of $q_a$. Its absolute average is given by 0.6 ± 0.3 g kg−1.
The increase of $E_{tot}$ from 0.7 g kg−1 in low-humidity regimes up to 1.8 g kg−1 close to 14 g kg−1 and its subsequent gradual decay is also mirrored in Fig. 3, showing the bias of $q_a$ (HOAPS minus in situ) and its standard deviation as a function of HOAPS $q_a$. Accordingly, it is evident that these standard deviations, which are shown as black bars, maximize for $q_a$ ranging between 12 and 14 g kg−1 (≈2.3 g kg−1), similar to Jackson et al. (2009, their Fig. 6b). The smallest spread of 1 g kg−1 occurs for $q_a$ of 3 g kg−1. As in Fig. 2, the spread of the bias clearly reduces to ≈1.7 g kg−1 (Fig. 3) in tropical regimes, implying a reduction in $E_{tot}$. The slope of the best fit shown in Fig. 3 is virtually zero, supporting the validity of the underlying retrieval model on a global scale. Yet, regime-dependent retrieval weaknesses exist. In contrast to $E_{tot}$ in Fig. 2, the bars shown in Fig. 3 reflect the overall bin-dependent random uncertainty. Apart from the retrieval error $E_{tot}$, it also incorporates the effects of $E_C$ and $E_{ins}$. This can be considered a disadvantage in the representation of Fig. 3 and again strengthens the information content resulting from the MTC analysis (Fig. 2), which allows for a successive error decomposition. An accumulation of $E_{tot}$, $E_C$, and $E_{ins}$ for the critical range in Fig. 2 results in an overall random uncertainty of 2.2 g kg−1 (i.e., $\sqrt{E_{tot}^2 + E_C^2 + E_{ins}^2}$), which closely resembles the observed equivalent of 2.3 g kg−1 in Fig. 3.
Bentamy et al. (2013) and Roberts et al. (2010) demonstrate that their SSM/I $q_a$ retrievals exhibit an explicit SST dependency. The authors show that an inclusion of SST into their neural network (Roberts et al. 2010) and multiparameter (Bentamy et al. 2013) approach considerably reduces the noise of $q_a$ differences. To determine the overall impact of SST on the retrieval error within the underlying work, an SST bias correction with respect to the in situ data was performed and the analyses presented in section 2c were repeated. The results indicate that $E_{tot}$ is reduced by just 2% within the critical humidity regime between 12 and 17 g kg−1 (not shown), suggesting a multiparameter approach to be of secondary importance in this range. However, for small (3–5 g kg−1) and large (18–20 g kg−1) $q_a$ margins, the retrieval uncertainty is on average reduced by 9% and 5%, respectively. SST-related uncertainty hot spots (in an absolute sense) are found along the coasts of Western Australia and northern Chile (SST ≈ 20°C), where the total random uncertainty associated with SST is up to 0.2 g kg−1, that is, ≈10% of the underlying total uncertainty (not shown).
b. Seasonal and regional random error decomposition
The distribution of (Fig. 2) suggests that the underlying model for retrieving exhibits both strengths (small ) and weaknesses ( between 12 and 17 g kg−1), supporting the necessity of differentiating between surface moisture regimes, and hence geographical regimes, when it comes to error decomposition. To highlight regional error dependencies, Fig. 4 contrasts, as an example, time series of decomposed errors during boreal winter (DJF) within the extratropics (30°–60°N/S, left panel) and the tropics (0°–30°N/S, right panel). Table 1 summarizes all decomposed error magnitudes, along with their standard deviation and relative contributions (to the basin-mean ) as a function of region and season.
Focusing on the extratropics first (left panel), the average value of is 0.8 ± 0.1 g kg−1 (16% relative error). This order of magnitude is expected for an average of 5.2 ± 0.4 g kg−1 (Fig. 2). The overall uncertainty introduced by (by ) is given by 0.3 ± 0.1 g kg−1 (5% relative error) [0.6 ± 0.1 g kg−1 (11% relative error)]. A closer look at the different seasons for extratropical latitudes (Table 1) indicates that retrieval errors maximize during boreal autumn (SON, 1.1 ± 0.1 g kg−1, yet only 13% relative uncertainty). Term associated with the largest average during boreal summer months (JJA, 10.0 g kg−1) remains 0.1 g kg−1 below the SON average. According to the constant increase in retrieval errors with increasing , as illustrated in Fig. 2, this was not to be expected. Strong positive outliers in boreal autumn , specifically in 1997 during the evolving El Niño event, may explain this feature (see below for explanation). As also suggested by Fig. 2, maximizes during boreal summer (0.7 g kg−1), along with the temporal maximum in the course of a year. The local reduction in for values of 9–10 g kg−1, as seen in Fig. 2, is well represented in the seasonal analysis. Hence, has a maximum of 0.7 g kg−1 in SON, whereas 0.6 g kg−1 is representative of extratropical boreal summer months.
Comparing extratropical error characteristics to the tropical counterpart (right panel) clearly demonstrates the retrieval error dependency on boundary layer moisture content. During boreal winter (Fig. 4, right panel), the average tropical retrieval uncertainty is given by 1.6 ± 0.2 g kg−1 (11% relative error), where the average of is 13.9 ± 0.8 g kg−1. This humidity range corresponds to the moisture regime of the largest retrieval discrepancies (Fig. 2) and explains why is 0.2–0.4 g kg−1 larger in comparison to the remaining seasons. During boreal winter, in situ (collocation) uncertainties are on average 0.8 g kg−1 (0.1 g kg−1) larger in comparison to the extratropical counterpart, yet having relative contributions of only 7% (5%).
The regional comparison of decomposed errors shown in Fig. 4 and Table 1 clearly mirrors the error dependency on the regime. In the case of tropical latitudes, this goes along with interannual variability in error magnitudes, due to their pronounced sensitivity to , as is illustrated in Fig. 2.
In general, outliers within seasonal and regional time series could possibly be linked to strong El Niño and La Niña events, which are identified by means of the oceanic Niño index (Climate Prediction Center, NOAA), representing SST anomalies within the Niño-3.4 region (5°S–5°N, 170°W–120°W). Such a link may exist for the tropical boreal autumn in 2007 ( 0.4 g kg−1 larger than seasonal average, not shown), associated with a moderate La Niña event. Anomalously low SSTs within the Niño-3.4 region, which are associated with these events, were already persistent during the preceding 8 months. This supports the hypothesis that anomaly patterns may have propagated toward the Atlantic Ocean (where the in situ data density is highest) via atmospheric planetary Rossby waves and may have caused a shift into humidity regimes associated with larger retrieval uncertainties. This mechanism may also be attributed to the tropical boreal winter (1998) and the extratropical boreal autumn (1997) ( being 0.2 g kg−1 larger than the seasonal averages), in line with the strong El Niño event established several months earlier. The effects of El Niño–Southern Oscillation (ENSO) teleconnections on air–sea interaction on a global scale have been investigated by Alexander et al. (2002), for example.
c. Regional random uncertainty hot spots
Figures 2–4 demonstrate the behavior of the decomposed errors as a function of only. To localize true hot spots of in space, however, the -dependent error magnitudes shown in Fig. 2 cannot simply be transferred to a global map, knowing only the average near-surface humidity distribution. The uncertainty pattern rather depends on the dominating sources of uncertainty, which are introduced by further atmospheric state variables. A specific region may, for example, be exposed to prevailing wind speeds, which enhance or dampen the illustrated in Fig. 2.
To overcome this issue and hence capture the overall random uncertainty as a function of the simultaneous atmospheric state, the analysis shown in Fig. 3 is expanded by deriving biases as a function of wind speed, SST, and water vapor path by means of double collocation (not shown). These three parameters are available from HOAPS and allow a distinction of different atmospheric regimes. As in Fig. 3, this supplemental analysis results in bin-specific biases. Given all four one-dimensional bias analyses, a four-dimensional bias lookup table is constructed, where the dimensions correspond to , wind speed, SST, and water vapor path. Figure 5 (left panel) shows a sketch of this table in three-dimensional space. Subsequently, all instantaneous biases resulting from the double collocation procedure are assigned to one of the 20^4 = 160 000 bins. If fewer than 100 bias values are assigned to a bin, then its content is considered nonrepresentative and an interpolation is carried out along all dimensions. The overall random uncertainty for every bin (equivalent to in Fig. 2) is defined as the spread of all instantaneous biases underlying every bin. In the last step, these random uncertainties in are corrected for the relative contributions of and (bin dependent, according to Fig. 2) to exclusively focus on the random retrieval error . Applying all instantaneous HOAPS data to this four-dimensional random retrieval uncertainty table leads to a global random retrieval uncertainty distribution, which is shown in Fig. 5 (right panel) for 1995–2008. Its area-weighted global average is 0.82 g kg−1.
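The binning step described above can be illustrated with a minimal sketch. It assumes that the spread is the population standard deviation of the biases in a cell and that samples outside the outermost bin edges are clipped into the edge bins; neither choice is stated in the text. The function name `bin_spread`, its arguments, and the use of NaN for sparse cells are hypothetical, and the subsequent interpolation of sparse cells and the correction for in situ and collocation contributions are not shown.

```python
import numpy as np

N_BINS = 20       # bins per dimension, giving 20**4 = 160 000 cells
MIN_COUNT = 100   # cells with fewer biases are treated as nonrepresentative


def bin_spread(params, bias, edges, min_count=MIN_COUNT):
    """Return the per-cell spread of the bias and the per-cell sample count.

    params: array of shape (n, 4), one column per lookup dimension
    bias:   array of shape (n,), satellite minus in situ
    edges:  sequence of four 1-D arrays of increasing bin edges
    Cells with fewer than min_count samples are returned as NaN
    (assumption: they would be filled by interpolation afterward).
    """
    shape = tuple(len(e) - 1 for e in edges)
    # Bin index per dimension; out-of-range samples are clipped to edge bins.
    idx = tuple(
        np.clip(np.digitize(params[:, d], edges[d]) - 1, 0, shape[d] - 1)
        for d in range(len(edges))
    )
    flat = np.ravel_multi_index(idx, shape)
    n_cells = int(np.prod(shape))

    # Accumulate count, sum, and sum of squares per cell in one pass each.
    count = np.bincount(flat, minlength=n_cells)
    s1 = np.bincount(flat, weights=bias, minlength=n_cells)
    s2 = np.bincount(flat, weights=bias**2, minlength=n_cells)

    spread = np.full(n_cells, np.nan)
    ok = count >= min_count
    mean = s1[ok] / count[ok]
    var = np.maximum(s2[ok] / count[ok] - mean**2, 0.0)
    spread[ok] = np.sqrt(var)
    return spread.reshape(shape), count.reshape(shape)
```

With 20 bins per dimension, `edges` would hold four arrays of 21 edges each, for example `np.linspace(lo, hi, N_BINS + 1)` with parameter-specific limits.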
As can be seen, the largest retrieval uncertainties (with the exception of the global maximum off the Arabian Peninsula and India) are found along subtropical bands of both hemispheres, where they reach values up to 1.5 g kg−1. More specifically, the maxima are located in regimes characterized by a mixture of trade and shallow cumulus with thin cirrus (Rossow et al. 2005; Oreopoulos and Rossow 2011), which seem to introduce an additional uncertainty within the retrieval. At the same time, the average random retrieval error of reduces toward the tropics, as is reflected in Figs. 2 and 3. Overall, the magnitudes are consistent with the total random uncertainties resulting from the error decomposition (Fig. 2). This suggests that itself has the largest influence on -related , whereas the impacts of wind speed, SST, and water vapor path are of secondary order on a climatological scale.
The global random uncertainty maximum within the Arabian Sea (up to 1.7 g kg−1) is special, inasmuch as concurrent mean wind speeds remain below 5 m s−1 throughout most of the year (apart from boreal summer months, when monsoon-related wind speeds often exceed 12 m s−1). Further analyses revealed that the spread of the bias as a function of wind speed is largest for the smallest wind speeds. This may be due to an enhanced decoupling of the vertical atmospheric column, introducing additional difficulties in the retrieval, which could explain the amplification of the -related in this region.
Summing up, the error characteristics show a clear regional (Figs. 2 and 5, right panel) and seasonal (Fig. 4; Table 1) dependency. Total uncertainties are especially large in subtropical latitudes (Fig. 5, right panel), particularly during boreal winter (DJF), when remains in a near-surface humidity range associated with the largest retrieval uncertainties (12–17 g kg−1).
a. Retrieval uncertainties
Figures 2–4 suggest that the retrieval exhibits the largest uncertainties for particular atmospheric and oceanic conditions. Possible explanations for this retrieval performance will be discussed in the following.
Note that all cited publications, including RMSE estimates of retrievals, neither explicitly perform a bias correction with respect to the in situ reference, nor have and been removed. In consequence, the resulting random uncertainty estimates () exceed the true random retrieval error (), which remains unknown. This highlights the benefit of the chosen MTC approach.
Numerous retrievals have been presented to date and intercomparisons have been carried out in the past. The single-parameter, multilinear approach of Bentamy et al. (2003), which is used in HOAPS, considerably improved the accuracy of in comparison to former attempts presented in, for example, Liu (1986). The latter took precipitable water as a proxy for the retrieval. Revised regression coefficients within Bentamy et al. (2003), based on a more representative in situ dataset, led to an average reduction in both bias (15%) and its RMSE (≈20%), favoring its successful implementation and/or tuning in further studies (e.g., Andersson et al. 2010; Jackson et al. 2009; Kubota and Hihara 2008).
A correlation coefficient of 0.96 between the integrated water vapor content (w) and the boundary layer humidity contribution (up to 500 m MSL) shown in Schulz et al. (1993) generally justifies the assumption of an underlying linear relationship between w and . However, this linear relationship is challenged by Bourras (2006) (which in part also applies to the algorithm of HOAPS), who elucidates two cases of vertical profiles, where this linear dependency breaks down and in consequence introduces large errors in . On the one hand, his considerations target the decoupling of the boundary layer moisture from higher atmospheric water vapor contents, which may be identified by means of local minima of vertical correlation profiles between both parameters. On the other hand, Bourras (2006) specifically addresses regions of deep convection and associated retrieval deficiencies (see also Bentamy et al. 2013), where the assumption of most water vapor being confined to the boundary layer is violated.
To overcome such retrieval errors, an inclusion of nonlinear terms within the retrieval algorithms—as presented in, for example, Jackson et al. (2009)—can reduce the RMSE between remotely sensed and in situ records. Specifically, their AMMI retrieval incorporates a quadratic term for the 52.8-GHz channel (not available in HOAPS). This channel not only provides somewhat more direct information on the lower troposphere but its quadratic weighting also allows for better describing the nonlinear relationship between lower-tropospheric temperatures and water vapor.
Furthermore, Bentamy et al. (2013) argue that single-parameter, multilinear regressions may be too simple to capture the underlying physical mechanisms. The authors show that seems to exhibit an explicit SST dependency when investigating biases between the NOC Southampton, version 2.0 (NOCv2.0; Berry and Kent 2011), and SSM/I (their Fig. 1). Including an SST, as well as a stability dependency ( minus SST), in their retrieval considerably reduces the noise (by up to 50%) of daily differences (in situ minus SSM/I) at 0.25° resolution on a global scale. The main discrepancies are confined to extratropical southern latitudes. Large-scale biases (dry tropics, wet subtropics), which were evident in former retrievals, remain marginal within their multiparameter approach.
Roberts et al. (2010) also pick up the influence of SST on the representativeness of the SSM/I retrieval output for and present a nonlinear approach on the basis of a neural network. Applying SST as a first-guess input parameter to the retrieval and accounting for the regime-dependent effect of high cloud liquid water (CLW) on , the authors demonstrate that biases (RMSE) of are reduced by 45% (27%) in comparison to, for example, Bentamy et al. (2003, their Fig. 5). The remaining bias (RMSE) is given by 0.16 g kg−1 (1.32 g kg−1). Regarding the RMSE, its magnitude agrees with the average derived in this work (1.29 g kg−1). Especially for very high CLW, the latter tends to effectively remove low-level humidity information from the satellite signal, which applies to most, yet not all compared satellite datasets. The largest discrepancies between both approaches are evident for negative lapse rates (i.e., inversions) along with elevated moisture above 900 hPa. Similar conclusions involving the impact of inversions on are drawn in Jackson et al. (2006, their Fig. 3). Given traditional linear regression models, moist air masses aloft feign large boundary moistures and thus introduce large errors in and consequently . Roberts et al. (2010) present two case studies, for which the SST boundary condition is able to successfully distinguish inversion profiles from near-neutral or unstable stratifications. Regimes with damped SST associated with cold surface currents or upwelling regimes along with retrieval issues due to stratocumulus clouds (see Jackson et al. 2009; Smith et al. 2011) may be more effectively interpreted by their sophisticated retrieval. Furthermore, the authors demonstrate that warm SST in conjunction with high-level subsidence and hence little moisture (as frequently observed over the North Pacific during boreal summer within the descending branch of the Hadley cell) do not necessarily lead to large biases in , given their approach.
To further quantify retrieval weaknesses, Iwasaki and Kubota (2012) developed two retrievals for estimating using Tropical Rainfall Measuring Mission Microwave Imager (TMI) data in comparison to ICOADS moored buoy data between 2003 and 2006. The essential difference between both linear retrievals was the amount of contributing TMI channels and thus their complexity. The authors show that their products yield a smaller RMSE specifically in the tropics, compared to those published in Schlüssel et al. (1995) (SSM/I), Kubota and Hihara (2008) (AMSR-E), and Schlüssel and Albert (2001) (TMI). The authors hold the inclusion of the 85-GHz polarized radiation responsible, which is not included within the model of Bentamy et al. (2003) and hence HOAPS. This finding may be responsible for the negative bias and the largest RMSE within the subtropical high pressure systems, which falls into the critical range of 12–17 g kg−1 (see Fig. 3). Specifically for the subtropical highs, where CLW and rain rates remain small, the 85-GHz channels may include valuable boundary layer humidity information. However, one needs to keep in mind that their results are representative of only tropical regimes (due to the TMI orbit), in contrast to the approach of Bentamy et al. (2003).
Because of inherent deficiencies in single-sensor retrievals (such as Bentamy et al. 2003), Jackson et al. (2006, 2009) elucidate the advantage of a multisensor approach, which, apart from SSM/I, utilizes temperature and humidity sounders (AMSU-A and SSM/T-2, respectively). Aiming at better evaluating the lower-tropospheric temperature and moisture characteristics, the authors reduce the RMSE differences (in comparison to ICOADS VOS and buoy measurements) by up to 0.4 g kg−1, compared to single-sensor retrievals. This approach introduces additional information provided by the microwave sounders for ranges of 16–20 g kg−1 and regimes of very low moisture content.
Prytherch et al. (2014) recently published results of an intercomparison involving different SSM/I-based datasets and identified considerable discrepancies among the data records, where regional variations exceed 1 g kg−1 on an annual basis, despite relying on the same retrieval algorithm. Hence, differences among HOAPS, GSSTF3, and IFREMER, all of which rely on the algorithm of Bentamy et al. (2003), are bound to originate from varying data processing routines, intercalibration techniques, and quality controls. The different handling of hydrometeor contamination of the signal and humidity inversions are two procedures within these filtering routines, which introduce departures among the resulting . In contrast to IFREMER, for example, HOAPS includes a humidity inversion correction, which is possibly the reason for the former being low biased within regimes of the smallest absolute (Fig. 9b in Prytherch et al. 2014). On the other hand, the effects of intersatellite calibrations on the may explain the discrepancies among based on HOAPS (intercalibration performed) and IFREMER (not subject to intercalibration).
b. In situ uncertainties
Kent and Berry (2005) recall that VOS observations contain significant uncertainties and are of variable quality. They estimated random measurement errors in VOS between 1970 and 2002 using a semivariogram approach, based on the ICOADS dataset (Woodruff et al. 1998). Figure 1d in Kent and Berry (2005) shows global maps of the uncorrelated uncertainty component of averaged over the whole time frame. The spatial distribution of random variability components ranges between 0.7 ± 0.1 g kg−1 (extratropical North Atlantic) and 1.7 ± 0.4 g kg−1 (near the Arabian Peninsula). A further investigation of latitudinal error dependencies in Kent and Berry (2005) indicates that the random error component constitutes the largest part of the total observational error within tropical regions. In contrast, the sampling error becomes considerably more important within the extratropics. These results imply that the random error component increases from higher (small ) to lower (large ) latitudes, as is also seen within Fig. 2, with the exception of the inner tropics (section 3).
The estimates published in Kent and Berry (2005) for the lower boundary closely resemble the in situ errors shown in Fig. 2, given that most of the matchups below 10 g kg−1 are constrained to extratropical northern latitudes along major shipping lanes (Bentamy et al. 2003). As discussed in section 3a, moister regimes are subject to larger random in situ errors, which agrees with results published in Kent and Berry (2005). Yet, their average random error in is 1.1 ± 0.1 g kg−1, which is ≈0.5 g kg−1 larger than the average estimate in this study (0.6 ± 0.3 g kg−1). This discrepancy may be due to the strict filtering of nonappropriate ship records prior to the MTC analysis. Furthermore, the amount of contributing matchups displayed in Kent and Berry (2005) is considerably lower than the collocated triplets forming the basis of this work. Additionally, Fig. 1 in Kent and Berry (2005) includes 32 years of in situ data. The in situ quality in early years is likely to have been below today’s measurement accuracies and particularly below the quality standard chosen for this study.
Kent and Taylor (1996) and Berry et al. (2004), among others, investigated the impact of solar radiation on the uncertainty of ship-based . In this context, Berry et al. (2004) present a correction for radiative heating errors on the basis of an analytical solution of the heat budget for an idealized ship. They found an RMSE reduction of the air–sea temperature difference of 30% to ≈0.5°C, eventually reducing the RMSE of .
The uncertainties introduced by different hygrometer types are explored by Kent et al. (1993) in the framework of the VOS Special Observing Project North Atlantic (VSOP-NA), who suggest applying an empirical correction to humidity measurements using marine screens. The authors argue that the latter tend to be high biased in comparison to psychrometers, presumably due to their poor ventilation. Such a correction is presented by Kent and Taylor (1995) for screen-based dewpoint temperatures. Screen humidity corrections are also applied within Kent et al. (2014) as part of an intercomparison study of in situ and reanalysis .
Jackson et al. (2009) also focus on hygrometer- and radiation-induced uncertainties, based on ICOADS observations and AMMIc retrievals. However, the authors conclude that both error sources contribute less than 0.05 g kg−1 to the overall uncertainty, suggesting their input with respect to the total error budget to be negligible.
c. Applied methodology
Equations (3a)–(4c) incorporate an error contribution associated with the collocation procedure (). As this work’s definition of is related only to spatial and temporal mismatches, it is not specifically differentiated between used in Eqs. (3a) and (4c). However, it is likely that an additional random point-to-area uncertainty (error of representativeness ) is inherent in the MTC matchups. This is accounted for, inasmuch as derived in Eq. (3a) is supplemented by a contribution. However, is not explicitly resolved, as this inhibits a complete error decomposition due to too many unknowns. Instead, the calculated [Eq. (3a) and hence Eqs. (3b)–(4b)] remains slightly larger than in theory, whereas becomes negligibly smaller. Although a quantification of is not possible, the derived decorrelation length scale in Kinzel (2013) considerably exceeds the diameter of a SSM/I footprint, which is the scale of interest regarding the point-to-area issue. It is therefore concluded that lies within the uncertainty of and is therefore negligible in comparison to the overall variances of differences (see note on this in O’Carroll et al. 2008). Equipping in situ data sources with random uncertainty estimates (prior to using them in context of retrieval validation analysis) is strongly recommended, as this would allow for explicitly deriving .
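For orientation, the classical covariance-based triple collocation estimator can be sketched as follows. This is not a reproduction of Eqs. (3a)–(4c), which additionally treat the collocation contribution; it is the textbook form, assuming three collocated records of the same quantity on a common scale whose random errors are mutually uncorrelated and uncorrelated with the truth. The function name `tc_error_variances` is hypothetical.

```python
import numpy as np


def tc_error_variances(x, y, z):
    """Classical triple collocation error variances of three collocated series.

    Assumes all three series measure the same truth on the same scale and
    that their random errors are mutually uncorrelated (stated assumption,
    not taken from the paper's equations).
    """
    x, y, z = (np.asarray(a, dtype=float) for a in (x, y, z))
    # np.cov treats each row as one variable.
    c = np.cov(np.vstack([x - y, x - z, y - z]))
    var_x = c[0, 1]    # Cov(x - y, x - z)
    var_y = -c[0, 2]   # Cov(y - x, y - z) = -Cov(x - y, y - z)
    var_z = c[1, 2]    # Cov(z - x, z - y) = Cov(x - z, y - z)
    return var_x, var_y, var_z
```

Because the truth cancels in every difference, each covariance isolates the error variance of the one dataset that appears in both differences. Any error component shared by two of the three records, such as an unresolved representativeness error, would violate the independence assumption and be misattributed.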
One could also argue that the applied MTC method does not yield robust results for the critical regime, which is subject to limited amounts of triplets due to narrow shipping lanes in the subtropical ocean basins. To quantify the robustness of the variances, Scipal et al. (2010) estimated the impact of constraining the TC analysis to small subsets of simulated time series subject to random noise. Results indicate that fewer than 100 matchups (=N) lead to systematic uncertainties of up to 5%, which does not influence the present analysis. Zwieback et al. (2012), however, argued that the relative error—that is, the standard error relative to the quantity of interest—exceeds 22% for N = 100, assuming all error variances to be of similar size and the underlying noise to be normally distributed. If their Eq. (29) holds, at least 2000 matchups are necessary to restrict the relative error contribution to 5%. For a single year on a seasonal basis, this may imply a reduced reliability of the MTC approach, as the tropical data coverage may temporarily fall below this target.
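The two figures quoted from Zwieback et al. (2012) are mutually consistent if the relative error is assumed to scale as 1/sqrt(N). That scaling is an assumption made here for illustration and is not a transcription of their Eq. (29). A minimal sketch with hypothetical helper names:

```python
import math


def relative_error(rel_err_ref, n_ref, n):
    """Relative error at sample size n, assuming a 1/sqrt(N) scaling."""
    return rel_err_ref * math.sqrt(n_ref / n)


def required_matchups(rel_err_ref, n_ref, rel_err_target):
    """Smallest N reaching rel_err_target under the same 1/sqrt(N) assumption."""
    return math.ceil(n_ref * (rel_err_ref / rel_err_target) ** 2)


# Reference point from the text: 22% relative error at N = 100.
n_needed = required_matchups(0.22, 100, 0.05)
err_at_2000 = relative_error(0.22, 100, 2000)
```

Under this assumption, `n_needed` comes out slightly below 2000 and `err_at_2000` slightly below 5%, in line with the requirement of at least 2000 matchups stated above.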
The chosen collocation criteria are identical to those applied by, for example, Jackson et al. (2006), who also investigated using microwave satellite observations. However, modifications of the collocation criteria underlying this work were also carried out to treat the temporal deviation more strictly, removing collocated pairs where exceeded 60 min. Specifically for the critical regime of 12–17 g kg−1, the results do not indicate a reduction of the satellite retrieval error. Instead, the temporal restriction leaves even fewer matchups in the already poorly sampled regions, which further increases the random uncertainty of the variance estimates (Scipal et al. 2010). It is therefore concluded that the originally chosen collocation thresholds of 180 min and 50 km are adequate. Yet, large humidity gradients may occur along midlatitudinal shipping routes, associated with frontal systems. However, these do not distort the error decomposition itself, as such outliers have been removed from the analyses (see section 2c). A comparison of the error bar magnitudes shown in Fig. 3 with in Fig. 2 yields absolute differences in the order of only 5%–10% throughout the whole range. Keeping in mind that the temporal threshold for matchups shown in Fig. 3 is only ±1 h, this further supports the assumption that ±3 h is a reasonable temporal decorrelation scale. In general, the decorrelation time scale cannot be chosen arbitrarily small in preparation for the MTC analysis, because the temporal difference of SSM/I overpasses of two different instruments is in the order of 2–3 h. This depends on the combination of SSM/I instruments (e.g., Andersson et al. 2010). Consequently, TC V2 and hence the MTC analysis would often not be realizable if the temporal thresholds were set to, for example, ±1 h.
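The collocation thresholds can be expressed as a simple matchup test. The sketch below assumes great-circle distance on a spherical Earth and inclusive thresholds; the tuple layout, the function names, and the Earth radius value are illustrative choices and do not describe the collocation software used in this work.

```python
import math

EARTH_RADIUS_KM = 6371.0   # mean Earth radius (assumed spherical Earth)
MAX_DIST_KM = 50.0         # spatial threshold quoted in the text
MAX_DT_MIN = 180.0         # temporal threshold quoted in the text


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_matchup(sat, ship, max_dist_km=MAX_DIST_KM, max_dt_min=MAX_DT_MIN):
    """True if two observations satisfy both collocation thresholds.

    sat, ship: tuples of (lat_deg, lon_deg, time_in_minutes)
    """
    if abs(sat[2] - ship[2]) > max_dt_min:
        return False
    return haversine_km(sat[0], sat[1], ship[0], ship[1]) <= max_dist_km
```

Calling `is_matchup(sat, ship, max_dt_min=60.0)` corresponds to the stricter temporal criterion tested above, which retains fewer pairs.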
5. Conclusions and outlook
Latent heat fluxes (LHF) play a key role in the context of energy exchange between ocean and atmosphere and thus impact the global energy cycle. Because of insufficient spatial sampling of in situ measurements, remote sensing represents an indispensable technique to monitor parameterized LHF in high resolution. However, their uncertainty estimates, which find expression in the satellite’s retrieval error , are not sufficiently quantified to date, which complicates their use in the context of model validation, trend, and variability analyses, as well as process studies.
For the near-surface specific humidity , which represents a key geophysical input parameter to parameterized LHF, the aim was to decompose overall satellite-based random uncertainties into individual components to isolate the desired .
In this context, it was shown that the ordinary TC approach can be (and needs to be) extended by means of a novel multiple TC (MTC) procedure, serving as a powerful tool to distinguish satellite-based random uncertainties associated with the underlying model () and sensor noise () from contributions of in situ records () and collocation (). The MTC analysis was specifically performed for the HOAPS-3.2 on a pixel-level basis, based on an extensive matchup database of SWA-ICOADS ship records for the time period of 1995–2008.
The robust results of the MTC analysis indicate that the random retrieval error is on average 1.1 ± 0.3 g kg−1, which is supplemented by averages of (0.5 ± 0.1 g kg−1) and (0.5 ± 0.3 g kg−1). Term was derived synthetically (0.3 g kg−1). A -dependent analysis shows that the retrieval has the largest difficulties in the regime of 12–17 g kg−1, where exceeds 1.5 g kg−1. The largest (0.7 g kg−1) also falls into this range, which is representative of the subtropical domain encompassing the global oceans. On the contrary, increases rather linearly with , taking on values between 0.2 and 1.2 g kg−1. Local analysis on a global scale reveals absolute uncertainty maxima of approximately 1.7 g kg−1 off the Arabian Peninsula, where both and wind speed remain in ranges susceptible to large random errors (small wind speeds coupled to rather large, yet not tropical ).
Despite random in situ measurement errors and possible deficits underlying the collocation approach, the results suggest that the largest random uncertainties originate from the retrieval itself, which in the case of HOAPS-3.2 is based on the linear, single-parameter regression retrieval by Bentamy et al. (2003). The MTC-based findings demonstrate how both regime-dependent retrieval uncertainties and in situ measurement issues can be effectively isolated. This will prove very helpful in further advancing the satellite-based retrieval to meet the desired quality requirements. As discussed in section 4, HOAPS uncertainties could possibly be reduced by introducing new retrieval algorithms, which could rely on a multiparameter approach and/or incorporate nonlinear regression terms.
Similar to HOAPS-3.2, previous retrievals have mostly been derived from regression analysis using training datasets of and in situ point measurements. This implies that respective RMSE estimates typically include both and and thus inhibit an explicit determination of the random retrieval uncertainty. This again emphasizes the benefit of the uncertainty decomposition approach. Assigning random uncertainty estimates to all contributing data sources, as done within this work, allows for evaluating the satellite retrieval precision. If only were given, then a quantitative comparison between retrieval and in situ random uncertainties to assess retrieval constraints could not be carried out.
A step toward higher-quality certainly also involves a more comprehensive in situ validation dataset, in which all humidities are equally well represented. This task will be challenging, as the number of VOS is continuously declining (see Kent et al. 2014). Additionally, the ICOADS dataset does not contain call signs after December 2007 (Kent et al. 2013), which further hinders the validation of remotely sensed parameters, as platforms producing systematic measurement errors may no longer be excluded from error analyses.
Future work aims at quantifying of satellite-based wind speed and SST. Respective findings will help to derive of the remaining LHF-related bulk parameters and hence the retrieval uncertainty of HOAPS evaporation.
To better assess the quality of the satellite-based datasets, Prytherch et al. (2014) furthermore argue that gridbox-based uncertainty estimates, which are not available to date, would be extremely beneficial. This approach is currently undertaken at DWD and the first results will be published in the near future. As a total error assessment involves the investigation of random error contributions, the presented work can therefore be understood as a first step toward this effort. A full error characterization of all HOAPS freshwater flux–related parameters will be implemented in the next official HOAPS climatology, which will be released in late 2016.
J. K. was funded by the German Science Foundation (DFG). Funding for K. F., M. S., and A. A. was covered by EUMETSAT. The funding for the development and implementation of the collocation software was provided by the German Meteorological Service (DWD). The HOAPS-3.2 data were kindly provided by EUMETSAT’s Satellite Application Facility on Climate Monitoring (CM SAF). SWA-ICOADS data were gratefully obtained from SWA (DWD).